Deep learning is an approach based on multilayer neural networks that learns a hierarchical set of features from the training data, instead of relying on features engineered by hand before the model optimization step.
Training on Video Data
A hybrid CPU/GPU approach has proven more efficient here than a GPU-heavy approach. Initially, training was done by feeding multiple frames from the video into the network simultaneously, which requires a large (and expensive) network. It was later found that similar results can be achieved by extracting the optical flow data from the frames in advance (a CPU-bound process) and then feeding the frames plus the flow data to the network sequentially (see "Large-scale Video Classification with Convolutional Neural Networks" and the Sports-1M data set).
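The CPU-side preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the method used in the cited work: the source does not say which optical flow algorithm was used, so a simple block-wise Lucas-Kanade estimate in NumPy stands in for a production flow routine. The function names `block_flow`, `precompute_flow`, and `training_samples`, the `block` parameter, and the assumption of grayscale float frames are all hypothetical.

```python
import numpy as np

def block_flow(prev, nxt, block=16):
    """Estimate one (u, v) motion vector per block with Lucas-Kanade.

    prev, nxt: 2-D grayscale frames of equal shape (assumed float values).
    Returns an array of shape (H // block, W // block, 2).
    """
    prev = np.asarray(prev, dtype=np.float64)
    nxt = np.asarray(nxt, dtype=np.float64)
    # Spatial gradients of the averaged frame; temporal gradient between frames.
    iy, ix = np.gradient(0.5 * (prev + nxt))
    it = nxt - prev
    rows, cols = prev.shape[0] // block, prev.shape[1] // block
    flow = np.zeros((rows, cols, 2))
    for r in range(rows):
        for c in range(cols):
            win = (slice(r * block, (r + 1) * block),
                   slice(c * block, (c + 1) * block))
            gx, gy, gt = ix[win].ravel(), iy[win].ravel(), it[win].ravel()
            # Least-squares solution of gx*u + gy*v = -gt over the window.
            a = np.stack([gx, gy], axis=1)
            flow[r, c] = np.linalg.lstsq(a, -gt, rcond=None)[0]
    return flow

def precompute_flow(frames, block=16):
    """CPU-bound step: compute flow for each pair of consecutive frames."""
    return [block_flow(a, b, block) for a, b in zip(frames, frames[1:])]

def training_samples(frames, flows):
    """Yield (frame, flow) pairs one at a time for the network input."""
    for frame, flow in zip(frames[1:], flows):
        yield frame, flow
```

With this layout, `precompute_flow` would run once over the data set before training, and `training_samples` would supply each frame together with the flow leading into it, so that the network sees one frame at a time rather than a stack of frames.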
Consider converting the color space of the input frames from RGB to YUV in order to condition the network to a color perception model that is closer to human vision.
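As a rough sketch of the color space conversion, the snippet below applies the BT.601 RGB-to-YUV matrix to a frame with NumPy. The source does not say which YUV variant is meant, so BT.601 is an assumption here, as are the function name `rgb_to_yuv` and the convention that input values are scaled to [0, 1].

```python
import numpy as np

# BT.601 coefficients (assumed variant); rows produce Y, U, V from (R, G, B).
RGB_TO_YUV = np.array([
    [ 0.299,    0.587,    0.114  ],
    [-0.14713, -0.28886,  0.436  ],
    [ 0.615,   -0.51499, -0.10001],
])

def rgb_to_yuv(frame):
    """Convert an (H, W, 3) RGB frame with values in [0, 1] to YUV."""
    frame = np.asarray(frame, dtype=np.float64)
    # Each pixel's (R, G, B) vector is multiplied by the conversion matrix.
    return frame @ RGB_TO_YUV.T
```

The Y channel carries luminance and the U and V channels carry color differences, so a gray pixel maps to U and V values close to zero. The conversion would be applied to every frame before it is passed to the network.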
Deep Learning Frameworks and Libraries
- Torch by Facebook, Google, Twitter
- Caffe by UC Berkeley
- Theano with Pylearn2, by the University of Montréal
- Computational Network Toolkit by Microsoft
- TensorFlow by Google
- Neon by Nervana Systems
- Nvidia cuDNN