- YOLO CPU Running Time Reduction: Basic Knowledge and Strategies
- Build Personal Deep Learning Rig: GTX 1080 + Ubuntu 16.04 + CUDA 8.0RC + CuDnn 7 + Tensorflow/Mxnet/Caffe/Darknet
- Recurrent YOLO for Object Tracking [Project Page][Arxiv][Github]
- SSD in MxNet with C++ test modules [Github], by my roomie [Zhi Zhang]
- LightTrack: Online Human Pose Tracking [Project Page][Arxiv][Github]
YOLO, short for You Only Look Once, is a real-time object recognition algorithm proposed in the paper You Only Look Once: Unified, Real-Time Object Detection, by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi.
As discussed in my previous post (in Chinese), the NVIDIA Jetson TX1 is a boost to the application of deep learning on mobile devices and embedded systems. Many potentially inspiring products are on the way; one of them is real-time computer vision on mobile devices. Imagine real-time abnormal action recognition on surveillance cameras, real-time scene text recognition on smart glasses, or real-time object recognition on smart vehicles and robots. Not excited? How about this: real-time computer vision on egocentric videos, or on your AR and even VR devices. Imagine you watch a video clip shot by Kespry and experience how Messi beat fewer than a dozen players and scored a goal. This could be used for educational purposes, where you stand in a player’s shoes and study how he or she observes the situation in real time and handles the ball. (If you are considering a patent, please put my name at the end of the inventors list.)
That being said, I assume you have at least some interest in this post. The author of YOLO has already shown how to quickly run the code; this article is about how to immediately start training YOLO with our own data and object classes, in order to apply object recognition to specific real-world problems.
Here are two DEMOS of YOLO trained with customized classes: