Learning to Teach
Object tracking has been an active research topic in recent years. It involves localizing an object across consecutive video frames, given an initial annotation in the first frame. The process typically consists of the following steps:
- Taking an initial set of object detections.
- Creating and assigning a unique ID for each of the initial detections.
- Tracking those objects as they move around in the video frames, maintaining the assignment of unique IDs.
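The three steps above can be sketched with a toy tracker that matches each new detection to the nearest existing track. This is only a minimal illustration, not the method of any particular tracking system: the class name `CentroidTracker`, the `max_distance` threshold, and the use of centroids instead of full bounding boxes are all assumptions made for the example.

```python
from itertools import count
from math import hypot


class CentroidTracker:
    """Toy ID tracker using greedy nearest-centroid matching (illustrative only)."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # hypothetical matching threshold, in pixels
        self.tracks = {}                  # track ID -> last known (x, y) centroid
        self._next_id = count()           # source of fresh unique IDs: 0, 1, 2, ...

    def update(self, centroids):
        """Assign an ID to every centroid detected in the current frame."""
        unmatched = dict(self.tracks)     # tracks not yet claimed in this frame
        updated = {}
        for cx, cy in centroids:
            def dist(track_id):
                tx, ty = unmatched[track_id]
                return hypot(tx - cx, ty - cy)

            nearest = min(unmatched, key=dist, default=None)
            if nearest is not None and dist(nearest) <= self.max_distance:
                track_id = nearest        # same object as before: keep its ID
                del unmatched[nearest]
            else:
                track_id = next(self._next_id)  # unseen object: create a new ID
            updated[track_id] = (cx, cy)
        # Tracks with no matching detection in this frame are simply dropped.
        self.tracks = updated
        return updated


tracker = CentroidTracker()
frame1 = tracker.update([(10, 10), (200, 200)])  # initial detections receive IDs
frame2 = tracker.update([(205, 198), (14, 12)])  # IDs follow the objects, not the list order
```

Greedy matching keeps the sketch short; it can swap IDs when objects cross paths, which is one reason practical trackers rely on stronger association schemes.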
Object tracking can be broadly divided into two categories:
- Passive tracking
- Active tracking
Passive tracking assumes that the object of interest always stays within the image, so no camera control is needed during tracking. Although passive tracking is useful and well studied, it does not apply when tracking is performed by a camera-mounted mobile robot or a drone.
Active tracking, on the other hand, involves two subtasks: 1) object tracking and 2) camera control. Jointly tuning a pipeline built from these two separate subtasks is difficult. Object tracking may require human effort for bounding-box labeling, and camera control is non-trivial, which can lead to many expensive rounds of trial and error in the real world.
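To make the two-subtask pipeline concrete, here is a rough sketch of a naive camera-control stage that consumes the tracker's bounding box and steers the camera toward it. Everything in it is an assumption for illustration: the function name `pan_tilt_command`, the `gain` value, the sign conventions, and the choice of a simple proportional rule. It is not a recommended controller, and hand-tuning something like `gain` is exactly the kind of trial and error mentioned above.

```python
def pan_tilt_command(bbox, frame_size, gain=0.5):
    """Map a tracked bounding box to hypothetical (pan, tilt) commands.

    bbox is (x_min, y_min, x_max, y_max) in pixels and frame_size is
    (width, height). The command is proportional to how far the box centre
    lies from the image centre, normalised to the range [-1, 1].
    """
    x_min, y_min, x_max, y_max = bbox
    width, height = frame_size
    err_x = ((x_min + x_max) / 2 - width / 2) / (width / 2)
    err_y = ((y_min + y_max) / 2 - height / 2) / (height / 2)
    return gain * err_x, gain * err_y


# Target already centred in a 640x480 frame: no camera motion requested.
centred = pan_tilt_command((300, 220, 340, 260), (640, 480))
# Target right of centre: positive pan command under this sign convention.
off_centre = pan_tilt_command((460, 220, 500, 260), (640, 480))
```

Because the controller only sees the tracker's output, any tracking error passes straight into the camera motion, which hints at why the two stages are hard to tune separately.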