Saliency based Discriminant Tracking
Object tracking is a prerequisite for important computer vision applications such as surveillance and activity or behavior recognition. Many years of research on the tracking problem have produced a diverse set of approaches and a rich collection of tracking algorithms. A popular subset among these are the so-called appearance-based methods, which learn and maintain a model of target appearance and use it to locate the target as time evolves. For instance, targets can be represented by their contours, and the temporal evolution of these contours modeled with particle filters (e.g., the condensation algorithm). Alternatively, target appearance can be represented by kernel-weighted histograms, which are popular in the context of mean shift algorithms.
All of these methods rely solely on models of object appearance and do not take the background into account. This limits tracking accuracy when backgrounds are cluttered or targets undergo substantial geometric deformation, such as out-of-plane rotation. To address this limitation, various authors have proposed discriminant tracking: object tracking as continuous object detection, posing the problem as one of incremental "target vs. background" classification (e.g., ensemble tracking). Given a target bounding box at video frame t, a classifier is trained to distinguish target features from those of the background. This classifier is then used to determine the location of the target in frame t+1. The bounding box is moved to this location, the classifier is updated, and the process iterates.
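The track-by-detection loop described above can be sketched as follows on synthetic 1-D "frames". Everything here is an illustrative stand-in, not the method of any cited paper: the nearest-centroid scoring rule, the single intensity feature, and the synthetic bright-blob data are all assumptions chosen only to make the train/locate/update iteration concrete.

```python
import random

random.seed(0)

def make_frames(n_frames=8, width=20, start=5):
    # Synthetic video: a bright target (value 10) moves one pixel per frame
    # across a cluttered random background.
    frames = []
    for t in range(n_frames):
        frame = [random.random() for _ in range(width)]
        frame[start + t] = 10.0
        frames.append(frame)
    return frames

def train_classifier(target_vals, background_vals):
    # Toy "target vs. background" classifier: score is higher when a value
    # is closer to the target centroid than to the background centroid.
    t_mean = sum(target_vals) / len(target_vals)
    b_mean = sum(background_vals) / len(background_vals)
    return lambda v: abs(v - b_mean) - abs(v - t_mean)

def track(frames, init_pos, radius=2):
    pos, trajectory = init_pos, [init_pos]
    for t in range(len(frames) - 1):
        frame = frames[t]
        # Train on frame t: pixel at the current position vs. the rest.
        target = [frame[pos]]
        background = [v for p, v in enumerate(frame) if abs(p - pos) > radius]
        score = train_classifier(target, background)
        # Locate the target in frame t+1 within a small search window,
        # move the "bounding box" there, and iterate.
        nxt = frames[t + 1]
        window = range(max(0, pos - radius), min(len(nxt), pos + radius + 1))
        pos = max(window, key=lambda p: score(nxt[p]))
        trajectory.append(pos)
    return trajectory

print(track(make_frames(), init_pos=5))  # follows the target: [5, 6, ..., 12]
```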
We propose a biologically inspired framework for discriminant tracking based on discriminant center-surround saliency. At each frame, discrimination of the target from the background is posed as a binary classification problem. From a pool of feature descriptors for the target and background, the subset most informative for classification between the two is selected using the principle of maximum marginal diversity. Using these features, the location of the target in the next frame is identified using top-down saliency, completing one iteration of the tracking algorithm. We also show that a simple extension of the framework to include motion features in a bottom-up saliency mode can robustly identify salient moving objects and automatically initialize the tracker. The connections of the proposed method to existing work on discriminant tracking are discussed. Experimental results comparing the proposed method to the state of the art in tracking are presented, showing improved performance.
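The marginal-diversity feature ranking mentioned above can be illustrated with a small two-class sketch: each feature is scored by the prior-weighted KL divergence between its class-conditional histograms and its marginal histogram, and the highest-scoring features are kept. The histogram binning, the two synthetic features, and the data ranges are assumptions for illustration only, not the paper's actual feature pool.

```python
import math
import random

random.seed(1)

def histogram(values, bins=10, lo=0.0, hi=1.0, eps=1e-6):
    # Normalized histogram with a small floor to keep KL finite.
    h = [eps] * bins
    for v in values:
        h[min(bins - 1, int((v - lo) / (hi - lo) * bins))] += 1
    total = sum(h)
    return [c / total for c in h]

def kl(p, q):
    # Kullback-Leibler divergence between two discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def marginal_diversity(target_vals, background_vals):
    # Prior-weighted KL between class-conditional and marginal histograms:
    # high when the feature's distribution differs strongly across classes.
    p_t = histogram(target_vals)
    p_b = histogram(background_vals)
    prior = len(target_vals) / (len(target_vals) + len(background_vals))
    p_x = [prior * a + (1 - prior) * b for a, b in zip(p_t, p_b)]
    return prior * kl(p_t, p_x) + (1 - prior) * kl(p_b, p_x)

# Two synthetic features per sample: feature 0 separates target from
# background, feature 1 is pure noise and carries no class information.
target = [[random.uniform(0.7, 1.0), random.random()] for _ in range(200)]
background = [[random.uniform(0.0, 0.3), random.random()] for _ in range(200)]

scores = [marginal_diversity([s[k] for s in target],
                             [s[k] for s in background]) for k in range(2)]
best = max(range(2), key=lambda k: scores[k])
print(best)  # the discriminative feature ranks first
```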
V. Mahadevan and N. Vasconcelos.
In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
Miami, FL, 2009. © IEEE [ps] [pdf]
Contact: Vijay Mahadevan, Nuno Vasconcelos