Finding and Tracking People from the Bottom Up

Deva Ramanan and D. A. Forsyth
Computer Science Division
University of California, Berkeley
Berkeley, CA 94720
[email protected], [email protected]

Abstract

We describe a tracker that can track moving people in long sequences without manual initialization. Moving people are modeled with the assumption that, while configuration can vary quite substantially from frame to frame, appearance does not. This leads to an algorithm that first builds a model of the appearance of the body of each individual by clustering candidate body segments, and then uses this model to find all individuals in each frame. Unusually, the tracker does not rely on a model of human dynamics to identify possible instances of people; such models are unreliable, because human motion is fast and large accelerations are common. We show that our tracking algorithm can be interpreted as a loopy inference procedure on an underlying Bayes net. Experiments on video of real scenes demonstrate that this tracker can (a) count distinct individuals; (b) identify and track them; (c) recover when it loses track, for example, if individuals are occluded or briefly leave the view; (d) identify the configuration of the body largely correctly; and (e) operate without depending on particular models of human motion.

1. Introduction

A practical person tracker should: track accurately for long sequences; self-start; track independent of activity; be robust to drift; track multiple people; track through brief occlusions; and be computationally efficient. It should also avoid background subtraction; we want to track people who happen to stand still against backgrounds that happen to move. The literature on human tracking is too large to review in detail. Tracking people is difficult because people can move very fast.
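As a concrete illustration of the two-stage pipeline the abstract describes (cluster candidate body segments pooled across frames into a per-individual appearance model, then use that model to find individuals in each frame), here is a minimal sketch under strong simplifying assumptions: real candidate segments would come from limb detectors, a generic feature vector (e.g. a mean-colour descriptor) stands in for the paper's actual appearance representation, and the k-means-style clustering, thresholds, and function names are all illustrative rather than the authors' implementation.

```python
import numpy as np

def build_appearance_model(feats, k=2, iters=20):
    """Cluster appearance features of candidate segments (k-means sketch).

    feats: (N, D) array, one row per candidate segment pooled across all
    frames. Returns (centres, labels); a coherent cluster plays the role
    of the learned appearance model for one individual's segment.
    """
    feats = np.asarray(feats, dtype=float)
    # Farthest-point initialisation keeps the initial centres spread out.
    centres = [feats[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(feats - c, axis=1) for c in centres], axis=0)
        centres.append(feats[d.argmax()])
    centres = np.array(centres)
    for _ in range(iters):
        # Assign each candidate to its nearest centre, then re-estimate
        # each centre as the mean of its assigned candidates.
        d = np.linalg.norm(feats[:, None] - centres[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centres[j] = feats[labels == j].mean(axis=0)
    return centres, labels

def find_in_frame(frame_feats, centres, tau):
    """Label each segment in one frame with the nearest learned appearance
    model, or -1 if no model lies within distance tau (no match)."""
    frame_feats = np.asarray(frame_feats, dtype=float)
    d = np.linalg.norm(frame_feats[:, None] - centres[None], axis=2)
    best = d.argmin(axis=1)
    best[d.min(axis=1) > tau] = -1
    return best
```

Because the appearance model is learned from the sequence itself, the second stage amounts to nearest-model matching per frame, which is what lets such a tracker re-acquire an individual after occlusion without a motion model.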
One can use the configuration in the current frame and a dynamic model to predict the next configuration; these predictions can then be refined using image data (see, for example, [9, 13, 3]). Particle filtering uses multiple predictions (obtained by running samples of the prior through a model of the dynamics), which are refined by comparing them with the local image data (the likelihood) (see, for example, [14, 3]). The prior is typically quite diffuse (because motion can be fast), but the likelihood function may be very peaky, containing multiple local maxima which are hard to account for in detail. For example, if an arm swings past an "arm-like" pole, the correct local maximum must be found to prevent the track from drifting. Annealing the particle filter is one way to attack this difficulty [6]. An alternative is to apply a strong model of dynamics [14], at the considerable cost of needing to choose the motion model before one can detect or track people. An attractive alternative is to ignore dynamics and find people in each frame independently, using such cues as local motion [15] or appearance [11]. As far as we know, no current person tracker meets all
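The particle-filtering scheme sketched above (samples of the prior pushed through a dynamic model, then re-weighted by the image likelihood) can be written as a single generic resample-predict-update step. This is a minimal sketch of the standard technique being discussed, not any of the cited trackers; the function names and the idea of passing `dynamics` and `likelihood` as callables are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, dynamics, likelihood, image):
    """One step of a generic particle filter.

    particles:  (N, D) array of state hypotheses (e.g. body configurations).
    weights:    (N,) normalised weights from the previous step.
    dynamics:   function drawing a predicted state from p(x_t | x_{t-1}).
    likelihood: function scoring a state against the current image data.
    """
    # Resample according to the current weights (the diffuse prior).
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    # Predict: push each sample through the dynamic model.
    particles = np.array([dynamics(p) for p in particles])
    # Update: re-weight by the image likelihood, which may be peaky and
    # multi-modal (e.g. an "arm-like" pole near the true arm).
    w = np.array([likelihood(p, image) for p in particles])
    return particles, w / w.sum()
```

A peaky, multi-modal `likelihood` is exactly where this scheme struggles: most resampled particles can lock onto the wrong local maximum, which is what annealed particle filters [6] and strong dynamic models [14] try to mitigate.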
This note was uploaded on 06/13/2011 for the course CAP 6412 taught by Professor Staff during the Spring '08 term at University of Central Florida.
