Actions As Objects: A Novel Action Representation

Alper Yilmaz, Mubarak Shah
University of Central Florida, Orlando, FL-32828, USA

Abstract

In this paper, we propose to model an action based on both the shape and the motion of the object performing the action. When the object performs an action in 3D, the points on the outer boundary of the object are projected as a 2D (x, y) contour in the image plane. A sequence of such 2D contours with respect to time generates a spatiotemporal volume (STV) in (x, y, t), which can be treated as a 3D object in the (x, y, t) space. We analyze the STV by using differential geometric surface properties, such as peaks, pits, valleys and ridges, which are important action descriptors capturing both spatial and temporal properties. A set of motion descriptors for a given action is called an action sketch. The action descriptors are related to various types of motions and object deformations. The first step in our approach is to generate the STV by solving the point correspondence problem between consecutive frames. The correspondences are determined using a two-step graph theoretical approach. After the STV is generated, action descriptors are computed by analyzing the differential geometric properties of the STV. Finally, using these descriptors, we perform action recognition, which is also formulated as a graph theoretical problem. Several experimental results are presented to demonstrate our approach.

1 Introduction

Recognizing human actions and events from video sequences is a very active research area in computer vision. During the last few years, several different approaches have been proposed for the detection, representation, recognition, and understanding of video events. Some popular approaches for action recognition include Hidden Markov Models, Finite State Machines, neural networks, and Context Free Grammars. The important question in action recognition is which features should be used.
Therefore, the first step in action recognition is to extract useful information from raw video data to be employed in different recognition models. A common approach for extracting relevant information from video is visual tracking. Tracking can be performed by using only a single point on the object. Single point tracking generates a motion trajectory, and there are several approaches employing motion trajectories for action recognition. It is common to use changes in speed, direction, or maxima in the spatio-temporal curvature of a trajectory to represent important events in an action. However, a single point trajectory only carries motion information. It does not carry any shape or relative spatial information, which may be useful in action recognition....
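
To make the trajectory-based cue mentioned above concrete, the following is a minimal sketch of how maxima in spatio-temporal curvature might be located along a single tracked point. It is not the authors' implementation. It assumes the trajectory is treated as the space curve r(t) = (x(t), y(t), t), uses the standard curvature formula |r' x r''| / |r'|^3, and approximates derivatives by central differences with a unit time step. The function names `spatiotemporal_curvature` and `curvature_maxima` and the `threshold` parameter are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def spatiotemporal_curvature(points: Sequence[Tuple[float, float]]) -> List[float]:
    """Curvature of the space curve r(t) = (x(t), y(t), t), one value per frame.

    Assumption: unit time step and central finite differences. The first and
    last frames have no two-sided neighbours, so they are set to 0.0 by
    convention.
    """
    n = len(points)
    curvature = [0.0] * n
    for i in range(1, n - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        dx, dy = (x2 - x0) / 2.0, (y2 - y0) / 2.0  # first derivatives
        ddx = x2 - 2.0 * x1 + x0                   # second derivative in x
        ddy = y2 - 2.0 * y1 + y0                   # second derivative in y
        # r' = (dx, dy, 1) and r'' = (ddx, ddy, 0), so
        # r' x r'' = (-ddy, ddx, dx * ddy - dy * ddx).
        cross = math.sqrt(ddy ** 2 + ddx ** 2 + (dx * ddy - dy * ddx) ** 2)
        speed = math.sqrt(dx ** 2 + dy ** 2 + 1.0)
        curvature[i] = cross / speed ** 3
    return curvature


def curvature_maxima(curvature: Sequence[float], threshold: float = 1e-6) -> List[int]:
    """Frame indices where curvature is a strict local maximum above threshold."""
    return [
        i
        for i in range(1, len(curvature) - 1)
        if curvature[i] > threshold
        and curvature[i] > curvature[i - 1]
        and curvature[i] > curvature[i + 1]
    ]


# Usage: a point moves right for five frames, then turns and moves up.
track = [(float(i), 0.0) for i in range(6)] + [(5.0, float(j)) for j in range(1, 6)]
events = curvature_maxima(spatiotemporal_curvature(track))  # frame of the turn
```

For this L-shaped track the only curvature maximum falls on the frame where the direction changes, which is the kind of event such trajectory methods pick up. As the text notes, the result says nothing about the shape of the object being tracked.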