Tracking and Modeling Non-Rigid Objects with Rank Constraints
L. Torresani, D. Yang, E. Alexander, and C. Bregler
Presentation by: Jeremy Weinsier

Motivation
Can a nonrigid object’s shape and pose be determined from a single video sequence without any information about the particular scene?
In other words, can we recover a 3D model of the object's shape without any prior knowledge?

Background
In a previous paper, the authors showed that 2D point tracks from a single view (no stereo) are enough to recover the nonrigid motion as well as the structure of an object by exploiting low-rank constraints.
This procedure takes that idea one step further: nonrigid motion and structure can now be recovered from a single view without any point tracks.

Rank Constraints
Optical flow data is read into two F x P matrices, U and V (one row per frame, one column per point). Let

$$W = \begin{bmatrix} U \\ V \end{bmatrix}$$

If W describes 3D rigid motion, then there is an upper bound on the rank of W.
The particular bound is based on the assumed motion model. For example,
Orthographic: r≤4
Projective: r≤8
These rank constraints are derived from the fact that W can be factored into Q and S, which represent pose and shape, respectively.
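The rigid bound can be checked numerically. The following is a minimal sketch, assuming an orthographic camera with a per-frame 2D translation and synthetic point positions rather than real optical flow; the sizes and the helper `random_rotation` are illustrative and not from the presentation. Rows are stored two per frame, which only permutes the rows of W and leaves its rank unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
F, P = 10, 30  # number of frames and points (illustrative sizes)


def random_rotation(rng):
    """Random 3x3 rotation built from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))   # fix the column signs
    if np.linalg.det(q) < 0:      # force det = +1 (proper rotation)
        q[:, 0] = -q[:, 0]
    return q


S = rng.standard_normal((3, P))          # rigid 3D shape
S_h = np.vstack([S, np.ones((1, P))])    # homogeneous shape, 4 x P

# Pose matrix Q: two rows per frame, [first two rows of R_t | 2D translation].
Q = np.vstack([
    np.hstack([random_rotation(rng)[:2], rng.standard_normal((2, 1))])
    for _ in range(F)
])                                       # 2F x 4

W = Q @ S_h                              # 2F x P measurement matrix
rigid_rank = np.linalg.matrix_rank(W, tol=1e-8)
print("rank of W:", rigid_rank)
```

Because W is the product of a 2F x 4 matrix and a 4 x P matrix, its rank cannot exceed 4, whatever the number of frames or points.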
Nonrigid motion can also be factored into two matrices, but the rank of these matrices will be higher than in the rigid case.
Shape will be represented by K basis shapes, each described by a 3 x P matrix.
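Assuming weak perspective projection, the 2D points in frame t are

$$\begin{bmatrix} u_1 & \cdots & u_P \\ v_1 & \cdots & v_P \end{bmatrix} = R_t \left( \sum_{i=1}^{K} l_{t,i} S_i \right) + T_t$$

where S_i are the basis shapes, l_{t,i} are their weights in frame t, R_t holds the first two rows of a scaled rotation matrix, and T_t is the 2D translation.
Eliminate T_t by subtracting the mean of all 2D points and rewrite as matrix multiplication:

$$\begin{bmatrix} u_1 & \cdots & u_P \\ v_1 & \cdots & v_P \end{bmatrix} = \begin{bmatrix} l_{t,1} R_t & \cdots & l_{t,K} R_t \end{bmatrix} \begin{bmatrix} S_1 \\ \vdots \\ S_K \end{bmatrix}$$

Combine all individual equations into one large matrix:

$$W = \begin{bmatrix} l_{1,1} R_1 & \cdots & l_{1,K} R_1 \\ \vdots & & \vdots \\ l_{F,1} R_F & \cdots & l_{F,K} R_F \end{bmatrix} \begin{bmatrix} S_1 \\ \vdots \\ S_K \end{bmatrix} = Q B$$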
Since Q is of size 2F x 3K and B is of size 3K x P, the rank of W satisfies r ≤ 3K.
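A small numerical sketch of this bound follows. It assumes the block structure implied by the stated sizes: each pair of rows of Q is [l_1 R ... l_K R] for that frame's weights l and 2 x 3 rotation rows R, and B stacks the K basis shapes. The data are synthetic, and the sizes and the name `weights` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
F, P, K = 40, 60, 3  # frames, points, basis shapes (illustrative sizes)


def random_rotation(rng):
    """Random 3x3 rotation built from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


basis = rng.standard_normal((K, 3, P))   # K basis shapes, each 3 x P
B = basis.reshape(3 * K, P)              # basis shapes stacked vertically, 3K x P
weights = rng.standard_normal((F, K))    # per-frame basis weights (assumed name)

row_blocks = []
for t in range(F):
    R_t = random_rotation(rng)[:2]       # first two rows of a rotation, 2 x 3
    row_blocks.append(np.hstack([weights[t, k] * R_t for k in range(K)]))
Q = np.vstack(row_blocks)                # 2F x 3K

W = Q @ B                                # 2F x P
clean_rank = np.linalg.matrix_rank(W, tol=1e-8)

# Adding a little noise destroys the exact low-rank structure.
W_noisy = W + 1e-3 * rng.standard_normal(W.shape)
noisy_rank = np.linalg.matrix_rank(W_noisy, tol=1e-8)

print("noise-free rank:", clean_rank, "bound 3K:", 3 * K)
print("rank with noise:", noisy_rank)
```

In practice the bound is therefore applied to the dominant singular values of W rather than to its exact rank.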
The bound r ≤ 3K assumes that the sequence is free of noise; it is used as the bound on the rank of W in the case of nonrigid motion.

Basis Flow
Assuming rank r, each column of W can be modeled as a linear combination of r “basis tracks,” denoted Q.
Q is estimated by removing all but the most reliable tracks in W and then computing the SVD of what remains. The first r left singular vectors are taken as the first estimate of the basis tracks.
This eigenbase is then applied to the entire W matrix to estimate all P tracks.
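The two steps above can be sketched on synthetic data. This is an illustration under stated assumptions, not the authors' implementation: the reliability flags are invented, and the basis is applied by projecting noisy tracks onto it, which stands in for the image-based estimation used in the presentation.

```python
import numpy as np

rng = np.random.default_rng(2)
F, P, r = 20, 50, 6  # frames, tracks, assumed rank (illustrative sizes)

# Synthetic ground-truth tracks of exact rank r.
W_true = rng.standard_normal((2 * F, r)) @ rng.standard_normal((r, P))

# Hypothetical reliability flags: the first 15 tracks are treated as reliable.
reliable = np.zeros(P, dtype=bool)
reliable[:15] = True

# Unreliable tracks are corrupted by noise; reliable ones are left clean.
W_obs = W_true.copy()
n_unreliable = int((~reliable).sum())
W_obs[:, ~reliable] += 0.5 * rng.standard_normal((2 * F, n_unreliable))

# Step 1: SVD of the reliable tracks; keep the first r left singular vectors.
U, s, Vt = np.linalg.svd(W_obs[:, reliable], full_matrices=False)
Q = U[:, :r]                  # 2F x r basis tracks with orthonormal columns

# Step 2: express every track in the basis (least squares, since Q^T Q = I).
B = Q.T @ W_obs               # r x P coefficients
W_hat = Q @ B                 # all P tracks constrained to the rank-r subspace

err_before = np.linalg.norm(W_obs - W_true)
err_after = np.linalg.norm(W_hat - W_true)
print("error before projection:", err_before)
print("error after projection: ", err_after)
```

Projecting onto the basis discards the part of the noise that lies outside the r-dimensional subspace, which is why the constrained tracks end up closer to the true ones.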
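The Lucas-Kanade optical flow equation for a single patch:

$$\begin{bmatrix} c & d \\ d & e \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} f \\ g \end{bmatrix}$$

Since the entire sequence is assumed to have a single image template, this equation can be rewritten for all patches and frames as:

$$\begin{bmatrix} U & V \end{bmatrix} \begin{bmatrix} C & D \\ D & E \end{bmatrix} = \begin{bmatrix} F & G \end{bmatrix}$$

where C, D, E are P x P diagonal matrices containing the c, d, e values for all patches, and F, G are F x P matrices containing the f and g values for each patch and frame.
Now split Q into two matrices, Q_u and Q_v, one containing the even rows of Q and the other containing the odd rows of Q, so that U = Q_u B and V = Q_v B. Since Q is a basis for W:

$$\begin{bmatrix} Q_u B & Q_v B \end{bmatrix} \begin{bmatrix} C & D \\ D & E \end{bmatrix} = \begin{bmatrix} F & G \end{bmatrix}$$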
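One way to solve for B is sketched below. It assumes the system has the form Q_u B C + Q_v B D = F and Q_u B D + Q_v B E = G, with Q_u and Q_v the two halves of Q; because C, D, E are diagonal, each column of B can then be found from its own small least-squares problem. All data are synthetic, and the names `F_mat`, `G_mat`, and `solve_basis_coefficients` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_frames, P, r = 12, 8, 4  # frames, patches, basis size (illustrative sizes)

Q_u = rng.standard_normal((n_frames, r))   # rows of Q that generate U
Q_v = rng.standard_normal((n_frames, r))   # rows of Q that generate V
B_true = rng.standard_normal((r, P))

# Per-patch sums of gradient products (stand-ins for real image gradients).
Ix = rng.standard_normal((P, 25))
Iy = rng.standard_normal((P, 25))
c = (Ix * Ix).sum(axis=1)
d = (Ix * Iy).sum(axis=1)
e = (Iy * Iy).sum(axis=1)

# Right-hand sides consistent with the assumed equations.
U, V = Q_u @ B_true, Q_v @ B_true
F_mat = U * c + V * d      # same as U @ diag(c) + V @ diag(d)
G_mat = U * d + V * e      # same as U @ diag(d) + V @ diag(e)


def solve_basis_coefficients(Q_u, Q_v, c, d, e, F_mat, G_mat):
    """Solve the assumed flow system for B, one column (patch) at a time."""
    r, P = Q_u.shape[1], c.size
    B = np.empty((r, P))
    for p in range(P):
        A = np.vstack([c[p] * Q_u + d[p] * Q_v,
                       d[p] * Q_u + e[p] * Q_v])          # 2F x r
        rhs = np.concatenate([F_mat[:, p], G_mat[:, p]])  # length 2F
        B[:, p], *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return B


B_est = solve_basis_coefficients(Q_u, Q_v, c, d, e, F_mat, G_mat)
print("max abs error in B:", np.abs(B_est - B_true).max())
```

Each patch contributes 2F equations for only r unknowns, so the basis constraint turns an underdetermined per-frame flow problem into an overconstrained one.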
The values determined this way are used as initial estimates of the actual values.
The image is warped according to these estimates, and the process iterates to refine them until they are close to the actual values.

Occlusion
By vectorizing the B matrix into a Pr-dimensional vector b, occlusion can be handled:
Missing entries due to occlusion or mistracking are removed from the matrices on both sides.
If enough points are still visible, the system is still overconstrained and can be solved.
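The idea can be illustrated on the simpler system W = QB with known Q. This is a sketch under that assumption, with a synthetic visibility mask; the presentation applies the same row removal to its flow equations instead. It uses the identity vec(QB) = (I kron Q) vec(B) with column-major vectorization.

```python
import numpy as np

rng = np.random.default_rng(4)
F, P, r = 10, 12, 3  # frames, points, basis size (illustrative sizes)

Q = rng.standard_normal((2 * F, r))
B_true = rng.standard_normal((r, P))
W = Q @ B_true

# Synthetic visibility mask: True where an entry of W was observed.
visible = rng.random(W.shape) > 0.3

# vec(Q B) = (I_P kron Q) vec(B), stacking columns (column-major order).
A = np.kron(np.eye(P), Q)            # (2F * P) x (r * P)
w = W.flatten(order="F")             # vec(W)
keep = visible.flatten(order="F")    # rows of the system that survive

# Drop the rows for missing entries on both sides and solve for b = vec(B).
b, *_ = np.linalg.lstsq(A[keep], w[keep], rcond=None)
B_est = b.reshape((r, P), order="F")

# Fill in the missing entries from the recovered factorization.
W_filled = np.where(visible, W, Q @ B_est)
print("max abs error in B:", np.abs(B_est - B_true).max())
```

The solve succeeds here because every column keeps more than r visible entries; with too few, the reduced system would no longer determine b.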
The displacement of the missing points can later be estimated using the product QB.

3D Reconstruction
The factorization of W into Q and B is not unique. Other factorizations can be found by inserting any invertible matrix A:

$$W = Q B = (Q A)(A^{-1} B)$$
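The ambiguity is easy to confirm numerically. A minimal sketch with random matrices (sizes are illustrative): any invertible A yields a different pair of factors with the same product.

```python
import numpy as np

rng = np.random.default_rng(5)
F, P, r = 8, 20, 6  # illustrative sizes

Q = rng.standard_normal((2 * F, r))
B = rng.standard_normal((r, P))
W = Q @ B

A = rng.standard_normal((r, r))      # a random square matrix is almost surely invertible
Q_alt = Q @ A
B_alt = np.linalg.solve(A, B)        # computes A^{-1} B without forming the inverse

same_product = np.allclose(Q_alt @ B_alt, W)
different_factors = not np.allclose(Q_alt, Q)
print(same_product, different_factors)
```

Recovering pose and shape therefore requires extra constraints that single out one particular A.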
Q does not initially comply with its necessary structure:
Tomasi and Kanade suggest using a linear approximation scheme in the rigid case. The sub-blocks of Q are treated as rotation matrices.
A similar approach can be used in the nonrigid case, but a second factorization step is necessary.

Sub-Block Factorization
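The second factorization step can be sketched as follows. This is a hedged reconstruction, not taken from the slides: it assumes each 2 x 3K row block of Q has the form [l_1 R ... l_K R] for one 2 x 3 matrix R and K weights, so that rearranging the block into a K x 6 matrix gives an outer product of the weights and the entries of R. The function `reorder_subblock` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(6)
K = 3  # number of basis shapes (illustrative)

# Synthetic block: 2 x 3 matrix with orthonormal rows, scaled, plus K weights.
R_t = 1.7 * np.linalg.qr(rng.standard_normal((3, 3)))[0][:2]
l_t = rng.standard_normal(K)
Q_t = np.hstack([l_t[k] * R_t for k in range(K)])    # 2 x 3K


def reorder_subblock(Q_t, K):
    """Rearrange a 2 x 3K block so that row k holds the 6 entries of l_k * R."""
    return Q_t.reshape(2, K, 3).transpose(1, 0, 2).reshape(K, 6)


Q_bar = reorder_subblock(Q_t, K)                     # K x 6, rank one
U, s, Vt = np.linalg.svd(Q_bar)

# Rank-1 factorization: weights and rotation entries, up to a shared scale and sign.
l_est = U[:, 0] * s[0]
R_est = Vt[0].reshape(2, 3)

print("singular values:", s)
```

Only the first singular value is significant for this noise-free block; the weights and R are recovered up to a common scale factor and sign.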
Since Q_t now has a rank of one, a nonlinear optimization method is used to find an invertible A that orthonormalizes all of the sub-blocks.
This produces a matrix with scaled rotation matrices as its sub-blocks.

Limitations
Sub-block factorization breaks down in noisy and ambiguous cases. The second and higher singular values will not vanish in some of the sub-blocks, which leads to poor rank-1 approximations of R_t.
The alternative is an iterative technique that solves the system directly.

Iterative Optimization
The first step in this method is the same as in sub-block factorization. R is factored into Q_rig and B_rig. Q_rig is then reorganized into a matrix with sub-blocks that are weak perspective rotation matrices.
Q_rig is then used as an initial guess of the pose of the nonrigid object.
1. Linear least squares is used to find B.
2. Linear least squares is used to find L.
3. Solve for R, such that the R_t fit the equation:
To solve for R such that the R_t remain rotation matrices, each R_t can be parameterized with exponential coordinates.
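A minimal sketch of the exponential-coordinate parameterization, using Rodrigues' formula. The helper names `skew` and `exp_so3` are illustrative, and composing the update by left-multiplying the previous estimate is an assumption rather than something the slides specify.

```python
import numpy as np


def skew(w):
    """3x3 skew-symmetric matrix such that skew(w) @ x equals cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def exp_so3(w):
    """Rotation matrix for exponential coordinates w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W_hat = skew(w)
    if theta < 1e-12:
        return np.eye(3) + W_hat    # first-order approximation near zero
    return (np.eye(3)
            + np.sin(theta) / theta * W_hat
            + (1.0 - np.cos(theta)) / theta**2 * W_hat @ W_hat)


# Update a previous rotation estimate with a small step in exponential coordinates.
R_prev = exp_so3(np.array([0.3, -0.2, 0.5]))
delta = np.array([0.01, 0.02, -0.015])      # e.g. the solution of a linearized system
R_new = exp_so3(delta) @ R_prev

# A quarter turn about the z axis maps the x axis onto the y axis.
R_z90 = exp_so3(np.array([0.0, 0.0, np.pi / 2]))
print(np.round(R_z90, 6))
```

Whatever three-vector the solver returns, the updated matrix is still orthonormal with determinant one, which is the reason for optimizing over these coordinates instead of the nine matrix entries.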
Linearizing the previous equation around the previous estimate leads to:
These steps are iterated until convergence. Missing entries are handled as before.

Multi-View Input
If M cameras are used, the input matrix W is enlarged to size 2FM x P.
The process is the same as before, but now there is an additional constraint that the deformation must be the same for every camera in every frame.

Basis Shapes
The number of basis shapes used has an effect on the error in the result. Here is some sample K vs. error data:
The number of basis shapes needed is not currently determined automatically.
A simple solution to this problem is to keep increasing K until the error falls below a chosen threshold.

Results ...
This note was uploaded on 06/13/2011 for the course CAP 6412 taught by Professor Staff during the Spring '08 term at University of Central Florida.