larger than the cost of computing an estimate
for the derivative of entropy. Since pose does not change much between iterations of
gradient descent, it has proven sufficient to Z-buffer every 300 iterations.
4. The derivative dv(y)/dy is the spatial gradient of the image.
5. The metric used for comparing points sampled from the image is squared difference.
The representation of joint events, w = {v(T(x)), u(x)}, is somewhat more complex. We
will represent only two dimensions of the normal vector: the x and y components. Since
the normal is always a unit vector, the z component is redundant. The joint events are
therefore three dimensional vectors, two components from the model and one from the
image. We will use Euclidean distance to measure the distance between joint events.
6. Since we will be using diagonal covariance matrices for the smoothing functions, four
variances are required: three for the joint entropy and one for the image entropy. Based
on maximum likelihood estimates from aligned objects, we have settled on a single set
of smoothing parameters that we will use for all of our 3D alignment experiments. For
the joint entropy, the variances of the x and y components of the normal are both 0.3 and
the variance for image intensity is 0.2. For the image entropy, the variance for image
intensity is 0.15. Having a single set of parameters for every experiment is possible
in part because we have pre-normalized all images so that their variance is 1.0.
108 5.1. ALIGNMENT OF 3D OBJECTS TO VIDEO AI-TR 1548
7. We will use a value of 0.01 for pmin. The alignment process shows very little sensitivity
to pmin. We have repeated a number of experiments with pmin values of 0.1 and
1.0. Our results are not significantly different. Values that are more than a factor
of 10 smaller than 0.01 cause the derivative of estimated entropy to be too noisy (see
Section 3.3). This noise can prevent convergence to the correct pose.
8. Rather than draw two different samples, we will use the cross-validation approximation
(see Section 2.4.3). In all of our experiments we use a sample size of 25.
9. Finally, we must choose a parameter update rate, λ. Actually, since the units of
rotation and translation are very different, two update rates are necessary. Internally we
represent rotations in radians and translations in millimeters. For an object with a 100
millimeter radius, a rotation of 0.01 radians about the center of mass can translate a
model point up to 1 millimeter. A translation of 0.01 can at most translate a model
point 0.01 millimeters. The derivative of mutual information with respect to a model
point's position is a combination of a rotation and a translation. A small step in the
direction of the derivative will move the model point up to 100 times further by rotation
than by translation. If there is only a single update rate, a poor compromise must be made
between the rapid changes that arise from rotation and the slow changes that arise
from translation. If the rotation update rate is reduced by a factor of 100, the model
point will move approximately as far by rotation as it does by translation. Scale issues
such as these do not arise when more complex gradient descent techniques are used, for
example conjugate gradient descent or Levenberg-Marquardt. Unfortunately, neither
of these techniques can use stochastic estimates of the gradient. Since our models have
a radius that is on the order of 100 millimeters, we have chosen rotation update rates
that are 100 times smaller than translation rates. Most of our 3D alignment experiments
proceed in two stages. In the first stage the rotation update rate is 0.0005 and the
translation update rate is 0.05. After a number of iterations the update rates are
reduced to 0.0001 and 0.01 respectively. We have chosen a simple automatic descent
procedure in an effort to simplify subsequent analysis of convergence.

The realization of the basic framework is summarized in Table 5.2.
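The scale argument in item 9 can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the implementation used in the experiments: the function names, the use of three generic rotation parameters, and the plain additive update are all assumptions made here. The radius and the two-stage rates are the values quoted in the text; the number of iterations per stage is not given, so it is left out.

```python
import math

RADIUS_MM = 100.0  # model radius quoted in the text

# (rotation rate, translation rate) for the two stages quoted in the text.
STAGES = [(0.0005, 0.05), (0.0001, 0.01)]


def rotation_displacement_bound(radius_mm, angle_rad):
    """Arc length: an upper bound on how far a point at radius_mm can move
    when the model rotates by angle_rad about its center of mass."""
    return radius_mm * abs(angle_rad)


def step(rotation, translation, d_rotation, d_translation,
         rotation_rate, translation_rate):
    """One step along the derivative, using a separate rate for the
    rotation parameters (radians) and the translation parameters (mm).
    The list-of-three parameterization is hypothetical."""
    new_rotation = [r + rotation_rate * g
                    for r, g in zip(rotation, d_rotation)]
    new_translation = [t + translation_rate * g
                       for t, g in zip(translation, d_translation)]
    return new_rotation, new_translation


# A parameter change of 0.01 applied as a rotation versus as a translation.
rotation_move_mm = rotation_displacement_bound(RADIUS_MM, 0.01)
translation_move_mm = 0.01
scale_ratio = rotation_move_mm / translation_move_mm

# First-stage update from a zero pose with a unit derivative in every parameter.
first_rotation, first_translation = step(
    [0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0], [1.0, 1.0, 1.0],
    *STAGES[0],
)
```

With these numbers the rotation moves the point about 100 times further than the translation does, which is the factor by which both stages scale the rotation rate down relative to the translation rate.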
109 Paul A. Viola CHAPTER 5. ALIGNMENT EXPERIMENTS
1. Define the model and image, u and v: u contains points
distributed on the surface of the object. Each point has an
associated normal. v is the image of intensities.
2. Sampling x: The sampling is determined by the distribution of
surface points which is close to uniform.
3. Transformation space T : The space of rigid 3D rotations and
translations by perspective projection using an estimate for the
4. Definition of dv(y)/dy: This is the intensity gradient.
5. Distance metric: Euclidean distance.
6. Variance, ψ: Assuming diagonal covariance matrices, four
different variances are necessary, three for the joint entropy estimate
and one for the image entropy estimate. The variances were 0.3,
0.3, and 0.2 for the x component of the normal, y component of
the normal, and image intensity. The variance was 0.15 for the
7. Minimum probability, pmin: 0.01.
8. Number of samples: One sample of 25 using cross-validation.
9. Update rate, λ: Rotation rate: 0.0005 f...
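The joint-event representation and the smoothing variances listed above can be illustrated with a small sketch. It assumes that the smoothing functions are Gaussian kernels averaged over the sample (a Parzen-style density estimate); the text here only says that they have diagonal covariance. All function names are hypothetical, and pmin is deliberately left out because its exact role is described elsewhere (Section 3.3).

```python
import math

# Variances quoted in the text: normal x, normal y, image intensity.
JOINT_VARIANCES = (0.3, 0.3, 0.2)
IMAGE_VARIANCE = 0.15
SAMPLE_SIZE = 25  # sample size used in all experiments


def joint_event(normal, intensity):
    """Three-dimensional joint event: the x and y components of the unit
    normal plus the image intensity. The z component is dropped because
    it is redundant for a unit vector."""
    nx, ny, _nz = normal
    return (nx, ny, intensity)


def diagonal_gaussian(diff, variances):
    """Gaussian density with diagonal covariance, evaluated at offset diff.
    The exponent is a per-axis scaled squared Euclidean distance."""
    exponent = -0.5 * sum(d * d / var for d, var in zip(diff, variances))
    norm = math.sqrt(math.prod(2.0 * math.pi * var for var in variances))
    return math.exp(exponent) / norm


def smoothed_density(point, sample, variances):
    """Average of kernels centered on each element of the sample."""
    total = 0.0
    for element in sample:
        diff = [p - q for p, q in zip(point, element)]
        total += diagonal_gaussian(diff, variances)
    return total / len(sample)


# Usage: a surface point whose normal faces the camera, with intensity 0.5.
event = joint_event((0.0, 0.0, 1.0), 0.5)
peak = smoothed_density(event, [event], JOINT_VARIANCES)
```

Because the images are pre-normalized to unit variance, the same variances can be reused across experiments, which is why they appear here as constants.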