1995_Viola_thesis_registrationMI



5.1. ALIGNMENT OF 3D OBJECTS TO VIDEO    AI-TR 1548

... experiments have we pre-segmented the image. The initial poses often project the model into regions of the image that contain a significant amount of clutter. EMMA reliably settles on a pose where few if any of the model points project onto the background.

In answer to the second question, EMMA requires roughly 35 seconds on a Sun SparcStation5 for each of the alignments shown above. Run times are identical because we have chosen to use a fixed number of update iterations for each alignment experiment. In some cases an accurate alignment was obtained well before the full number of iterations had been completed. In others it appeared that the final alignment could have been improved if the number of iterations were increased.

There are few if any principled results on the convergence of stochastic approximation. Convergence detection is a subtle issue. For example, EMMA does not make a direct estimate of the mutual information between model and image. During alignment only a stochastic estimate of the gradient is available. It may be possible to construct an ad hoc procedure that would be able to detect convergence. Alignment could then be continued until the pose estimate had converged.

Figure 5.5: Final pose of the skull model after alignment.

From an analysis of the program's memory access and computation patterns, we conclude that an implementation on a digital signal processor would be as much as 100 times faster than our current implementation. One major issue is cache performance. Because EMMA randomly accesses each of the points in the image and model, much time is wasted flushing and refilling the cache. The cache on a general purpose processor is often fairly limited. Most digital signal processors include a large quantity of fast SRAM, eliminating the need for a cache.
For random memory accesses a digital signal processor should be approximately 5 times faster than a conventional computer. The inner loop of the EMMA derivative estimation procedure is dominated by simple floating point operations. Modern digital signal processors can execute these instructions 10 to 20 times faster than conventional computers. Together these advantages should lead to an overall improvement in speed of between 50 and 100 times.

A number of randomized experiments were performed to determine the reliability, accuracy and repeatability of alignment. These data are reported in Table 5.3. An initial alignment was performed to establish a base pose. This pose, shown in Figure 5.5, is used as a point of reference. A set of randomized experiments was performed where the base pose is first perturbed, and then EMMA is used to re-align the image and model. The perturbation is computed as follows: a random uniformly distributed offset is added to each translational axis (labeled ΔT), and then the model is rotated about a randomly selected axis by a random uniformly selected angle (labeled Δθ). There were four experiments, each including 50 random initial poses. The distribution of the final and initial poses can be compared by comparing the variance of the location of the centroid, computed separately in X, Y and Z. Furthermore, the average angular rotation from the true pose is computed (labeled |Δθ|). Finally, the number of poses that failed to converge near the correct solution is reported. The final statistics are only evaluated over the poses that converged near the correct solution.

Paul A. Viola    CHAPTER 5. ALIGNMENT EXPERIMENTS

Figure 5.6: Final pose of the skull model after alignment.

ΔT XYZ   Δθ       INITIAL                           FINAL                             Success
(mm)     (deg)    X       Y       Z       |Δθ|      X (mm)  Y (mm)  Z       |Δθ|      (%)
10       10       5.94    5.56    6.11    5.11      .61     .53     5.49    3.22      100
30       10       16.53   18.00   16.82   5.88      1.80    .81     14.56   2.77      96
20       20       10.12   12.04   10.77   11.56     1.11    .41     9.18    3.31      96
10; 20   20; 40   14.83   15.46   14.466  28.70     1.87    2.22    14.19   3.05      78

Table 5.3: Skull Results Table. The final column contains the percentage of poses that successfully converged to a pose near the correct pose.

These experiments demonstrate that the alignment procedure is reliable when the initial pose is close to the "correct" pose. Outside of this range gradient descent, by itself, is not capable of converging to the correct solution. The capture range is not unreasonably small, however. Translations as large as half the diameter of the skull can be accommodated, as can rotations in the plane of up to 45 degrees. Empirically it seems that alignment is most sensitive to rotation in depth. This is not terribly surprising since only the visible points play a role in the calculation of the derivative. As a result, when the chin is hidden the derivative gives no information about how to move the chin out from behind the rest of the skull.

Figure 5.7: Final pose of the skull model after alignment.

Finally, we have done a number of experiments to demonstrate that EMMA alignment can deal with occlusion. Figure 5.9 shows an initial and final alignment for an image that includes an artificial occlusion...
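
The randomized perturbation protocol used for the Table 5.3 experiments can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the experiments: the function names (random_unit_axis, perturb_pose, summarize) are hypothetical, the offset and angle ranges are assumed to be symmetric about zero, the rotation is assumed to be about the model centroid (so that the centroid displacement equals the translational offset), and the spread of the centroid is summarized by a per-axis standard deviation.

```python
import math
import random


def random_unit_axis(rng):
    """Draw a rotation axis uniformly on the unit sphere (normalized Gaussian)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-12:
            return [c / n for c in v]


def perturb_pose(rng, max_offset_mm, max_angle_deg):
    """Return (offset, axis, angle) describing one random perturbation.

    Assumption: each translational offset is uniform in
    [-max_offset_mm, +max_offset_mm] and the rotation angle is uniform in
    [-max_angle_deg, +max_angle_deg].
    """
    offset = [rng.uniform(-max_offset_mm, max_offset_mm) for _ in range(3)]
    axis = random_unit_axis(rng)
    angle = rng.uniform(-max_angle_deg, max_angle_deg)
    return offset, axis, angle


def summarize(offsets, angles):
    """Per-axis standard deviation of the offsets and mean absolute angle."""
    n = len(offsets)
    stds = []
    for k in range(3):
        mean = sum(o[k] for o in offsets) / n
        var = sum((o[k] - mean) ** 2 for o in offsets) / n
        stds.append(math.sqrt(var))
    mean_abs_angle = sum(abs(a) for a in angles) / n
    return stds, mean_abs_angle


# One experiment: 50 random initial poses around the base pose.
rng = random.Random(0)
trials = [perturb_pose(rng, 10.0, 10.0) for _ in range(50)]
stds, mean_abs_angle = summarize([t[0] for t in trials], [t[2] for t in trials])
print("initial spread per axis (mm):", stds)
print("initial mean |angle| (deg):", mean_abs_angle)
```

Under these assumptions a uniform offset in the range of plus or minus 10 mm has a standard deviation of 10/sqrt(3), about 5.77 mm, and a mean absolute angle of 5 degrees, which appears consistent with the INITIAL entries in the first row of Table 5.3. The same summary could be applied to the final poses, restricted to the trials that converged near the correct solution.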