…or 3200 steps and then 0.0001 for 3200 steps. Translation rate: 0.05 and then 0.01.

Table 5.2: Summary of 3D to video alignment.

5.1. ALIGNMENT OF 3D OBJECTS TO VIDEO (AI-TR 1548)

Figure 5.1: A typical image of the skull object.

5.1.1 Alignment of Skull Model

In our first experiment we will align a 3D skull model to a number of different images. The skull model was produced automatically from a Computed Tomography (CT) scan of a plastic skull.¹ The same plastic skull was then photographed in a number of different poses under natural lighting.² The skull model contains 65,000 points. The video images are 240 by 320 pixels. Figure 5.1 is an example video image of the skull.

Figure 5.2 contains a representation of the shape of the skull model: an image displaying the distance from the camera to the visible points on the skull model, with white further and black nearer. This image is computed by projecting each model point into the image plane. The pixel to which a model point projects records the distance of that point from the camera. A number of model points may, however, project to the same image pixel; in this case, the depth of the model point nearest the camera is used. Since the model is constructed from a collection of points, it is not dense. As a result there are some pixels to which no model point projects; a few of these pixels, which remain white, appear throughout the model.

¹ Thanks to Gil Ettinger and Ron Kikinis for providing the skull model. Their work on medical registration using this model is described in [Grimson et al., 1994].
² Thanks to J. P. Mellor for providing the skull images. His work on registration is described in [Mellor, 1995].

Paul A. Viola, CHAPTER 5. ALIGNMENT EXPERIMENTS

Figure 5.2: A depth map of the skull model. See text for description.

We can use a Lambertian reflectance model to render a graphical picture of the skull.
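The depth-map construction just described (each model point projects to a pixel, and when several points land on the same pixel the depth nearest the camera wins) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name, the pinhole intrinsic matrix K, and the image size are assumptions.

```python
import numpy as np

def point_depth_map(points_cam, K, shape=(240, 320)):
    """Render a depth map from a sparse point model (hypothetical helper).

    points_cam: (N, 3) model points already in camera coordinates (z > 0).
    K: 3x3 pinhole intrinsic matrix (an assumption; none is given in the text).
    Pixels hit by several points keep the depth nearest the camera;
    pixels hit by no point stay at +inf (they would render white).
    """
    depth = np.full(shape, np.inf)
    z = points_cam[:, 2]
    uvw = (K @ points_cam.T).T                      # project into the image plane
    u = np.round(uvw[:, 0] / z).astype(int)
    v = np.round(uvw[:, 1] / z).astype(int)
    ok = (z > 0) & (u >= 0) & (u < shape[1]) & (v >= 0) & (v < shape[0])
    for ui, vi, zi in zip(u[ok], v[ok], z[ok]):
        if zi < depth[vi, ui]:                      # keep the nearest model point
            depth[vi, ui] = zi
    return depth
```

Because the model is a sparse point set rather than a surface mesh, pixels that no point reaches remain at infinity, which produces the scattered white pixels the text mentions.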
A Lambertian model relates the surface normals of the model to image intensity:

    v(T(x)) = \sum_i \alpha_i \, \bar{l}_i \cdot u(x)    (5.1)

where the model value u(x) is the normal vector of a surface patch, \bar{l}_i is a vector pointing toward light source i, and \alpha_i is proportional to the intensity of that light source. Figure 5.3 shows a rendered version of the model in the same pose as Figure 5.1. To the human eye this sort of image is more readily interpretable than a depth map: we can bring to bear our substantial visual competence when the shape of an object is rendered as an image. From Figure 5.3 it is almost immediately clear that the pose of the object model is close to correct. There is, however, no simple relationship between the intensities of the video image and the rendered image.

The goal of this first experiment is to answer three questions: (1) Can EMMA align a complex 3D object model to a number of different images taken under uncontrolled lighting? (2) How long does EMMA alignment take to run? (3) What is the range of poses from which a "correct" alignment can be obtained? Regarding this third point, we do not have ground-truth information about either the pose of the object or the camera parameters of the video camera. The "correct" pose has been determined by inspection of the alignment results. We can, however, ask a related question about reliability: how far can the object be perturbed away from the "correct" pose and still have EMMA alignment reliably re-align it?

Figure 5.3: A rendered image of the skull model.

To answer our first question, we must establish that in the six-dimensional space of rigid transformations there is a maximum of mutual information at a plausible alignment pose. For each image the object model was initially adjusted, by eye, so that its pose was close to correct. EMMA alignment was then used to pull the object into a "correct" pose.
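Equation (5.1) can be evaluated directly once the visible surface normals are known. The sketch below is a hypothetical vectorized evaluation of that sum; clamping back-facing normals to zero intensity is a common rendering convention that the equation leaves implicit.

```python
import numpy as np

def lambertian_render(normals, lights, alphas):
    """Evaluate equation (5.1): v = sum_i alpha_i * (l_i . u(x)).

    normals: (N, 3) unit surface normals u(x) for the visible model points.
    lights:  (M, 3) unit vectors l_i pointing toward each light source.
    alphas:  (M,) light intensities alpha_i.
    Negative dot products (patches facing away from a light) are clamped
    to zero -- an assumption, since the text does not state this.
    """
    dots = normals @ lights.T                   # (N, M) matrix of l_i . u(x)
    return np.clip(dots, 0.0, None) @ alphas    # weighted sum over lights
```

A patch whose normal points straight at the only light receives the full intensity alpha; a patch facing away receives none, which is what gives the rendered skull in Figure 5.3 its shading.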
One scheme for assessing the quality of an alignment is to display the model pose and the video image together. This can be done by taking a random collection of model points, projecting them into the coordinate frame of the image, and setting the pixels to which they project to white. The nature of the alignment is readily apparent from such images: when the model and image are misaligned, model points project onto the background and the coverage of the object's image is incomplete; when the model and image are correctly aligned, there is close agreement between the occluding contours of the model points and the object's image. Figure 5.4 shows an initial, incorrect pose in this way. Figure 5.5 shows the final pose obtained after running EMMA alignment. Figures 5.6, 5.7, and 5.8 show the final alignment obtained for three other images. Notice that in each of these images the boundaries of the skull are in close agreement with the outline of the model points.

Figure 5.4: Initial pose of the skull model before alignment.

We would like to emphasize that in none of these ex...
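The overlay visualization described above can be sketched as follows, again with hypothetical names and an assumed pinhole intrinsic matrix K; the thesis describes the idea but not an implementation.

```python
import numpy as np

def overlay_model_points(image, points_cam, K, n_samples=500, seed=0):
    """Overlay a random sample of projected model points in white.

    image: (H, W) grayscale video frame; points_cam: (N, 3) model points
    in camera coordinates under the current pose estimate.
    Misalignment shows as white dots spilling onto the background and
    gaps on the object; good alignment traces the occluding contour.
    """
    rng = np.random.default_rng(seed)
    out = image.copy()
    idx = rng.choice(len(points_cam),
                     size=min(n_samples, len(points_cam)), replace=False)
    pts = points_cam[idx]
    z = pts[:, 2]
    uvw = (K @ pts.T).T                             # project into the image
    u = np.round(uvw[:, 0] / z).astype(int)
    v = np.round(uvw[:, 1] / z).astype(int)
    ok = (z > 0) & (u >= 0) & (u < out.shape[1]) & (v >= 0) & (v < out.shape[0])
    out[v[ok], u[ok]] = 255                         # set projected pixels white
    return out
```

Sampling a random subset of the 65,000 model points keeps the underlying video image visible between the white dots, which is what makes the agreement (or disagreement) of the contours easy to judge by eye.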