1995_Viola_thesis_registrationMI



[Paul A. Viola, Chapter 4. Matching and Alignment]

[...] increase the mutual information, then the effect of the first term in the brackets may be interpreted as acting to increase the squared distance between pairs of samples that are nearby in image intensity, while the second term acts to decrease the squared distance between pairs of samples that are nearby in both image intensity and the model properties. It is important to emphasize that distances are in the space of values (intensities, brightness, or surface properties), rather than coordinate locations.

The term $\frac{d}{dT}(v_i - v_j)$ will generally involve gradients of the image intensities and the derivative of the transformed coordinates with respect to the transformation. In the simple case that $T$ is a linear operator, we obtain the following outer product expression:

$$\frac{d}{dT}\, v(T(x_i)) = \nabla v(T(x_i))\, x_i^T .$$

4.4 Matching and Minimum Description Length

There is another, entirely different motivation for using mutual information as an alignment metric. Alignment, and many other vision problems, can be reformulated as minimum description length (MDL) problems (Rissanen, 1978; Leclerc, 1988). MDL can provide us with some new insight into the problem of alignment and help us derive a missing and often useful term in the alignment equations.

The standard framework of MDL involves a sender and a receiver communicating descriptions of images. Given that the sender and the receiver have an agreed-upon language for describing images, the sender's goal is to find a message that will accurately describe an image in the fewest bits. The concept of description length is clearly related to the code length introduced in Section 2.2.

For the problem of alignment we will assume that the sender and the receiver share the same set of object models. The sender's goal is to communicate an image of one of these models. Knowing nothing else, the sender could simply ignore the models and send a message describing the entire image.
This would require a message that is on average as long as the entropy of the image. However, whenever the image is an observation of a model, a more efficient approach is possible. For example, the sender could send the pose of the model and a description of how to render it. From these the receiver can reconstruct the part of the original image in which the model lies. To send the entire image, the sender need only encode the errors in this reconstruction, if there are any, and any part of the image that is unexplained by the model. Alignment can be thought of as the process by which the sender attempts to find the model pose that minimizes the code length of the overall message.

The encoding of the entire image has several parts: (1) a message describing the pose; (2) a message describing the imaging function; (3) a message describing the errors in the reconstruction; and (4) a message describing the parts of the image unexplained by the model. The length of each part of the message is proportional to its entropy. We can assume that poses are uniformly distributed, and that sending a pose incurs some small uniform cost. The length of part 4 is the entropy of the image that is unexplained.

Parts 2 and 3 can be interpreted in two ways. In the first, we assume that the imaging function can be sent with a fixed or small cost. Part 3 is then proportional to the conditional entropy of the image given the model and imaging function. This is precisely what was estimated and minimized with weighted neighbor alignment. A second interpretation comes from EMMA. EMMA estimates the joint entropy of the model and image, $h(u, v)$. The conditional entropy of the image given the model can be computed as $h(v \mid u) = h(u, v) - h(u)$. Since the entropy of the model is fixed, minimizing the joint entropy minimizes the conditional entropy. In both cases, entropy-based alignment as proposed in the first part of this chapter minimizes the cost of sending parts 1, 2 and 3.
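The relation between joint and conditional entropy used in the second interpretation can be illustrated numerically. The following is a minimal sketch, not the EMMA estimator: it substitutes a plain 2-D histogram estimate of the densities, and the sample data, the bin count, and the names `entropy_bits` and `entropies_from_samples` are assumptions made for illustration. It shows that when the model samples `u` are held fixed, destroying the pairing between `u` and `v` leaves the model entropy unchanged while raising both the joint and the conditional entropy.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a probability array (zero cells are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropies_from_samples(u, v, bins=8):
    """Histogram estimates of h(u), h(u, v) and h(v | u) = h(u, v) - h(u)."""
    joint, _, _ = np.histogram2d(u, v, bins=bins)
    p_uv = joint / joint.sum()   # joint distribution over (u bin, v bin)
    p_u = p_uv.sum(axis=1)       # marginal distribution of u
    h_u = entropy_bits(p_u)
    h_uv = entropy_bits(p_uv)
    return h_u, h_uv, h_uv - h_u

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, size=5000)             # hypothetical model property
v_aligned = u + 0.02 * rng.normal(size=u.size)   # image values well predicted by u
v_misaligned = rng.permutation(v_aligned)        # same values, pairing destroyed

h_u_a, h_uv_a, h_cond_a = entropies_from_samples(u, v_aligned)
h_u_m, h_uv_m, h_cond_m = entropies_from_samples(u, v_misaligned)

print(f"aligned:    h(u)={h_u_a:.3f}  h(u,v)={h_uv_a:.3f}  h(v|u)={h_cond_a:.3f}")
print(f"misaligned: h(u)={h_u_m:.3f}  h(u,v)={h_uv_m:.3f}  h(v|u)={h_cond_m:.3f}")
```

Because `u` and the bin edges are identical in both calls, the two estimates of $h(u)$ agree, so any difference in $h(u, v)$ appears unchanged in $h(v \mid u)$.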
[AI-TR 1548, Section 4.4. Matching and Minimum Description Length]

MDL suggests that we must also minimize the entropy of the unmodeled part of the image. In the previous information-theoretic formulation there was no concept of pixels or of the proportion of the image explained by the model. In fact, in the previous formulation the entropy of the explained part of the image could get larger as the model shrank.

For example, assume that the model covers a contiguous region of an image where most of the pixels have constant value. At the center of this region is a small patch containing varied pixels. Recall that the image is sampled at points that are projected from the model. Most of the model points will project into the region of constant intensity and a few will project onto the varied patch. The resulting distribution of image pixels, because it has many samples of the same value, has fairly low entropy. If the model were shrunk to cover only the varied patch, then all of the points from the model would fall in the varied region. The new distribution of pixel values will hav...
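The shrinking-model example can be tried on a toy image. This is a minimal sketch under stated assumptions: the 64x64 image, the 8x8 varied patch, the square "model" footprint, the histogram entropy estimate, and the names `entropy_bits` and `sample_square` are all hypothetical choices made for illustration, not taken from the thesis. It compares the entropy of pixel values sampled under a large footprint with that under a footprint shrunk to the varied patch.

```python
import numpy as np

def entropy_bits(samples, bins=16, value_range=(0.0, 1.0)):
    """Histogram estimate of the Shannon entropy (in bits) of sampled pixel values."""
    counts, _ = np.histogram(samples, bins=bins, range=value_range)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)

# Constant image with a small varied patch at its center.
image = np.full((64, 64), 0.5)
image[28:36, 28:36] = rng.uniform(0.0, 1.0, size=(8, 8))

def sample_square(image, half_width, n=2000, rng=rng):
    """Sample pixel values at points drawn uniformly from a centered square footprint."""
    c = image.shape[0] // 2
    rows = rng.integers(c - half_width, c + half_width, size=n)
    cols = rng.integers(c - half_width, c + half_width, size=n)
    return image[rows, cols]

large = sample_square(image, half_width=32)  # footprint covers the whole image
small = sample_square(image, half_width=4)   # footprint covers only the varied patch

h_large = entropy_bits(large)
h_small = entropy_bits(small)
print(f"large footprint: {h_large:.3f} bits, shrunken footprint: {h_small:.3f} bits")
```

In this toy setup the large footprint draws almost all of its samples from the constant region, so its sample entropy is the lower of the two. An objective that looks only at the explained samples therefore has no term that accounts for the pixels the shrunken model leaves unexplained.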
