

... Section 5.1.3, because neither edges nor their precursor, gradient magnitude, are stable in image position with respect to lighting changes (see Figure 5.16). While our technique works well using only shading, it also works well in domains having surface property discontinuities and silhouette information (see Section 5.1.1).

Paul A. Viola    CHAPTER 7. CONCLUSION

7.1 Related Work

Alignment by extremizing properties of the joint signal has been used by Hill and Hawkes (Hill et al., 1994) to align MRI, CT, and other medical image modalities. They use third-order moments to characterize the clustering of the joint data. We believe that mutual information is perhaps a more direct measure of the salient property of the joint data at alignment, and we demonstrate an efficient means of estimating and extremizing it.

There are many schemes that represent models and images by collections of edges and define a distance metric between them that is proportional to the number of edges that coincide (see the excellent survey articles: Besl and Jain, 1985; Chin and Dyer, 1986). A smooth, optimizable version of this metric can be defined by introducing a penalty both for unmatched edges and for the distance between those that are matched (Lowe, 1985; Wells III, 1992b; Huttenlocher et al., 1991). This metric can then be used both for image-model comparison and for pose refinement.

Edge-based metrics can work under a variety of different lighting conditions, but they make two very strong assumptions: the edges that arise are stable under changes in lighting, and the models are well described as a collection of edges. Clearly, smoothly curved objects are a real problem for these techniques. As we alluded to earlier, Wells has performed a number of experiments in which he attempts to match edges that are extracted under varying lighting. In general, for even moderately curved objects, the number of unstable and therefore unreliable edges is problematic.
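The edge metric with penalties for unmatched edges and for the distance between matched ones can be illustrated with a small sketch. The function name `edge_match_cost`, the `max_dist` threshold, and the truncated squared distance are assumptions made here for illustration. They are one common way to realize such a penalty, not necessarily the form used in the cited papers.

```python
import math


def edge_match_cost(model_pts, image_pts, max_dist=3.0):
    """Sum, over model edge points, of a truncated squared distance.

    A model point whose nearest image edge point is closer than max_dist
    pays the squared distance to it (penalty for imperfect matches).
    Any other model point counts as unmatched and pays the fixed
    penalty max_dist ** 2.
    """
    unmatched_penalty = max_dist ** 2
    total = 0.0
    for mx, my in model_pts:
        nearest_sq = min(
            ((mx - ix) ** 2 + (my - iy) ** 2 for ix, iy in image_pts),
            default=math.inf,  # no image edges at all: everything is unmatched
        )
        total += min(nearest_sq, unmatched_penalty)
    return total


# Hypothetical edge points: three model points lie near image edges,
# and one model point has no nearby image edge.
model = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (6.0, 0.0)]
image = [(0.0, 0.5), (1.0, 0.5), (10.0, 10.0)]

cost_here = edge_match_cost(model, image)
# Shifting the model up by 0.5 brings two points into exact coincidence.
cost_shifted = edge_match_cost([(x, y + 0.5) for x, y in model], image)
```

Because the cost changes gradually as the model moves, a pose search can follow it downhill. It still depends entirely on the extracted edges being stable, which is the assumption the text questions.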
Faces, cars, fruit, and a myriad of other objects have proven to be very difficult to model using edges.

Others use more direct techniques to build models. Generally these approaches revolve around the use of the image itself as an object model. Objects need not have edges to be well represented in this way, but care must be taken to deal with changes in lighting and pose. Turk and Pentland have used a large collection of face images to train a system to construct representations that are invariant to some changes in lighting and pose (Turk and Pentland, 1991). These representations are a projection onto the largest eigenvectors of the distribution of images within the collection. Their system addresses the problem of recognition rather than alignment, and as a result much of the emphasis and many of the results are different. For instance, it is not clear how much variation in pose can be handled by their system. We do not see a straightforward extension of this or similar eigenspace work to the problem of pose refinement. On a related note, Shashua has shown that all of the images, under different lighting, of a Lambertian surface are a linear combination of any three of the images (Shashua, 1992). This also bears a clear relation to the work of Turk and Pentland, in that the eigenvectors of a set of images of an object should span this three-dimensional space.

AI-TR 1548

Entropy is playing an ever-increasing role within the field of neural networks. We know of no work on the alignment of models and images, but there has been work using entropy and information in vision problems. None of these techniques uses a non-parametric scheme for density/entropy estimation as we do. In most cases the distributions are assumed to be either binomial or Gaussian. This both simplifies and limits such approaches.
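The trade-off behind a Gaussian assumption can be made concrete with a small sketch. It compares the closed-form entropy of a fitted Gaussian with a sample-based kernel (Parzen-window) estimate on deliberately bimodal data. The function names, the kernel width, and the two-sample split are illustrative assumptions. The Parzen estimate is one standard non-parametric choice and is not claimed to match the author's estimator in detail.

```python
import math
import random


def gaussian_entropy(sample):
    """Entropy (nats) of a Gaussian fitted to the sample: 0.5 * ln(2*pi*e*var)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / n
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


def parzen_entropy(sample_a, sample_b, kernel_sd):
    """Non-parametric entropy estimate (nats).

    The density is modeled as an average of Gaussian kernels centered on
    sample_a; the entropy is the negative mean log-density over sample_b.
    """
    norm = 1.0 / (kernel_sd * math.sqrt(2.0 * math.pi))
    total = 0.0
    for x in sample_b:
        density = sum(
            norm * math.exp(-0.5 * ((x - a) / kernel_sd) ** 2) for a in sample_a
        ) / len(sample_a)
        total += math.log(density)
    return -total / len(sample_b)


rng = random.Random(0)  # fixed seed so the run is repeatable


def draw(n):
    """Draw n points from a two-mode mixture centered at -5 and +5."""
    return [rng.gauss(rng.choice((-5.0, 5.0)), 0.5) for _ in range(n)]


sample_a, sample_b = draw(200), draw(200)
h_gauss = gaussian_entropy(sample_a + sample_b)
h_parzen = parzen_entropy(sample_a, sample_b, kernel_sd=0.25)
```

On data like this the fitted Gaussian spreads its mass over the empty region between the two modes, so `h_gauss` comes out well above `h_parzen`. The closed form is cheaper to compute, but it misrepresents distributions that are far from Gaussian.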
Linsker has used the concept of information maximization to motivate a theory of development in the primary visual cortex (Linsker, 1986). He has been able to predict the development of receptive fields that are very reminiscent of the ones found in the primate visual cortex. He uses a Gaussian model both for the signal and the noise.

Becker and Hinton have used the maximization of mutual information as a framework for learning different low-level processing algorithms such as disparity estimation and curvature estimation (Becker and Hinton, 1992). They assume that the signals whose mutual information is to be maximized are Gaussian. In addition, they assume that the only joint information between images is the information that they wish to extract (i.e., they train their disparity detectors on random dot stereograms).

Finally, Bell has used a measure of information to separate signals that have been linearly mixed together (Bell and Sejnowski, 1995). His technique assumes that the different mixed signals carry little mutual information. While he does not assume that the distribution has a particular functional form, he does assume that the distribution is well matched to a preselected transfer function. For example, a Gaussian is well matched to the logistic function because applying a correctly positioned and scaled logistic function results in a uniform distribution.

7.2 A Parallel with Geometric...

