

6.2. ALIGNMENT OF LINE DRAWINGS    AI-TR 1548
Paul A. Viola    CHAPTER 6. OTHER APPLICATIONS OF EMMA

Figure 6.6: At left is the MRI scan shown with a modified intensity scale. The inhomogeneity is now much more apparent. In the center is the corrected scan shown using the same intensity scale. At right is the estimated correction field.

Figure 6.7: A coronal slice from an MRI scan of a head.

Figure 6.8: The distribution of pixel value in the MRI scan before and after correction. Solid line is the original distribution. The dashed line is the corrected distribution. (Plot legend: Corrupted, Corrected.)

Figure 6.9: The same scan but with a modified intensity scale. Points above the intensity for grey matter appear white. Points below the intensity for white matter appear black. There is a linear scale between. Notice that the lower part of the image is darker in the uncorrected image. On the right is the estimated bias.

...95. Their system can estimate the emotional content of these faces, i.e., it can determine that you have drawn a very happy face that is somewhat surprised. Their system works by constructing a non-rigid transformation that maps a novel drawing onto a hand-drawn "neutral" face (see Figure 6.10). The shape of the non-rigid transformation determines the emotion of the face.

Jones and Poggio use a representation for non-rigid transformations that is called flow. A flow is an image of displacement vectors. They search for a flow that minimizes the difference between the base image $b(x)$ and the novel image $n(x)$,

$$C(f) = \sum_x \bigl( n(x) - b(x + f(x)) \bigr)^2 , \tag{6.1}$$

where $f$ is the flow image and the summation is over all of the pixels in the novel image.

The problem of non-rigidly transforming one image into another has been very heavily studied in computer vision. In most previous work flow is represented directly as an image of displacement vectors.
The search problem is then conditioned with the addition of a smoothness prior over flow images. Jones and Poggio decompose flow into a linear combination of component flows,

$$C(\{\alpha_i\}) = \sum_x \Bigl( n(x) - b\bigl(x + \sum_i \alpha_i f_i(x)\bigr) \Bigr)^2 . \tag{6.2}$$

The search for flow then becomes an unconstrained optimization over the parameters $\alpha_i$. Each component flow represents a different type of emotion. For example, one kind of flow might transform a neutral face into a smiling face. Another might make a face look more surprised. These flows can be mixed together to produce images that have combinations of properties. Given a novel image, a set of $\alpha_i$ is determined that provides the closest match with the neutral face. These $\alpha_i$ determine to what extent the face is smiling or frowning. Figure 6.11 shows several novel images and the best reconstruction obtained by transforming the neutral face.

Figure 6.10: The first face is the "neutral" face. The others are "frown", "narrow eyes", "surprise", "eyebrows", and "smile".

Figure 6.11: A series of novel faces with the best reconstruction displayed below.

In their paper Jones and Poggio use Levenberg-Marquardt, a second order gradient descent procedure, to determine the flow parameters (see (William H. Press and Vetterling, 1992) for an excellent discussion of optimization techniques). Together we have replaced this technique with a much simpler stochastic gradient descent procedure. Over the course of many experiments stochastic alignment has improved running times from 60 seconds to 2 seconds. The quality of the minima found with stochastic gradient descent is equivalent to, if not better than, that of the minima found with Levenberg-Marquardt. Moreover, stochastic gradient descent rarely if ever gets stuck in local minima. There were a number of cases where Levenberg-Marquardt converged far from the best solution. On these same problems, stochastic gradient descent almost always finds a good solution.
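The following is a minimal sketch, not the implementation used in the experiments above, of how a stochastic gradient descent over the component-flow coefficients of Equation 6.2 might look. Everything named here is an assumption made for illustration: the function names (`bilinear_sample`, `flow_cost`, `fit_flow_sgd`), the array layout of `flows` (one `(dy, dx)` displacement image per component), bilinear interpolation for warping, the use of a mean rather than a sum over pixels, and the step size, batch size, and iteration count. Each step draws a random subset of pixels and follows the gradient of the squared difference between the novel image and the warped base image.

```python
import numpy as np

def bilinear_sample(img, ys, xs):
    """Sample img at real-valued (row, col) positions, clamped to the image."""
    h, w = img.shape
    ys = np.clip(ys, 0.0, h - 1.0)
    xs = np.clip(xs, 0.0, w - 1.0)
    y0 = np.minimum(np.floor(ys).astype(int), h - 2)
    x0 = np.minimum(np.floor(xs).astype(int), w - 2)
    dy, dx = ys - y0, xs - x0
    top = (1 - dx) * img[y0, x0] + dx * img[y0, x0 + 1]
    bottom = (1 - dx) * img[y0 + 1, x0] + dx * img[y0 + 1, x0 + 1]
    return (1 - dy) * top + dy * bottom

def flow_cost(alpha, novel, base, flows):
    """Mean over pixels of (n(x) - b(x + sum_i alpha_i f_i(x)))^2.

    flows has the assumed shape (k, 2, h, w): k component flows, each a
    (dy, dx) displacement image.
    """
    h, w = base.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    total = np.tensordot(alpha, flows, axes=1)  # combined flow, shape (2, h, w)
    warped = bilinear_sample(base, ys + total[0], xs + total[1])
    return float(np.mean((novel - warped) ** 2))

def fit_flow_sgd(novel, base, flows, steps=400, batch=128, lr=50.0, seed=0):
    """Estimate the coefficients alpha with stochastic gradient descent.

    The hyperparameters are illustrative defaults, not values from the text.
    """
    rng = np.random.default_rng(seed)
    h, w = base.shape
    grad_y, grad_x = np.gradient(base)  # finite-difference image gradients
    alpha = np.zeros(flows.shape[0])
    for _ in range(steps):
        # Draw a random sample of pixel locations for this step.
        ys = rng.integers(0, h, size=batch)
        xs = rng.integers(0, w, size=batch)
        f = flows[:, :, ys, xs]                 # shape (k, 2, batch)
        total = np.tensordot(alpha, f, axes=1)  # shape (2, batch)
        wy, wx = ys + total[0], xs + total[1]
        residual = novel[ys, xs] - bilinear_sample(base, wy, wx)
        gy = bilinear_sample(grad_y, wy, wx)
        gx = bilinear_sample(grad_x, wy, wx)
        # d(residual^2)/d(alpha_i) = -2 * residual * (grad b . f_i)
        grad = -2.0 * np.mean(residual * (gy * f[:, 0] + gx * f[:, 1]), axis=1)
        alpha -= lr * grad
    return alpha
```

A call such as `fit_flow_sgd(novel, base, flows)` returns one coefficient per component flow. Because each step touches only `batch` pixels rather than the whole image, the per-step cost is independent of image size, which is one plausible reason such a procedure can be fast. The step size would need tuning for real images.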
Chapter 7

Conclusion

Maximization of mutual information appears to be a new and powerful means of performing local alignment of objects and images. In a typical vision application it is an intensity-based, rather than feature-based, method. While intensity-based, it is more robust than traditional correlation, as shown by the insensitivity to lighting demonstrated in the experiment of Section 5.1.3. In addition, the method is insensitive to negating the image data, as well as a variety of non-linear transformations, which would defeat conventional intensity-based correlation.

The weaknesses of intensity correlation may be corrected, to some extent, by performing correlations on the magnitude of the brightness gradient. This, as well as edge-based matching techniques, can perform well on objects having discontinuous surface properties, or useful silhouettes. These approaches work because the image counterparts of these discontinuities are reasonably stable with respect to illumination. Gradient magnitude correlation, as well as edge-based methods, can have serious difficulties in domains lacking discontinuities, such as the example shown in Se...

