1995_Viola_thesis_registrationMI

# 4.1 Alignment (AI-TR 1548)



Paul A. Viola, AI-TR 1548. Chapter 4: Matching and Alignment. Section 4.1: Alignment.

...formation between a model and an image is not known. Alignment is the process by which the correct transformation is extracted. Alignment can be a difficult problem for a number of reasons:

- The imaging function $F$ of the physical world can be difficult to model.
- The exogenous parameters $q$ are not necessarily known and can be difficult to find. For example, computing the lighting in an image is a non-trivial problem.
- The space of transformations, which may have many dimensions, is difficult to search. Rigid objects often have a 6-dimensional transformation space. Non-rigid objects can in principle have an unbounded number of pose parameters.

A simple example can lend intuition to these definitions. Let $u(x)$ and $v(y)$ be one-dimensional signals. Let the transformation space be the space of all possible translations

$$T(x) = x - \Delta . \tag{4.3}$$

Let the imaging function $F$ be the identity function. Choosing $\Delta = 0$ leads to

$$v(x) = u(x) + \eta . \tag{4.4}$$

Figure 4.1 contains a graph of two signals that obey this relationship. Though we show the image and model aligned, the correct alignment between $v$ and $u$ may not be known.

[Figure 4.1: Graph of $u(x)$ and $v(x) = u(x) + \eta$ versus $x$. Axes: Position (horizontal), Intensity (vertical).]

In all of our synthetic experiments 10% random noise has been added to $v$ (footnote 1). Noise is of course an unavoidable reality of any real system. More importantly, the addition of noise demonstrates that the algorithms presented are numerically stable.

More complex imaging functions are possible. For example, $F$ might be non-linear:

$$F(u) = -u^2 . \tag{4.5}$$

Figure 4.2 contains a graph of $u(x)$ and $v(T(x)) = F(u(x))$.

[Figure 4.2: Graph of $u(x)$ and $v(x) = -u(x)^2$ versus $x$. Axes: Position (horizontal), Intensity (vertical).]
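The synthetic setup described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the model signal `u`, the helper names `low_pass_noise`, `peak_to_peak` and `make_image`, the Gaussian white-noise source, and the hard FFT cutoff are all hypothetical choices, not taken from the thesis. The noise level follows the description in footnote 1 (low-pass filtered white noise, peak-to-peak amplitude 10% of the signal's).

```python
import numpy as np

def low_pass_noise(n, cutoff, rng):
    """White Gaussian noise with all frequencies above `cutoff`
    (in cycles per sample) zeroed out. Gaussian noise is an assumption."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0)
    spectrum[freqs > cutoff] = 0.0
    return np.fft.irfft(spectrum, n=n)

def peak_to_peak(signal):
    """Difference between the largest and smallest sample."""
    return float(signal.max() - signal.min())

def make_image(u, F=lambda s: s, noise_fraction=0.10, cutoff=0.3, seed=0):
    """Return (v, eta) with v = F(u) + eta, the aligned case (zero translation).
    eta is rescaled so its peak-to-peak amplitude is `noise_fraction`
    times that of the noise-free signal F(u)."""
    rng = np.random.default_rng(seed)
    clean = F(u)
    eta = low_pass_noise(len(u), cutoff, rng)
    eta *= noise_fraction * peak_to_peak(clean) / peak_to_peak(eta)
    return clean + eta, eta

# Hypothetical one-dimensional model signal on 400 sample positions.
x = np.arange(400)
u = np.sin(2 * np.pi * x / 97.0) * np.exp(-((x - 200) / 150.0) ** 2)

# Identity imaging function, as in Equation 4.4.
v_identity, eta_identity = make_image(u)
# Non-linear imaging function F(u) = -u^2, as in Equation 4.5.
v_nonlinear, eta_nonlinear = make_image(u, F=lambda s: -s ** 2)
```

Passing the imaging function as an argument keeps the two experiments identical except for `F`, which mirrors how the text varies only the imaging function between Figures 4.1 and 4.2.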
## 4.1.1 Correlation as a Maximum Likelihood Technique

The search for the correct alignment can be cast as a maximum likelihood or variance minimization problem (see Section 2.3.1). The probability of an image given the model, the transformation, the noise distribution, the exogenous parameters, and the imaging function is:

$$p(v \mid u, T, \eta, q, F) = \prod_{x_a \in a} p\big(\eta = v(T(x_a)) - F(u(x_a), q)\big) . \tag{4.6}$$

In the above equation we have assumed that each pixel of $v$ is conditionally independent. Conditional independence does not imply that the pixels are independent, just that if $u$, $T$, $\eta$, $q$, and $F$ are known the pixels are independent. Assuming that the noise is Gaussian, we can then compute the log likelihood of a transformation as

$$
\begin{aligned}
\log \ell(T) &= \log p(v \mid u, T, \eta, q, F) && (4.7)\\
&= \sum_{x_a \in a} \log p\big(\eta = v(T(x_a)) - F(u(x_a), q)\big) && (4.8)\\
&= -k_1 \sum_{x_a \in a} \big(v(T(x_a)) - F(u(x_a), q)\big)^2 && (4.9)\\
&\approx -k_2\, E\big[(v(T(X)) - F(u(X), q))^2\big] && (4.10)\\
&\approx -k_2 \Big( E\big[v(T(X))^2\big] - 2\,E\big[v(T(X))\,F(u(X), q)\big] + E\big[F(u(X), q)^2\big] \Big) && (4.11)
\end{aligned}
$$

where $k_1$ and $k_2$ are constants computed from the variance of the noise and the number of sample points. They play no role in the maximization. In 4.11 we have expanded the square of the difference to show that the log likelihood of a transformation has three components: one that arises from the variance of the image; a second that arises from the correlation between the image and the predicted image; and a third that arises from the variance of the predicted image. For problems where the variance of the image and predicted image are fixed, the best transformation is the one that maximizes the correlation between the actual and predicted image.

Footnote 1: We use white noise that has been low-pass filtered to roughly 0.3 cycles per unit. The peak-to-peak amplitude of the noise is 10% of the peak-to-peak amplitude of the signal.
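The step from the sum of log probabilities (4.8) to the sum of squared differences (4.9) can be spelled out as follows. This is a sketch under assumptions the text does not state explicitly: the noise is taken to be zero-mean with variance $\sigma^2$, and the symbols $\sigma$, $r_a$ and $N_a$ (the number of sample points) are notation introduced here for illustration only.

```latex
% Assumed: zero-mean Gaussian noise with variance \sigma^2.
p(\eta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{\eta^2}{2\sigma^2}\right),
\qquad
r_a = v(T(x_a)) - F(u(x_a), q)

% Summing the log density over the N_a sample points:
\sum_{x_a \in a} \log p(\eta = r_a)
  = -\frac{1}{2\sigma^2} \sum_{x_a \in a} r_a^2
    \;-\; \frac{N_a}{2} \log\!\left(2\pi\sigma^2\right)

% The last term does not depend on T, so it can be dropped, giving
k_1 = \frac{1}{2\sigma^2},
\qquad
k_2 = N_a\, k_1 = \frac{N_a}{2\sigma^2}
```

Under these assumptions, $k_2$ arises from replacing the sum over $N_a$ samples by $N_a$ times the sample mean, which is then read as an expectation. Both constants are positive, so maximizing the log likelihood is the same as minimizing the expected squared difference.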
For convenience we will define the cost of a transformation as

$$
\begin{aligned}
C(T) &= E\big[(v(T(X)) - F(u(X), q))^2\big] && (4.12)\\
&\propto -\log \ell(T) , && (4.13)
\end{aligned}
$$

where 4.13 holds up to the approximation in 4.10 and the positive constant $k_2$. The lowest cost transformation is the one that causes the model to match the image "best".

As we did in the analysis of principal components and function learning, we have "invented" random variables: $X$, $v(T(X))$ and $u(X)$. The random variable $X$ ranges over points from the coordinate system of $u$ where $u(X)$ is defined. The random variables $u(X)$ and $v(T(X))$ range over the values in the model and image. In reality there are no random processes involved in matching and alignment. The model and image are pre-determined and fixed. Alignment could proceed deterministically, with the cost of a transformation evaluated directly from all of the points in the model and image. We have chosen to interpret the summation over pixels that arises in correlation as an expectation over a set of random variables. As a result the insights of probability and statistics can be brought to bear on these problems.

## 4.1.2 Correlation and Mutual Information

Alignment is very similar to the problem of function learning that we encountered in Section 3.1. Equation 3.15 is almost identical to 4.7. In both problems we are looking...
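As an illustration of the cost $C(T)$ from Section 4.1.1, the sketch below estimates it by a sample mean for one-dimensional integer translations $T(x) = x - \Delta$ and picks the lowest-cost shift by exhaustive search. The function names `cost` and `best_translation`, the test signal, the noise level, and the exhaustive search itself are hypothetical choices made for this example, not the procedure used in the thesis.

```python
import numpy as np

def cost(u, v, delta, F=lambda s: s):
    """Sample estimate of C(T) = E[(v(T(X)) - F(u(X)))^2] for T(x) = x - delta.
    Only positions x whose image T(x) falls inside v are averaged."""
    x = np.arange(len(u))
    tx = x - delta
    valid = (tx >= 0) & (tx < len(v))
    residual = v[tx[valid]] - F(u[x[valid]])
    return float(np.mean(residual ** 2))

def best_translation(u, v, deltas, F=lambda s: s):
    """Evaluate the cost at every candidate shift and return the cheapest one."""
    costs = np.array([cost(u, v, d, F) for d in deltas])
    return int(deltas[int(np.argmin(costs))]), costs

# Hypothetical data: the image is the model moved by a known shift, plus noise,
# so that v(x - true_delta) = u(x) + noise wherever x - true_delta is in range.
rng = np.random.default_rng(0)
x = np.arange(400)
u = np.sin(2 * np.pi * x / 97.0) * np.exp(-((x - 200) / 150.0) ** 2)
true_delta = 7
v = np.roll(u, -true_delta) + 0.02 * rng.standard_normal(len(u))

deltas = np.arange(-20, 21)
estimate, costs = best_translation(u, v, deltas)
```

Averaging over all valid sample points is the deterministic evaluation the text mentions. Reading the same average as an expectation over $X$ is what allows it to be estimated from a subset of points instead.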

## This note was uploaded on 02/10/2010 for the course TBE 2300 taught by Professor Cudeback during the Spring '10 term at Webber.
