Paul A. Viola    CHAPTER 4. MATCHING AND ALIGNMENT

[Figure 4.1: Graph of $u(x)$ and $v(x) = u(x + \Delta)$ versus $x$. Horizontal axis: Position (0 to 400); vertical axis: Intensity.]

... transformation between a model and an image is not known. Alignment is the process by which the correct transformation is extracted.
Alignment can be a difficult problem for a number of reasons:
- The imaging function $F$ of the physical world can be difficult to model.
- The exogenous parameters $q$ are not necessarily known and can be difficult to find. For example, computing the lighting in an image is a nontrivial problem.
- The space of transformations, which may have many dimensions, is difficult to search. Rigid objects often have a six-dimensional transformation space. Nonrigid objects can in principle have an unbounded number of pose parameters.
A simple example can lend intuition to these definitions. Let $u(x)$ and $v(y)$ be one-dimensional signals. Let the transformation space be the space of all possible translations,

$$T(x) = x - \Delta . \tag{4.3}$$

Let the imaging function $F$ be the identity function. Choosing $\eta = 0$ leads to

$$v(x) = u(x + \Delta) . \tag{4.4}$$

Figure 4.1 contains a graph of two signals that obey this relationship. Though we show the image and model aligned, the correct alignment between $v$ and $u$ may not be known.
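The translation example can be checked numerically. The snippet below is a minimal sketch rather than code from the thesis: the model signal `u` (a sum of two sinusoids), the shift `delta = 20`, and the helper names `make_image` and `T` are all assumptions chosen for illustration. It builds an image that is a shifted copy of the model and confirms that applying the translation to the image coordinates recovers the model when the imaging function is the identity and there is no noise.

```python
import math

def u(x):
    """Hypothetical model signal; any smooth 1-D function would do."""
    return 0.6 * math.sin(2 * math.pi * x / 120.0) + 0.4 * math.sin(2 * math.pi * x / 45.0)

def make_image(model, delta):
    """Return v with v(x) = model(x + delta): a shifted, noise-free copy of the model."""
    return lambda x: model(x + delta)

def T(x, delta):
    """Translation T(x) = x - delta."""
    return x - delta

delta = 20  # assumed shift, for illustration only
v = make_image(u, delta)

# With F the identity and no noise, v(T(x)) should equal u(x) at every position.
max_error = max(abs(v(T(x, delta)) - u(x)) for x in range(0, 400))
print(max_error)
```

In a real alignment problem `delta` is the unknown: only `u` and samples of `v` are available, and the shift has to be searched for.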
AITR 1548    4.1. ALIGNMENT

[Figure 4.2: Graph of $u(x)$ and $v(x) = -u(x)^2$ versus $x$. Horizontal axis: Position (0 to 400); vertical axis: Intensity.]
In all of our synthetic experiments 10% random noise has been added to $v$ (see footnote 1). Noise is, of course, an unavoidable reality of any real system. More importantly, the addition of noise demonstrates that the algorithms presented are numerically stable.
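The noise model used in the synthetic experiments (lowpass-filtered white noise whose peak-to-peak amplitude is 10% of the signal's, per footnote 1) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's procedure: the moving-average filter, its width, the Gaussian white-noise source, and the function names are all hypothetical, and the filter is not tuned to the 0.3 cycles-per-unit cutoff mentioned in the footnote.

```python
import math
import random

def peak_to_peak(values):
    """Difference between the largest and smallest sample."""
    return max(values) - min(values)

def moving_average(values, width):
    """Crude lowpass filter (an assumption; the source does not specify its filter)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def add_noise(signal, fraction=0.10, width=5, seed=0):
    """Add lowpassed white noise whose peak-to-peak amplitude is `fraction` of the signal's."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in signal]
    smooth = moving_average(white, width)
    scale = fraction * peak_to_peak(signal) / peak_to_peak(smooth)
    noise = [scale * n for n in smooth]
    return [s + n for s, n in zip(signal, noise)], noise

# Hypothetical clean image signal.
signal = [math.sin(2 * math.pi * x / 100.0) for x in range(400)]
noisy, noise = add_noise(signal)
ratio = peak_to_peak(noise) / peak_to_peak(signal)
print(round(ratio, 3))
```

Fixing the seed keeps the experiment repeatable, which is convenient when comparing alignment methods on the same noisy image.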
More complex imaging functions are possible. For example, $F$ might be nonlinear:

$$F(u) = -u^2 . \tag{4.5}$$

Figure 4.2 contains a graph of $u(x)$ and $v(T(x)) = F(u(x))$.

4.1.1 Correlation as a Maximum Likelihood Technique
The search for the correct alignment can be cast as a maximum likelihood or variance minimization problem (see Section 2.3.1). The probability of an image given the model, the transformation, the noise distribution, the exogenous parameters, and the imaging function is:

$$p(v \mid u, T, \eta, q, F) = \prod_{x_a \in a} p\bigl(\eta = v(T(x_a)) - F(u(x_a), q)\bigr) . \tag{4.6}$$

Footnote 1: We use white noise that has been lowpass filtered to roughly 0.3 cycles per unit. The peak-to-peak amplitude of the noise is 10% of the peak-to-peak amplitude of the signal.
In the above equation we have assumed that each pixel of $v$ is conditionally independent. Conditional independence does not imply that the pixels are independent, just that if $u$, $T$, $\eta$, $q$, and $F$ are known the pixels are independent. Assuming that the noise is Gaussian, we can then compute the log likelihood of a transformation as

$$\begin{aligned}
\log \ell(T) &= \log p(v \mid u, T, \eta, q, F) && (4.7) \\
&= \sum_{x_a \in a} \log p\bigl(\eta = v(T(x_a)) - F(u(x_a), q)\bigr) && (4.8) \\
&= -k_1 \sum_{x_a \in a} \bigl[v(T(x_a)) - F(u(x_a), q)\bigr]^2 && (4.9) \\
&\approx -k_2 E\bigl[(v(T(X)) - F(u(X), q))^2\bigr] && (4.10) \\
&\approx -k_2 \Bigl( E\bigl[v(T(X))^2\bigr] - 2E\bigl[v(T(X))\,F(u(X), q)\bigr] + E\bigl[F(u(X), q)^2\bigr] \Bigr) && (4.11)
\end{aligned}$$

where $k_1$ and $k_2$ are constants computed from the variance of the noise and the number of sample points. They play no role in the maximization. In Equation 4.11 we have expanded the square of the difference to show that the log likelihood of a transformation has three components: one that arises from the variance of the image; a second that arises from the correlation between the image and the predicted image; and a third that arises from the variance of the predicted image. For problems where the variances of the image and the predicted image are fixed, the best transformation is the one that maximizes the correlation between the actual and predicted image.
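The three-term expansion and its consequence can be illustrated numerically. The sketch below is not from the thesis: it assumes a hypothetical periodic model sampled on a ring (so that the image and predicted-image terms stay constant as the shift changes), an identity imaging function, no noise, and a true shift of 20. Under those assumptions the shift that minimizes the mean squared difference is the same one that maximizes the correlation term.

```python
import math

N = 400
TRUE_SHIFT = 20  # assumed for illustration

# Hypothetical periodic model u and image v(x) = u(x + TRUE_SHIFT), sampled on a ring.
u = [math.sin(2 * math.pi * x / 100.0) + 0.5 * math.sin(2 * math.pi * x / 50.0)
     for x in range(N)]
v = [u[(x + TRUE_SHIFT) % N] for x in range(N)]

def F(value):
    """Imaging function; the identity in this sketch."""
    return value

def mean(values):
    return sum(values) / len(values)

def terms(shift):
    """Return the three expectations of the expanded square for T(x) = x - shift."""
    image = [v[(x - shift) % N] for x in range(N)]   # v(T(X))
    predicted = [F(u[x]) for x in range(N)]          # F(u(X))
    image_power = mean([a * a for a in image])
    correlation = mean([a * b for a, b in zip(image, predicted)])
    predicted_power = mean([b * b for b in predicted])
    return image_power, correlation, predicted_power

def cost(shift):
    """Mean squared difference E[(v(T(X)) - F(u(X)))^2]."""
    return mean([(v[(x - shift) % N] - F(u[x])) ** 2 for x in range(N)])

candidates = range(-50, 50)
best_by_cost = min(candidates, key=cost)
best_by_correlation = max(candidates, key=lambda s: terms(s)[1])

# Check the expansion at an arbitrary shift: cost = E[v^2] - 2 E[v F] + E[F^2].
a, c, b = terms(7)
gap = abs(cost(7) - (a - 2 * c + b))
print(best_by_cost, best_by_correlation, gap)
```

The ring sampling is what keeps the first and third terms fixed here. On a finite window without wraparound, the image term changes with the shift, and minimizing the squared difference and maximizing correlation need not agree.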
For convenience we will define the cost of a transformation as

$$\begin{aligned}
C(T) &= E\bigl[(v(T(X)) - F(u(X), q))^2\bigr] && (4.12) \\
&\propto -\log \ell(T) . && (4.13)
\end{aligned}$$

The lowest-cost transformation is the one that causes the model to match the image "best".
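A brute-force search for the lowest-cost translation might look like the sketch below. It is an illustration only, under several assumptions that are not from the thesis: a hypothetical periodic model sampled on a ring, the nonlinear imaging function $F(u) = -u^2$ with no exogenous parameters, plain uniform white noise (rather than lowpassed noise), a true shift of 20, and an exhaustive search over 100 candidate shifts.

```python
import math
import random

N = 400
TRUE_SHIFT = 20  # assumed for illustration

def F(value):
    """Nonlinear imaging function F(u) = -u^2."""
    return -value * value

# Hypothetical periodic model, sampled on a ring so every shift sees all N points.
u = [math.sin(2 * math.pi * x / 100.0) + 0.5 * math.sin(2 * math.pi * x / 50.0)
     for x in range(N)]

# Image: v(x) = F(u(x + TRUE_SHIFT)) plus small uniform white noise
# (a simplification; the noise peak-to-peak is at most 10% of the signal's).
rng = random.Random(0)
clean = [F(u[(x + TRUE_SHIFT) % N]) for x in range(N)]
amplitude = 0.05 * (max(clean) - min(clean))
v = [c + rng.uniform(-amplitude, amplitude) for c in clean]

def cost(shift):
    """C(T) = E[(v(T(X)) - F(u(X)))^2] for T(x) = x - shift, averaged over all points."""
    total = 0.0
    for x in range(N):
        residual = v[(x - shift) % N] - F(u[x])
        total += residual * residual
    return total / N

candidates = range(-50, 50)
best_shift = min(candidates, key=cost)
print(best_shift, cost(best_shift))
```

Exhaustive search is workable for a one-dimensional translation. It does not scale to the higher-dimensional transformation spaces mentioned earlier, where a more directed search would be needed.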
As we did in the analysis of principal components and function learning, we have "invented" random variables: $X$, $v(T(X))$ and $u(X)$. The random variable $X$ ranges over points from the coordinate system of $u$ where $u(X)$ is defined. The random variables $u(X)$ and $v(T(X))$ range over the values in the model and image. In reality there are no random processes involved in matching and alignment. The model and image are predetermined and fixed. Alignment could proceed deterministically, with the cost of a transformation being evaluated directly from all of the points in the model and image. We have chosen to interpret the summation over pixels that arises in correlation as an expectation over a set of random variables. As a result the insights of probability and statistics can be brought to bear on these problems.

4.1.2 Correlation and Mutual Information
Alignment is very similar to the problem of function learning that we encountered in Section 3.1. Equation 3.15 is almost identical to 4.7. In both problems we are looking...
This note was uploaded on 02/10/2010 for the course TBE 2300 taught by Professor Cudeback during the Spring '10 term at Webber.