Notes on the Infomax Algorithm
Upamanyu Madhow
Abstract
We briefly review the maximum likelihood interpretation of the extended Infomax algorithm for independent component analysis (ICA), including the concept of relative gradient used for iterative updates.
1 Maximum Likelihood Formulation
Consider a single snapshot of the mixing model
$$X = AS$$
where $X$, $S$ are $n \times 1$, and $A$ is $n \times n$. We would like to "unmix" the sources by applying an $n \times n$ matrix $W$ to get
$$Y = WX$$
In maximum likelihood (ML) estimation, we estimate a parameter $\theta$ based on observation $x$ by maximizing the conditional density $p(x|\theta)$. In order to apply this approach to estimation of $W$, we must know the conditional density of $x$ given $W$.
Given $W$, we can compute $Y = WX$, and we apply ML estimation to this setting by assuming that we know the density of $Y$. For the "right" $W$, we assume that (a) the components of $Y$ are independent, and (b) they have known marginal densities $p_i(y_i)$, $i = 1, \ldots, n$.
In practical terms, these marginal densities do not need to be the same as those of the actual independent components: all they do is provide nonlinearities of the form $\frac{d}{dy_i} \log p_i(y_i)$ for the iterative update of $W$. As we have seen from our discussion of the FastICA algorithm, there is a broad range of nonlinearities that can move us towards non-Gaussianity and independence (although only the fourth-order nonlinearity is guaranteed to converge to a global optimum). Thus, it makes sense that there should be some flexibility in the choice of nonlinearities in the Infomax algorithm, which is essentially similar in philosophy (except that it uses different nonlinearities and a gradient-based update rather than a Newton update).
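As an illustration (the specific density here is our choice, not one prescribed in these notes), suppose we posit the heavy-tailed marginal $p_i(y) = \frac{1}{\pi \cosh y}$, which integrates to one. Then
$$\frac{d}{dy} \log p_i(y) = \frac{d}{dy}\left(-\log \pi - \log \cosh y\right) = -\tanh y,$$
which yields the $\tanh$ nonlinearity commonly used in Infomax-style updates for super-Gaussian sources.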
Equating the probabilities of small volumes, we have
$$p(x|W)\,|dx| = p(y)\,|dy|$$
Since
$$\frac{|dy|}{|dx|} = |\det(W)|$$
we have
$$p(x|W) = p(y)\,|\det(W)|$$
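As a numerical sanity check (an illustration of our own, not part of the notes), this identity can be verified exactly in the Gaussian case, where both sides have closed forms:

```python
import numpy as np

# Check p(x|W) = p(y) |det(W)| for Y = WX, where the components of Y are
# independent N(0,1), so that X = AS is N(0, A A^T). A is arbitrary here.
A = np.array([[2.0, 0.5], [0.0, 1.0]])
W = np.linalg.inv(A)
x = np.array([0.7, -1.2])
y = W @ x

# Right-hand side: product of standard normal marginals, times |det(W)|.
p_y = np.prod(np.exp(-y**2 / 2) / np.sqrt(2 * np.pi))
rhs = p_y * abs(np.linalg.det(W))

# Left-hand side: the N(0, Sigma) density with Sigma = A A^T, evaluated at x.
Sigma = A @ A.T
lhs = np.exp(-0.5 * x @ np.linalg.solve(Sigma, x)) / (
    2 * np.pi * np.sqrt(np.linalg.det(Sigma))
)
```

The two evaluations agree to machine precision, as the change-of-variables formula requires.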
Taking the log and using the independence of the components of $Y$, we obtain the cost function
$$\log p(x|W) = \sum_{i=1}^{n} \log p_i(y_i) + \log |\det(W)|$$
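Maximizing this cost with the relative gradient gives the update $W \leftarrow W + \eta\,(I + g(Y)Y^T)\,W$, where $g_i = \frac{d}{dy_i}\log p_i$. A minimal sketch, under assumptions of our own (Laplacian sources, a fixed $2\times 2$ mixing matrix, the $\tanh$ nonlinearity, a whitening preprocessing step, and arbitrary step size and iteration count):

```python
import numpy as np

# Demo setup (our assumptions, not from the notes): two independent
# super-Gaussian (Laplacian) sources mixed by a fixed matrix A.
rng = np.random.default_rng(0)
n, T = 2, 5000
S = rng.laplace(size=(n, T))            # independent super-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])  # mixing matrix, assumed for the demo
X = A @ S                               # observed mixtures, X = AS

# Whiten the mixtures first (a common preprocessing step, not required by
# the derivation, but it improves the conditioning of the updates).
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / T)
Z = E @ np.diag(1.0 / np.sqrt(d)) @ E.T @ Xc

# Relative-gradient ascent on the log-likelihood, with g(y) = -tanh(y):
#   W <- W + eta * (I - tanh(Y) Y^T / T) W,   Y = W Z
W = np.eye(n)
eta = 0.05
for _ in range(1000):
    Y = W @ Z
    W = W + eta * (np.eye(n) - np.tanh(Y) @ Y.T / T) @ W

Y = W @ Z  # recovered sources, up to permutation, sign, and scale
```

Each row of the recovered $Y$ should correlate strongly with exactly one of the original sources; the permutation, sign, and scale ambiguities are inherent to ICA.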