Notes on the Infomax Algorithm

Upamanyu Madhow

Abstract

We briefly review the maximum likelihood interpretation of the extended Infomax algorithm for independent component analysis (ICA), including the concept of the relative gradient used for iterative updates.

1 Maximum Likelihood Formulation

Consider a single snapshot of the mixing model

$$X = AS$$

where $X$, $S$ are $n \times 1$ and $A$ is $n \times n$. We would like to unmix the sources by applying an $n \times n$ matrix $W$ to get

$$Y = WX$$

In maximum likelihood (ML) estimation, we estimate a parameter $\theta$ based on an observation $x$ by maximizing the conditional density $p(x \mid \theta)$. In order to apply this approach to the estimation of $W$, we must know the conditional density of $x$ given $W$. Given $W$, we can compute $Y = WX$, and we apply ML estimation to this setting by assuming that we know the density of $Y$. For the right $W$, we assume that (a) the components of $Y$ are independent, and (b) they have known marginal densities $p_i(y_i)$, $i = 1, \ldots, n$. In practical terms, these marginal densities need not be the same as those of the actual independent components: all they do is provide nonlinearities of the form $\frac{d}{dy_i} \log p_i(y_i)$ for the iterative update of $W$. As we have seen from our discussion of the FastICA algorithm, there is a broad range of nonlinearities that can move us toward non-Gaussianity and independence (although only the fourth-order nonlinearity is guaranteed to converge to a global optimum). Thus, it makes sense that there should be some flexibility in the choice of nonlinearities in the Infomax algorithm, which is essentially similar in philosophy (except that it uses different nonlinearities and a gradient-based update rather than a Newton update).

Equating the probabilities of small volumes, we have

$$p(x \mid W)\,|dx| = p(y)\,|dy|$$

Since $|dy| / |dx| = |\det(W)|$, we have

$$p(x \mid W) = p(y)\,|\det(W)|$$

Taking the log and using the independence of the components of $Y$, we obtain that the cost...
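The derivation above can be sketched numerically. The following is a minimal NumPy illustration, not the note's own code: it assumes Laplacian marginals $p_i(y_i) = \frac{1}{2} e^{-|y_i|}$ (so the nonlinearity $-\frac{d}{dy} \log p_i(y) = \mathrm{sign}(y)$), a hand-picked $2 \times 2$ mixing matrix, and the relative-gradient update $W \leftarrow W + \eta\,(I - \varphi(Y)Y^\top/T)\,W$, which is the standard natural-gradient form of the Infomax update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mix two independent super-Gaussian (Laplacian) sources: X = A S.
# A is a hypothetical, well-conditioned mixing matrix chosen for illustration.
n, T = 2, 5000
S = rng.laplace(size=(n, T))
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = A @ S

def log_likelihood(W, X):
    """Average log p(x | W) = sum_i log p_i(y_i) + log |det W|,
    under the assumed Laplacian marginals p_i(y) = (1/2) exp(-|y|)."""
    Y = W @ X
    return (-np.abs(Y) - np.log(2)).sum(axis=0).mean() \
        + np.log(abs(np.linalg.det(W)))

# Relative-gradient ascent on the likelihood:
#   W <- W + eta * (I - phi(Y) Y^T / T) W,
# where phi(y) = -d/dy log p_i(y) = sign(y) for the Laplacian assumption.
W = np.eye(n)
eta = 0.05
for _ in range(500):
    Y = W @ X
    phi = np.sign(Y)  # score-function nonlinearity (assumed marginals)
    W = W + eta * (np.eye(n) - phi @ Y.T / T) @ W
```

After convergence, $WA$ should be close to a scaled permutation matrix, and the unmixed components $Y = WX$ should be nearly uncorrelated; note that the assumed marginals only supply the nonlinearity and need not match the true source densities, exactly as the note observes.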
This note was uploaded on 12/29/2011 for the course ECE 594C taught by Professor Madhow during the Fall '10 term at UCSB.