
Notes on the Infomax Algorithm

Upamanyu Madhow

Abstract

We briefly review the maximum likelihood interpretation of the extended Infomax algorithm for independent component analysis (ICA), including the concept of relative gradient used for iterative updates.

1 Maximum Likelihood Formulation

Consider a single snapshot of the mixing model

    X = A S

where X, S are n x 1 and A is n x n. We would like to unmix the sources by applying an n x n matrix W to get

    Y = W X

In maximum likelihood (ML) estimation, we estimate a parameter theta based on an observation x by maximizing the conditional density p(x | theta). In order to apply this approach to the estimation of W, we must know the conditional density of x given W. Given W, we can compute Y = WX, and we apply ML estimation to this setting by assuming that we know the density of Y. For the right W, we assume that (a) the components of Y are independent, and (b) they have known marginal densities p_i(y_i), i = 1, ..., n. In practical terms, these marginal densities need not match those of the actual independent components: all they do is provide nonlinearities of the form (d/dy_i) log p_i(y_i) for the iterative update of W. As we have seen from our discussion of the fastICA algorithm, a broad range of nonlinearities can move us towards non-Gaussianity and independence (although only the fourth-order nonlinearity is guaranteed to converge to a global optimum). Thus, it makes sense that there should be some flexibility in the choice of nonlinearities in the Infomax algorithm, which is essentially similar in philosophy (except that it uses different nonlinearities and a gradient-based update rather than a Newton update).

Equating the probabilities of small volumes, we have

    p(x | W) |dx| = p(y) |dy|

Since |dy| / |dx| = |det(W)|, we have

    p(x | W) = p(y) |det(W)|

Taking the log and using the independence of the components of Y, we obtain that the cost...
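To make the iterative update concrete, the relative-gradient form of the Infomax update commonly takes the shape W <- W + mu (I - phi(Y) Y^T / T) W, where phi is the nonlinearity derived from -(d/dy) log p_i(y). The sketch below is illustrative, not the notes' own code: the mixing matrix, the tanh nonlinearity (a standard choice for super-Gaussian sources such as Laplacian ones), the step size mu, and the iteration count are all assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent super-Gaussian (Laplacian) sources, n x T samples.
n, T = 2, 20000
S = rng.laplace(size=(n, T))

# Hypothetical mixing matrix A (illustrative choice).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S  # observed mixtures, X = A S

# Relative-gradient Infomax update:
#   W <- W + mu * (I - phi(Y) Y^T / T) @ W
# with phi(y) = tanh(y), a standard nonlinearity for super-Gaussian sources.
W = np.eye(n)
mu = 0.05
for _ in range(500):
    Y = W @ X
    W = W + mu * (np.eye(n) - np.tanh(Y) @ Y.T / T) @ W

# At a separating solution, W A is close to a scaled permutation matrix:
# each row of P has one dominant entry, reflecting the inherent scale and
# ordering ambiguity of ICA.
P = W @ A
print(np.round(P, 2))
```

Note the multiplication by W on the right of the gradient term: this is the "relative" (natural) gradient mentioned in the abstract, which makes the update equivariant to the choice of mixing matrix and avoids inverting W at each step.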
This note was uploaded on 12/29/2011 for the course ECE 594C taught by Professor Madhow during the Fall '10 term at UCSB.
