6 Conditional Densities

A number of machine learning algorithms can be derived by using conditional exponential families of distributions (Section 2.3). Assume that the training set $\{(x_1, y_1), \ldots, (x_m, y_m)\}$ was drawn iid from some underlying distribution. Using Bayes rule (1.15) one can write the likelihood

$$p(\theta \mid X, Y) \propto p(\theta)\, p(Y \mid X, \theta) = p(\theta) \prod_{i=1}^{m} p(y_i \mid x_i, \theta) \tag{6.1}$$

and hence the negative log-likelihood

$$-\log p(\theta \mid X, Y) = -\sum_{i=1}^{m} \log p(y_i \mid x_i, \theta) - \log p(\theta) + \text{const.} \tag{6.2}$$

Because we do not have any prior knowledge about the data, we choose a zero mean unit variance isotropic normal distribution for $p(\theta)$. This yields

$$-\log p(\theta \mid X, Y) = \frac{1}{2}\|\theta\|^2 - \sum_{i=1}^{m} \log p(y_i \mid x_i, \theta) + \text{const.} \tag{6.3}$$

Finally, if we assume a conditional exponential family model for $p(y \mid x, \theta)$, that is,

$$p(y \mid x, \theta) = \exp\left(\langle \phi(x, y), \theta \rangle - g(\theta \mid x)\right) \tag{6.4}$$

then

$$-\log p(\theta \mid X, Y) = \frac{1}{2}\|\theta\|^2 + \sum_{i=1}^{m} \left[ g(\theta \mid x_i) - \langle \phi(x_i, y_i), \theta \rangle \right] + \text{const.} \tag{6.5}$$
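To make the objective concrete, here is a minimal sketch (not from the text) that evaluates the regularized negative log-likelihood of (6.5) for binary logistic regression, a standard instance of a conditional exponential family with the assumed choices $\phi(x, y) = y\,x$ for $y \in \{0, 1\}$ and $g(\theta \mid x) = \log(1 + e^{\langle x, \theta \rangle})$; the function name is illustrative:

```python
import numpy as np

def neg_log_posterior(theta, X, y):
    """Negative log-posterior of eq. (6.5) for binary logistic
    regression (an assumed conditional exponential family with
    phi(x, y) = y * x, y in {0, 1})."""
    scores = X @ theta                         # <phi(x_i, 1), theta> for each example
    log_partition = np.logaddexp(0.0, scores)  # g(theta | x_i) = log(1 + exp(scores))
    # 0.5 * ||theta||^2 is the -log p(theta) term from the standard normal prior
    return 0.5 * theta @ theta + np.sum(log_partition - y * scores)
```

Minimizing this function over $\theta$ (e.g. with any gradient-based optimizer) gives the MAP estimate; at $\theta = 0$ each example contributes exactly $\log 2$, which is a quick sanity check.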