ECEN 489: Information Theory, Inference, and Learning Algorithms Chapter 22: Maximum Likelihood and Clustering Dr. Chao TIAN Texas A&M University 1 / 20
Maximum Likelihood

Choose the parameter θ that maximizes P({x} | θ).

The Gaussian example: log likelihood

  ln P({x_n}_{n=1}^N | μ, σ²) = −N ln(√(2π) σ) − Σ_n (x_n − μ)² / (2σ²).

Sample mean: x̄ ≜ Σ_{n=1}^N x_n / N; sample squared deviation: S ≜ Σ_{n=1}^N (x_n − x̄)². In these terms,

  ln P({x_n}_{n=1}^N | μ, σ²) = −N ln(√(2π) σ) − [N(μ − x̄)² + S] / (2σ²).

2 / 20
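Setting the derivatives of the log likelihood to zero gives the maximum-likelihood estimates μ̂ = x̄ and σ̂² = S/N. A minimal numerical sketch (the synthetic data and the helper `log_likelihood` are illustrative, not from the slides):

```python
import numpy as np

# Hypothetical i.i.d. Gaussian sample with true μ = 3, σ = 2.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=10_000)

N = len(x)
mu_hat = x.sum() / N                 # sample mean x̄
S = ((x - mu_hat) ** 2).sum()        # sample squared deviation S
sigma2_hat = S / N                   # ML estimate of σ² (note: biased by the factor (N-1)/N)

def log_likelihood(mu, sigma2, x):
    """ln P({x_n} | μ, σ²) for i.i.d. Gaussian data, as in the slide."""
    n = len(x)
    return -n * np.log(np.sqrt(2 * np.pi * sigma2)) - ((x - mu) ** 2).sum() / (2 * sigma2)

# The ML point (x̄, S/N) beats nearby parameter settings.
assert log_likelihood(mu_hat, sigma2_hat, x) > log_likelihood(mu_hat + 0.1, sigma2_hat, x)
assert log_likelihood(mu_hat, sigma2_hat, x) > log_likelihood(mu_hat, 1.1 * sigma2_hat, x)
```

With 10,000 samples the estimates land close to the true (μ, σ²) = (3, 4), and perturbing either parameter away from (x̄, S/N) strictly lowers the log likelihood.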
Sufficient Statistics

Let T({x_n}_{n=1}^N) be some function of {x_n}_{n=1}^N. Since T is a function of the data, θ → {x_n}_{n=1}^N → T({x_n}_{n=1}^N) is a Markov chain, and the data-processing inequality implies

  I({x_n}_{n=1}^N; θ) ≥ I(T({x_n}_{n=1}^N); θ).

T is a sufficient statistic if:

• I({x_n}_{n=1}^N; θ) = I(T({x_n}_{n=1}^N); θ);
• equivalently, {x_n}_{n=1}^N ↔ T({x_n}_{n=1}^N) ↔ θ is also a Markov chain;
• equivalently, the likelihood P({x_n}_{n=1}^N | θ, T({x_n}_{n=1}^N)) is not a function of θ.

In the Gaussian setting, (x̄, S) is a sufficient statistic for (μ, σ²):

  ln P({x_n}_{n=1}^N | μ, σ²) = −N ln(√(2π) σ) − [N(μ − x̄)² + S] / (2σ²).

3 / 20
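Sufficiency of (x̄, S) can be checked numerically: the identity Σ_n (x_n − μ)² = N(μ − x̄)² + S means the log likelihood computed from the raw data agrees with the one computed from the two summary statistics alone. A small sketch (synthetic data and function names are illustrative):

```python
import numpy as np

# Hypothetical Gaussian sample; any data set would do.
rng = np.random.default_rng(1)
x = rng.normal(loc=1.0, scale=0.5, size=500)
N = len(x)

xbar = x.mean()                   # sample mean x̄
S = ((x - xbar) ** 2).sum()       # sample squared deviation S

def loglik_raw(mu, sigma, x):
    """ln P({x_n} | μ, σ²) computed directly from the full data set."""
    return (-len(x) * np.log(np.sqrt(2 * np.pi) * sigma)
            - ((x - mu) ** 2).sum() / (2 * sigma ** 2))

def loglik_from_stats(mu, sigma, N, xbar, S):
    """Same quantity, but depending on the data only through (x̄, S)."""
    return (-N * np.log(np.sqrt(2 * np.pi) * sigma)
            - (N * (mu - xbar) ** 2 + S) / (2 * sigma ** 2))

# The two agree for every (μ, σ): the data enter only via (x̄, S).
for mu, sigma in [(0.0, 1.0), (1.0, 0.5), (2.0, 2.0)]:
    assert np.isclose(loglik_raw(mu, sigma, x),
                      loglik_from_stats(mu, sigma, N, xbar, S))
```

Because the likelihood factors through (x̄, S) for every parameter value, no other feature of the sample carries information about (μ, σ²).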
