Machine Learning, Srihari

**Parameter Estimation for Bayesian Networks**

Sargur Srihari
[email protected]

**Topics**

- Problem Statement
- Maximum Likelihood Approach
  - Thumb-tack (Bernoulli), Multinomial, Gaussian
  - Application to Bayesian Networks
- Bayesian Approach
  - Thumbtack vs. Coin Toss
  - Uniform Prior vs. Beta Prior
  - Multinomial Dirichlet Prior

**Problem Statement**

- The network structure is fixed
- The data set consists of fully observed instances of the network variables: D = {ξ[1], …, ξ[M]}
- This setting arises commonly in practice, since parameters are hard to elicit from human experts
- It forms a building block for structure learning and for learning from incomplete data

**Two Main Approaches**

- Maximum Likelihood Estimation
- Bayesian
- We start with the simplest context, a Bayesian network with a single variable, and then generalize to arbitrary network structures

**Thumbtack Example**

- A classical statistics problem: a thumbtack is tossed and lands heads or tails
- It is tossed several times, yielding a sequence of heads and tails
- Based on this data set we wish to estimate the probability that the next flip lands heads
- The probability of heads is θ, which is the unknown parameter

**Results of Tosses**

- Toss 100 times; we get 35 heads
- What is our estimate for θ? Intuition suggests 0.35
- If θ were 0.1, the chance of observing 35/100 heads would be much lower
- The law of large numbers says that as the number of tosses grows, it becomes increasingly unlikely that the fraction of heads is far from θ
- For large M, the fraction of heads is a good estimate of θ with high probability
- How can this intuition be formalized?
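The law-of-large-numbers intuition can be checked with a short simulation (a minimal sketch; taking θ = 0.35 as the true probability of heads and the toss counts below are illustrative assumptions, not from the slides):

```python
import random

def heads_fraction(theta, num_tosses, rng):
    """Simulate num_tosses thumbtack flips with P(heads) = theta
    and return the observed fraction of heads."""
    heads = sum(1 for _ in range(num_tosses) if rng.random() < theta)
    return heads / num_tosses

rng = random.Random(0)  # fixed seed for reproducibility
theta = 0.35            # assumed true probability of heads

# As the number of tosses M grows, the observed fraction
# concentrates around theta.
for m in (100, 10_000, 1_000_000):
    print(m, round(heads_fraction(theta, m, rng), 4))
```

With a million tosses the observed fraction is very close to θ, matching the claim that the fraction of heads is a good estimate for large M.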
**Maximum Likelihood**

- Since the tosses are independent, the probability of a sequence is
  P(H,T,T,H,H; θ) = θ(1−θ)(1−θ)θθ = θ³(1−θ)²
- The probability depends on the value of θ
- Define the likelihood function as
  L(θ : ⟨H,T,T,H,H⟩) = P(H,T,T,H,H; θ) = θ³(1−θ)²
- Parameter values with higher likelihood are more likely to generate the observed sequence; thus likelihood is a measure of parameter quality
- θ = 0.6 maximizes the likelihood of the sequence HTTHH

**MLE for the General Case**

- If there are M[1] heads and M[0] tails, the likelihood function is
  L(θ : D) = θ^M[1] (1−θ)^M[0]
- The log-likelihood is
  ℓ(θ : D) = M[1] log θ + M[0] log(1−θ)
- Differentiating and setting the derivative to zero gives
  θ̂ = M[1] / (M[1] + M[0])

**Maximum Likelihood Principle**

- We observe random variables drawn from an unknown distribution P*(X)
- The training samples are D = {ξ[1], …, ξ[M]}
- We are given a parametric model defined by P(ξ; θ) and wish to estimate the parameters
- For each choice of parameter...
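The closed-form estimate θ̂ = M[1]/(M[1]+M[0]) can be verified numerically against the log-likelihood (a minimal sketch; the grid search is just a check, and the counts come from the slides' thumbtack and HTTHH examples):

```python
import math

def log_likelihood(theta, heads, tails):
    """l(theta : D) = M[1] log(theta) + M[0] log(1 - theta)."""
    return heads * math.log(theta) + tails * math.log(1 - theta)

def mle(heads, tails):
    """Closed-form MLE: theta_hat = M[1] / (M[1] + M[0])."""
    return heads / (heads + tails)

heads, tails = 35, 65  # the thumbtack data: 35 heads in 100 tosses

# Closed-form estimate
theta_hat = mle(heads, tails)  # 0.35

# A grid search over theta in (0, 1) agrees with the closed form
grid = [i / 1000 for i in range(1, 1000)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, heads, tails))

print(theta_hat, theta_grid)
assert mle(3, 2) == 0.6  # the HTTHH sequence: 3 heads, 2 tails
```

The grid maximizer lands on 0.35, confirming that the derivative condition M[1]/θ − M[0]/(1−θ) = 0 picks out the observed fraction of heads.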
