8.4.2-LearningAsOptimization

Machine Learning: Learning as Optimization
Sargur Srihari, srihari@cedar.buffalo.edu
Topics in Learning as Optimization
• Evaluation of the Learned Model
• Empirical Risk and Overfitting
  – Bias vs. Variance Trade-off
  – Design and Evaluation of Learning Procedures
  – Goodness of Fit
  – PAC Bounds
• Discriminative versus Generative Training
• Learning Tasks
  – Model
  – Constraints
  – Data Observability
• Taxonomy of Learning Tasks
Optimization Viewpoint
• We have:
  1. A hypothesis space: the set of candidate models (e.g., the set of all BNs or MNs over a given set of variables)
  2. An objective function: a criterion for quantifying preference over models
• Learning task: find a high-scoring model within the model class
• Learning as optimization is the predominant approach
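As a toy illustration of this viewpoint (a hedged sketch with made-up data, not an example from the slides), a tiny Bernoulli model class can be searched by scoring every candidate with a log-likelihood objective and keeping the highest-scoring model:

```python
import math

# Made-up binary observations (six 1s, two 0s); not data from the slides.
data = [1, 1, 0, 1, 0, 1, 1, 1]

def log_likelihood(p, data):
    """Log-likelihood of i.i.d. Bernoulli(p) data: the objective function."""
    return sum(math.log(p if x == 1 else 1 - p) for x in data)

# Hypothesis space: Bernoulli models with parameter p on a coarse grid.
hypothesis_space = [i / 100 for i in range(1, 100)]

# Learning task: find the high-scoring model within the model class.
best_p = max(hypothesis_space, key=lambda p: log_likelihood(p, data))
print(best_p)  # 0.75, the fraction of 1s in the data
```

Real graphical-model learning searches a far richer space (structures and parameters), but the pattern is the same: enumerate or search candidates, score each with the objective, return the best.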
Criteria for Optimization
• Numerical criteria (loss functions) that we optimize:
1. Density Estimation
   • Relative entropy, or K-L divergence, the expected value of the log-difference:
     D(P* || P) = E_{ξ~P*}[ log ( P*(ξ) / P(ξ) ) ]
   • Minimizing it is equivalent to minimizing the empirical risk over the instances in D, called the empirical log-loss:
     -(1/|D|) Σ_{m=1}^{M} log P(ξ[m] : M)
2. Classification
   • Classification error (0/1 loss): E_{(x,y)~P*}[ 1{ h_P(x) ≠ y } ]
   • Hamming loss
   • Conditional log-likelihood: E_{(x,y)~P*}[ log P(y | x) ]
• In each case, learning can be viewed as optimization of the chosen criterion
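The density-estimation criteria above can be sketched numerically on a small discrete domain; the two distributions and the sample below are illustrative assumptions, not values from the slides:

```python
import math

# Hypothetical "true" distribution P* and candidate model P on domain {a,b,c}.
p_star  = {'a': 0.5, 'b': 0.3, 'c': 0.2}
p_model = {'a': 0.4, 'b': 0.4, 'c': 0.2}

# Relative entropy (K-L divergence): D(P*||P) = E_{xi~P*}[log(P*(xi)/P(xi))]
kl = sum(p_star[x] * math.log(p_star[x] / p_model[x]) for x in p_star)

# Empirical log-loss over a made-up sample D: -(1/|D|) sum_m log P(xi[m])
D = ['a', 'a', 'b', 'c', 'a', 'b']
log_loss = -sum(math.log(p_model[x]) for x in D) / len(D)

print(round(kl, 4), round(log_loss, 4))
```

Note that K-L divergence needs P* itself, which is unknown in practice; the empirical log-loss replaces the expectation under P* by an average over the observed sample.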
Why Discuss Optimization?
• Different choices of objective function have ramifications for the results of learning procedures
• The choice also has implications for later discussions of learning
Empirical Distribution and Log-loss
• We have an unknown distribution P*
• Empirical distribution: P̂_D(A) = (1/M) Σ_{m=1}^{M} 1{ ξ[m] ∈ A }, where ξ[1], ξ[2], … is a sequence of i.i.d. samples from P*
  – The probability of an event A is the fraction of samples that satisfy A
  – For a sufficiently large training set D, P̂_D will be quite close to P*: lim_{M→∞} P̂_{D_M}(A) = P*(A)
• Consider the empirical log-loss -(1/|D|) Σ_{m=1}^{M} log P(ξ[m] : M)
  – It can be shown that the distribution that maximizes the likelihood of the data set D is the empirical distribution
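A small sketch of the empirical distribution, using a made-up sample D; it also spot-checks the claim above by confirming that P̂_D out-scores one alternative distribution on the likelihood of D (a single comparison, not a proof):

```python
import math
from collections import Counter

# Made-up i.i.d. sample; the domain and the alternative below are assumptions.
D = ['a', 'a', 'b', 'a', 'c', 'b']
M = len(D)

# Empirical distribution: each value gets the fraction of samples equal to it.
p_hat = {x: c / M for x, c in Counter(D).items()}

def log_likelihood(p):
    """Log-likelihood of the sample D under distribution p."""
    return sum(math.log(p[x]) for x in D)

other = {'a': 0.4, 'b': 0.4, 'c': 0.2}  # an arbitrary competing distribution
assert log_likelihood(p_hat) > log_likelihood(other)
print(p_hat['a'])  # 0.5: three of the six samples are 'a'
```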
Data Requirements
• How many data samples are needed?
• Consider two cases:
  1. 100 binary random variables
  2. A Bayesian network in which a node has k parents
• In both cases we will see that the hypothesis space is too large for the available data
Data Requirements with Binary Variables
• Consider 100 binary variables: there are 2^100 possible joint assignments
• If D has 1,000 instances (most likely all distinct), the empirical distribution assigns probability 0.001 to each observed assignment and 0 to the remaining 2^100 − 1,000
• The example is extreme, but the phenomenon is general
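The counting argument can be checked directly; the 1e-27 threshold below is just an illustrative bound, not a figure from the slides:

```python
# With 100 binary variables there are 2**100 joint assignments, so 1,000
# (most likely distinct) samples leave almost every assignment with
# empirical probability exactly zero.
n_assignments = 2 ** 100
n_samples = 1000

# Fraction of the assignment space that is ever observed at all.
covered = n_samples / n_assignments

print(n_assignments)    # 1267650600228229401496703205376 (about 1.27e30)
print(covered < 1e-27)  # True: essentially all mass sits on 1,000 points
```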
Data Requirements with a Bayesian Network
• M* is a Bayesian network with a variable 'Fever'
• Fever has many parents (diseases)
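Assuming a tabular CPD for a binary Fever node with k binary disease parents (an assumption, since the slide is cut off by the preview), the number of parameters to estimate grows exponentially in the fan-in k, which is what drives the data requirement up:

```python
# Sketch: a tabular CPD P(Fever | Disease_1, ..., Disease_k) needs one
# Bernoulli parameter per joint assignment of the k binary parents.
def cpd_params(k):
    """Number of free parameters in P(Fever | k binary parents)."""
    return 2 ** k

for k in [2, 5, 10, 20]:
    print(k, cpd_params(k))  # 4, 32, 1024, 1048576 rows respectively
```

Each of the 2^k parent configurations needs enough samples to estimate its own conditional probability, so even a modest fan-in can outstrip a realistic data set.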