ex3_slides



CS221 Discussion Section
October 12, 2007

Least squares regression

Training set: $\{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$
Parameters: $\theta \in \mathbb{R}^n$
Hypothesis: $h_\theta(x) = \sum_{j=1}^{n} \theta_j x_j$
Parameter estimation: minimize $J(\theta) = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Estimating the bias of a coin

On each coin toss: $P(\text{head}) = p$, $P(\text{tail}) = 1 - p$.
Experiment: $m$ independent tosses with $h$ heads.
What is a good estimate for the bias of the coin?

Maximum likelihood estimation

Maximum likelihood principle: estimate parameters to make the data as likely as possible.

$$\theta_{MLE} = \arg\max_\theta P(\text{data}; \theta)$$

For the coin example:

$$P(h \text{ heads in } m \text{ tosses}) = \binom{m}{h} \, p^h (1 - p)^{m-h}$$

Applying the maximum likelihood principle:

$$p_{MLE} = \arg\max_p \binom{m}{h} \, p^h (1 - p)^{m-h} = \frac{h}{m}$$

Least squares regression

Parameter estimation: minimize $J(\theta) = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Consider the following model, i.e. $y \mid x \sim N(\mu = \theta^T x, \sigma)$:

$$P(y \mid x; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y - \theta^T x)^2}{2\sigma^2} \right)$$

Maximum likelihood estimate:

$$\theta_{MLE} = \arg\max_\theta P(y^{(1)}, \ldots, y^{(m)} \mid x^{(1)}, \ldots, x^{(m)}; \theta)$$
$$= \arg\max_\theta \log \prod_{i=1}^{m} P(y^{(i)} \mid x^{(i)}; \theta)$$
$$= \arg\max_\theta \sum_{i=1}^{m} \log P(y^{(i)} \mid x^{(i)}; \theta)$$

Logistic regression

Classification: $y \in \{0, 1\}$

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + \exp(-\theta^T x)}$$
$$P(y = 1 \mid x; \theta) = h_\theta(x)$$
$$\log P(y \mid x; \theta) = y \cdot \log h_\theta(x) + (1 - y) \cdot \log(1 - h_\theta(x))$$

Maximum likelihood estimate:

$$\theta_{MLE} = \arg\max_\theta P(y^{(1)}, \ldots, y^{(m)} \mid x^{(1)}, \ldots, x^{(m)}; \theta)$$
$$= \arg\max_\theta \log P(y^{(1)}, \ldots, y^{(m)} \mid x^{(1)}, \ldots, x^{(m)}; \theta)$$
$$= \arg\max_\theta \log \prod_{i=1}^{m} P(y^{(i)} \mid x^{(i)}; \theta)$$
$$= \arg\max_\theta \sum_{i=1}^{m} \log P(y^{(i)} \mid x^{(i)}; \theta)$$
$$= \arg\max_\theta \sum_{i=1}^{m} \left[ y^{(i)} \cdot \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \cdot \log(1 - h_\theta(x^{(i)})) \right]$$

...
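The three likelihoods in these slides can be checked numerically. The following is a minimal Python sketch, not part of the original slides: all function names (`coin_mle_grid`, `squared_error`, `gaussian_log_likelihood`, `logistic_log_likelihood`) and the grid-search approach are assumptions made for illustration. It uses only the standard library and assumes Python 3.8 or later for `math.comb`.

```python
import math


def coin_log_likelihood(p, h, m):
    """Log of P(h heads in m tosses) for a coin with bias p, 0 < p < 1."""
    return (math.log(math.comb(m, h))
            + h * math.log(p)
            + (m - h) * math.log(1.0 - p))


def coin_mle_grid(h, m, steps=1000):
    """Illustrative grid search for the bias maximizing the log-likelihood."""
    grid = [k / steps for k in range(1, steps)]  # skip p = 0 and p = 1
    return max(grid, key=lambda p: coin_log_likelihood(p, h, m))


def dot(theta, x):
    """theta^T x for two equal-length sequences."""
    return sum(t * xj for t, xj in zip(theta, x))


def squared_error(theta, xs, ys):
    """J(theta) = sum_i (h_theta(x_i) - y_i)^2 with h_theta(x) = theta^T x."""
    return sum((dot(theta, x) - y) ** 2 for x, y in zip(xs, ys))


def gaussian_log_likelihood(theta, xs, ys, sigma=1.0):
    """sum_i log P(y_i | x_i; theta) under y | x ~ N(theta^T x, sigma)."""
    log_norm = math.log(math.sqrt(2.0 * math.pi) * sigma)
    return sum(-log_norm - (y - dot(theta, x)) ** 2 / (2.0 * sigma ** 2)
               for x, y in zip(xs, ys))


def sigmoid(z):
    """g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_log_likelihood(theta, xs, ys):
    """sum_i [y_i log h(x_i) + (1 - y_i) log(1 - h(x_i))], y_i in {0, 1}."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = sigmoid(dot(theta, x))
        total += y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return total


if __name__ == "__main__":
    # Made-up data, used only to exercise the functions.
    xs = [[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]]
    ys = [1.0, 0.0, 2.0]
    theta = [0.3, -0.2]
    print("coin MLE (grid):", coin_mle_grid(h=7, m=10))
    print("J(theta):", squared_error(theta, xs, ys))
    print("Gaussian log-likelihood:", gaussian_log_likelihood(theta, xs, ys))
    print("logistic log-likelihood:", logistic_log_likelihood(theta, xs, [1, 0, 1]))
```

For a fixed sigma, the Gaussian log-likelihood equals a constant minus J(theta) / (2 sigma^2), so maximizing it over theta is the same as minimizing the squared error. That is the connection between the least squares and maximum likelihood slides. The grid search is only a sanity check on the closed form p = h/m.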

This note was uploaded on 11/30/2009 for the course CS 221 taught by Professors Koller and Ng during the Winter '09 term at Stanford.
