CS221 Discussion Section
October 12, 2007

Least squares regression

Training set: $\{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$
Parameters: $\theta \in \mathbb{R}^n$
Hypothesis: $h_\theta(x) = \sum_{j=1}^{n} \theta_j x_j$
Parameter estimation: Minimize
\[ J(\theta) = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \]

Estimating the bias of a coin
On each coin toss: $P(\text{head}) = p$, $P(\text{tail}) = 1 - p$.
Experiment: $m$ independent tosses with $h$ heads.
What is a good estimate for the bias of the coin?

Maximum likelihood estimation
Maximum likelihood principle: estimate parameters to make the data as likely as possible.
\[ \theta_{\text{MLE}} = \arg\max_{\theta} P(\text{data}; \theta) \]
For the coin example:
\[ P(h \text{ heads in } m \text{ tosses}) = \binom{m}{h} \, p^{h} (1 - p)^{m-h} \]
Applying the maximum likelihood principle:
\[ p_{\text{MLE}} = \arg\max_{p} \binom{m}{h} \, p^{h} (1 - p)^{m-h} = \frac{h}{m} \]
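The result $p_{\text{MLE}} = h/m$ can be checked numerically. The following is a minimal sketch, not part of the original slides: it evaluates the binomial likelihood on a grid of candidate values of $p$ and picks the largest. The function names, the grid resolution, and the example counts (7 heads in 10 tosses) are assumptions chosen for illustration.

```python
import math

def binomial_likelihood(p, h, m):
    """P(h heads in m tosses) for a coin with P(head) = p."""
    return math.comb(m, h) * p**h * (1 - p)**(m - h)

def grid_mle(h, m, steps=1000):
    """Return the grid point in [0, 1] with the highest likelihood."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda p: binomial_likelihood(p, h, m))

# Hypothetical experiment: 7 heads in 10 tosses.
heads, tosses = 7, 10
p_hat = grid_mle(heads, tosses)

# The grid maximizer should agree with the closed form h / m.
print(p_hat, heads / tosses)
```

A grid search is used here only because it mirrors the arg max in the slide directly; setting the derivative of the log-likelihood to zero gives the same answer in closed form.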
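Least squares regression

Parameter estimation: Minimize
\[ J(\theta) = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \]
Consider the following model:
\[ P(y \mid x; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y - \theta^T x)^2}{2\sigma^2} \right), \]
that is, $y \mid x \sim \mathcal{N}(\mu = \theta^T x, \sigma)$.
Maximum likelihood estimate:
\[ \theta_{\text{MLE}} = \arg\max_{\theta} P(y^{(1)}, \ldots, y^{(m)} \mid x^{(1)}, \ldots, x^{(m)}; \theta) \]

Least squares regression (continued)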
With the same objective $J(\theta)$ and the same Gaussian model $P(y \mid x; \theta) = \mathcal{N}(\mu = \theta^T x, \sigma)$, the maximum likelihood estimate is
\begin{align*}
\theta_{\text{MLE}} &= \arg\max_{\theta} P(y^{(1)}, \ldots, y^{(m)} \mid x^{(1)}, \ldots, x^{(m)}; \theta) \\
&= \arg\max_{\theta} \log \prod_{i=1}^{m} P(y^{(i)} \mid x^{(i)}; \theta) \\
&= \arg\max_{\theta} \sum_{i=1}^{m} \log P(y^{(i)} \mid x^{(i)}; \theta)
\end{align*}
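Substituting the Gaussian density into the sum of log-probabilities gives a constant minus $J(\theta) / (2\sigma^2)$, so maximizing the likelihood is the same as minimizing $J(\theta)$. The sketch below illustrates this on synthetic data. It is not from the slides: the data-generating line $y = 1 + 2u$ plus noise, the feature layout $x = (1, u)$, the step size, and all function names are assumptions made for the example, and plain gradient descent stands in for whatever solver one would actually use.

```python
import math
import random

def h(theta, x):
    """Hypothesis h_theta(x) = sum_j theta_j * x_j."""
    return sum(t * xj for t, xj in zip(theta, x))

def J(theta, xs, ys):
    """Sum of squared errors over the training set."""
    return sum((h(theta, x) - y) ** 2 for x, y in zip(xs, ys))

def log_likelihood(theta, xs, ys, sigma):
    """Sum over i of log N(y_i; mu = theta^T x_i, sigma)."""
    m = len(ys)
    const = -m * math.log(math.sqrt(2 * math.pi) * sigma)
    return const - J(theta, xs, ys) / (2 * sigma ** 2)

def minimize_J(xs, ys, lr=0.1, iters=2000):
    """Plain gradient descent on J(theta) / m (same minimizer as J)."""
    m, n = len(ys), len(xs[0])
    theta = [0.0] * n
    for _ in range(iters):
        grad = [0.0] * n
        for x, y in zip(xs, ys):
            err = h(theta, x) - y
            for j in range(n):
                grad[j] += 2 * err * x[j] / m
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

# Synthetic data (hypothetical): y = 1 + 2u + Gaussian noise, with x = (1, u).
random.seed(0)
sigma = 0.5
xs = [(1.0, random.uniform(-1, 1)) for _ in range(100)]
ys = [1.0 + 2.0 * x[1] + random.gauss(0, sigma) for x in xs]

theta_hat = minimize_J(xs, ys)
theta_other = [theta_hat[0] + 0.1, theta_hat[1] - 0.1]  # a perturbed parameter vector

# The minimizer of J should also have the higher Gaussian log-likelihood.
print(J(theta_hat, xs, ys) < J(theta_other, xs, ys))
print(log_likelihood(theta_hat, xs, ys, sigma) > log_likelihood(theta_other, xs, ys, sigma))
```

Note that the value of $\sigma$ changes the log-likelihood but not which $\theta$ maximizes it, which is why $\sigma$ does not appear in $J(\theta)$.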
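Logistic regression

Classification: $y \in \{0, 1\}$
\[ P(y = 1 \mid x; \theta) = h_\theta(x) = g(\theta^T x) = \frac{1}{1 + \exp(-\theta^T x)} \]
\[ \log P(y \mid x; \theta) = y \cdot \log h_\theta(x) + (1 - y) \cdot \log(1 - h_\theta(x)) \]
Maximum likelihood estimate:
\begin{align*}
\theta_{\text{MLE}} &= \arg\max_{\theta} P(y^{(1)}, \ldots, y^{(m)} \mid x^{(1)}, \ldots, x^{(m)}; \theta) \\
&= \arg\max_{\theta} \log P(y^{(1)}, \ldots, y^{(m)} \mid x^{(1)}, \ldots, x^{(m)}; \theta) \\
&= \arg\max_{\theta} \log \prod_{i=1}^{m} P(y^{(i)} \mid x^{(i)}; \theta) \\
&= \arg\max_{\theta} \sum_{i=1}^{m} \log P(y^{(i)} \mid x^{(i)}; \theta) \\
&= \arg\max_{\theta} \sum_{i=1}^{m} \left[ y^{(i)} \cdot \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \cdot \log(1 - h_\theta(x^{(i)})) \right]
\end{align*}
...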
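The logistic regression log-likelihood has no closed-form maximizer, so it is usually maximized numerically. The following is a minimal sketch using batch gradient ascent, which is one possible choice and is not stated in the slides. The gradient of the log-likelihood with respect to $\theta_j$ is $\sum_i (y^{(i)} - h_\theta(x^{(i)})) x_j^{(i)}$. The tiny dataset, the step size, the iteration count, and the function names are all assumptions for illustration; the first feature is fixed to 1 to act as an intercept.

```python
import math

def g(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x) = P(y = 1 | x; theta)."""
    return g(sum(t * xj for t, xj in zip(theta, x)))

def log_likelihood(theta, xs, ys):
    """Sum over i of y_i log h(x_i) + (1 - y_i) log(1 - h(x_i))."""
    return sum(y * math.log(h(theta, x)) + (1 - y) * math.log(1 - h(theta, x))
               for x, y in zip(xs, ys))

def gradient_ascent(xs, ys, lr=0.1, iters=1000):
    """Maximize the log-likelihood with batch gradient ascent."""
    n = len(xs[0])
    theta = [0.0] * n
    for _ in range(iters):
        grad = [0.0] * n
        for x, y in zip(xs, ys):
            err = y - h(theta, x)          # (y - h_theta(x))
            for j in range(n):
                grad[j] += err * x[j]      # d/d theta_j of the log-likelihood
        theta = [t + lr * gr for t, gr in zip(theta, grad)]
    return theta

# Hypothetical, non-separable toy data with x = (1, u).
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, -0.5), (1.0, 0.0),
      (1.0, 0.5), (1.0, 1.0), (1.0, 2.0)]
ys = [0, 0, 1, 0, 1, 1, 1]

theta_hat = gradient_ascent(xs, ys)

# The fitted parameters should beat the all-zeros starting point.
print(log_likelihood(theta_hat, xs, ys) > log_likelihood([0.0, 0.0], xs, ys))
print(h(theta_hat, (1.0, 2.0)), h(theta_hat, (1.0, -2.0)))
```

The toy labels overlap on purpose: with perfectly separable data the likelihood keeps increasing as $\theta$ grows, and this simple loop would not settle on a finite estimate.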
This note was uploaded on 11/30/2009 for the course CS 221 taught by Professors Koller and Ng during the Winter '09 term at Stanford.