Lecture 24: Perceptrons
Prof. Julia Hockenmaier
http://cs.illinois.edu/fa11/cs440
CS440/ECE448: Intro to Artificial Intelligence

Regression
Linear regression
Given some data $\{(x, y), \dots\}$ with $x, y \in \mathbb{R}$, find a function $f(x) = w_1 x + w_0$ such that $f(x) = y$ (as closely as possible) on the data.

Squared Loss
We want to find a weight vector $\mathbf{w}$ that minimizes the loss (error) on the training data $\{(x_1, y_1), \dots, (x_N, y_N)\}$:
$$L(\mathbf{w}) = \sum_{i=1}^{N} L_2(f_{\mathbf{w}}(x_i), y_i) = \sum_{i=1}^{N} \left(y_i - f_{\mathbf{w}}(x_i)\right)^2$$
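As a small illustration of the squared loss for the line $f_{\mathbf{w}}(x) = w_1 x + w_0$, the sketch below sums the squared residuals over a data set. The function names and the toy data are assumptions made up for this example; they do not come from the lecture.

```python
def linear_model(w0, w1, x):
    """Evaluate f_w(x) = w1 * x + w0."""
    return w1 * x + w0


def squared_loss(w0, w1, data):
    """Sum of squared errors (y_i - f_w(x_i))^2 over all (x_i, y_i) pairs."""
    return sum((y - linear_model(w0, w1, x)) ** 2 for x, y in data)


# Hypothetical toy data lying exactly on the line y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

print(squared_loss(1.0, 2.0, data))  # weights that fit exactly: 0.0
print(squared_loss(0.0, 2.0, data))  # every residual is 1, so the loss is 3.0
```

The loss is zero only when the line passes through every training point; any other choice of weights is penalized by the square of each miss.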
Linear regression
We need to minimize the loss on the training data: $\mathbf{w} = \arg\min_{\mathbf{w}} \mathrm{Loss}(f_{\mathbf{w}})$.
To do so, we set the partial derivatives of $\mathrm{Loss}(f_{\mathbf{w}})$ with respect to $w_1$ and $w_0$ to zero. For linear regression, this has a closed-form solution (see book).
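For the one-dimensional case with squared loss, setting both partial derivatives to zero gives the standard least-squares formulas $w_1 = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - (\sum x_i)^2}$ and $w_0 = \frac{\sum y_i - w_1 \sum x_i}{N}$. The sketch below applies them; the function name `fit_line` and the toy data are assumptions for illustration, not part of the lecture.

```python
def fit_line(data):
    """Closed-form least-squares fit of f(x) = w1 * x + w0 to (x, y) pairs."""
    n = len(data)
    sum_x = sum(x for x, _ in data)
    sum_y = sum(y for _, y in data)
    sum_xy = sum(x * y for x, y in data)
    sum_xx = sum(x * x for x, _ in data)

    denom = n * sum_xx - sum_x * sum_x
    if denom == 0:
        # All x values are identical, so the slope is not determined.
        raise ValueError("need at least two distinct x values")

    w1 = (n * sum_xy - sum_x * sum_y) / denom
    w0 = (sum_y - w1 * sum_x) / n
    return w0, w1


# Hypothetical toy data lying exactly on the line y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
print(fit_line(data))  # (1.0, 2.0)
```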

Gradient descent
In general, we won't be able to find a closed-form solution, so we need an iterative (local search) algorithm. We start with an initial weight vector $\mathbf{w}$ and iteratively update each element $w_i$ by moving against its gradient, so that the loss decreases:
$$w_i := w_i - \frac{\partial}{\partial w_i} \mathrm{Loss}(\mathbf{w})$$
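A minimal sketch of this update rule, applied to the squared loss of a line $f(x) = w_1 x + w_0$. Several things here are assumptions rather than part of the lecture: the gradient is scaled by a step size `alpha` (a common practical addition), the weights start at zero, the number of steps is fixed in advance, and the data is a made-up toy set.

```python
def gradient_descent(data, alpha=0.05, steps=2000):
    """Minimize the squared loss of f(x) = w1 * x + w0 by gradient descent.

    alpha (step size) and steps are illustrative choices, not prescribed values.
    """
    w0, w1 = 0.0, 0.0  # assumed initial weight vector
    for _ in range(steps):
        # Partial derivatives of sum_i (y_i - (w1 * x_i + w0))^2.
        grad_w0 = sum(-2.0 * (y - (w1 * x + w0)) for x, y in data)
        grad_w1 = sum(-2.0 * (y - (w1 * x + w0)) * x for x, y in data)
        # Move each weight against its gradient.
        w0 -= alpha * grad_w0
        w1 -= alpha * grad_w1
    return w0, w1


# Hypothetical toy data lying exactly on the line y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w0, w1 = gradient_descent(data)
print(w0, w1)  # approaches w0 = 1, w1 = 2
```

If the step size is too large, the updates overshoot and the loss grows instead of shrinking; on this toy data the value 0.05 is small enough for the iteration to converge.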
Binary classification with Naïve Bayes
For each item $\mathbf{x} = (x_1, \dots, x_d)$, we compute
$$f_k(\mathbf{x}) = P(\mathbf{x} \mid C_k)\, P(C_k) = P(C_k) \prod_i P(x_i \mid C_k)$$
for both classes $C_1$ and $C_2$.
We assign class $C_1$ to $\mathbf{x}$ if $f_1(\mathbf{x}) > f_2(\mathbf{x})$.
Equivalently, we can define a 'discriminant function' $f(\mathbf{x}) = f_1(\mathbf{x}) - f_2(\mathbf{x})$ and assign class $C_1$ to $\mathbf{x}$ if $f(\mathbf{x}) > 0$.
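The decision rule can be sketched as follows, under the assumption that every feature $x_i$ is binary and that the class priors and conditional probabilities are already known. All function names and probability values below are hypothetical, chosen only to make the example concrete.

```python
from math import prod


def class_score(x, prior, feature_probs):
    """f_k(x) = P(C_k) * prod_i P(x_i | C_k) for binary features.

    feature_probs[i] is the assumed probability P(x_i = 1 | C_k).
    """
    return prior * prod(
        p if xi == 1 else 1.0 - p for xi, p in zip(x, feature_probs)
    )


def discriminant(x, params_c1, params_c2):
    """f(x) = f_1(x) - f_2(x); each params tuple is (prior, feature_probs)."""
    return class_score(x, *params_c1) - class_score(x, *params_c2)


def classify(x, params_c1, params_c2):
    """Assign C1 if f(x) > 0; otherwise fall back to C2 (binary setting)."""
    return "C1" if discriminant(x, params_c1, params_c2) > 0 else "C2"


# Hypothetical model parameters: (P(C_k), [P(x_1 = 1 | C_k), P(x_2 = 1 | C_k)]).
params_c1 = (0.5, [0.8, 0.6])
params_c2 = (0.5, [0.3, 0.4])

print(classify((1, 0), params_c1, params_c2))  # f_1 = 0.16, f_2 = 0.09: C1
print(classify((0, 1), params_c1, params_c2))  # f_1 = 0.06, f_2 = 0.14: C2
```

With many features the product of probabilities can underflow, so practical implementations usually compare sums of log-probabilities instead; the decision is the same because the logarithm is monotonic.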

