CSC 411 / CSC D11    Linear Regression

2 Linear Regression

In regression, our goal is to learn a mapping from one real-valued space to another. Linear regression is the simplest form of regression: it is easy to understand, often quite effective, and very efficient to learn and use.

2.1 The 1D case

We will start by considering linear regression in just one dimension. Here, our goal is to learn a mapping y = f(x), where x and y are both real-valued scalars (i.e., x ∈ R, y ∈ R). We will take f to be a linear function of the form:

    y = wx + b    (1)

where w is a weight and b is a bias. These two scalars are the parameters of the model, which we would like to learn from training data. In particular, we wish to estimate w and b from the N training pairs {(x_i, y_i)}_{i=1}^N. Then, once we have values for w and b, we can compute the y for a new x.

Given two data points (i.e., N = 2), we can solve exactly for the unknown slope w and offset b. (How would you formulate this solution?) Unfortunately, this approach is extremely sensitive to noise in the training data measurements, so you cannot usually trust the resulting model. Instead, we can find much better models when the two parameters are estimated from larger data sets. When N > 2 we will not be able to find unique parameter values for which y_i = w x_i + b for all i, since we have many more constraints than parameters. The best we can hope for is to find the parameters that minimize the residual errors, i.e., y_i − (w x_i + b).

The most commonly-used way to estimate the parameters is least-squares regression. We define an energy function (a.k.a. objective function):

    E(w, b) = Σ_{i=1}^{N} (y_i − (w x_i + b))^2    (2)

To estimate w and b, we solve for the w and b that minimize this objective function. This can be done by setting the derivatives to zero and solving.
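As a sketch of the procedure described above, the following Python function computes the minimizing w and b in closed form. Setting dE/db = 0 gives b = ȳ − w x̄, and substituting this into dE/dw = 0 yields the familiar ratio of centered sums. (The function and variable names here are our own, and the data set is a made-up illustration, not from the course.)

```python
def fit_linear_1d(xs, ys):
    """Return (w, b) minimizing E(w, b) = sum_i (y_i - (w*x_i + b))^2."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # dE/db = 0 gives b = y_mean - w * x_mean; substituting into
    # dE/dw = 0 gives w as a ratio of centered sums:
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    w = num / den
    b = y_mean - w * x_mean
    return w, b

# Noisy measurements roughly following y = 2x + 1:
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
w, b = fit_linear_1d(xs, ys)  # w ≈ 1.99, b ≈ 1.04
```

Note that with N > 2 noisy points the fit recovers parameters close to, but not exactly equal to, the ones that generated the data, which is exactly the behavior the text anticipates.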
This note was uploaded on 11/09/2010 for the course CSCD11, taught by Professor David Fleet during the Spring '10 term at the University of Toronto.