ml-lecture02 - Lecture 2: Classification. Perceptron. Sigmoid classifiers.

COMP-652 Lecture 2 (September 10, 2007)

Topics: Classification problems. Error functions. Perceptron. Sigmoid classifiers.

Classification

Given a data set D ⊆ X × Y, where Y is a discrete set (usually with a smallish number of values), find a hypothesis h ∈ H which predicts the existing data well.

If Y has two possible values, e.g. Y = {-1, 1} or Y = {0, 1}, this is called binary classification.

Can we develop methods for classification as we did for regression? What does it take to develop a learning algorithm?

Recall: Three decisions

- What should the error function be?
- What should the hypothesis class be?
- How are we going to find the best hypothesis in the class (the one that minimizes the error function)?

Error functions for binary classification

One worthy goal is to minimize the number of misclassified examples.

Suppose Y = {-1, 1} and the hypotheses h_w ∈ H also output +1 or -1. An example (x, y) is misclassified if y h_w(x) is negative, so a reasonable error function simply counts the misclassified examples:

$$J(\mathbf{w}) = -\sum_{i \in \mathrm{Misclassified}} y_i \, h_{\mathbf{w}}(\mathbf{x}_i)$$

This is called the 0-1 loss: each misclassified example contributes y_i h_w(x_i) = -1, so J(w) equals the number of misclassified examples. This function is not differentiable, so often we will still use the mean-squared error.

Choosing the hypothesis class

For regression, we used linear hypotheses (simple, nice). Is there an analogue for classification? What about linear hypotheses?

Example: Wisconsin data

[Figure: scatter plot of tumor size (mm?) on the horizontal axis against the label non-recurring (0) / recurring (1) on the vertical axis.]

What is the meaning of the output in this case?

Output of a classifier

Useful predictions could be:
- The predicted class
- The probability that the example belongs to a given class

Just applying linear regression as-is gives us neither.

Perceptron

[Figure: perceptron diagram. Inputs x_0 = 1, x_1, ..., x_n with weights w_0, w_1, ..., w_n feed a summation unit computing o = \sum_{i=0}^{n} w_i x_i, followed by a threshold that outputs 1 if the sum is positive and -1 otherwise.]

We can take a linear combination and threshold it:

$$h_{\mathbf{w}}(\mathbf{x}) = \operatorname{sgn}(\mathbf{w}^T \mathbf{x}) = \begin{cases} +1 & \text{if } \mathbf{w}^T \mathbf{x} > 0 \\ -1 & \text{otherwise} \end{cases}$$

This is called a perceptron. ...
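As an illustration only (not part of the original slides), here is a minimal Python/NumPy sketch of the perceptron hypothesis and the 0-1 loss above. The function names and the toy data are hypothetical, and the bias term is folded in as a leading x_0 = 1 feature, matching the diagram:

```python
import numpy as np

def perceptron_predict(w, X):
    """Perceptron hypothesis h_w(x) = sgn(w^T x).

    Assumes each row of X already starts with the bias feature x_0 = 1,
    so w = (w_0, w_1, ..., w_n) as in the diagram. Returns +1 where
    w^T x > 0 and -1 otherwise.
    """
    return np.where(X @ w > 0, 1, -1)

def zero_one_loss(w, X, y):
    """0-1 loss from the slides: J(w) = -sum_{i in Misclassified} y_i h_w(x_i).

    With labels in {-1, +1}, y_i * h_w(x_i) is -1 exactly when example i
    is misclassified, so J(w) equals the number of misclassified examples.
    """
    margins = y * perceptron_predict(w, X)
    return int(-np.sum(margins[margins < 0]))

# Hypothetical toy data: two examples with a leading bias column of ones.
X = np.array([[1.0,  2.0,  1.0],
              [1.0, -1.0, -2.0]])
y = np.array([1, -1])
w = np.array([0.0, 1.0, 1.0])

print(perceptron_predict(w, X))   # [ 1 -1]
print(zero_one_loss(w, X, y))     # 0 (both examples correctly classified)
```

Note that J(w) is piecewise constant in w, so its gradient is zero almost everywhere; this is the non-differentiability that motivates surrogate losses such as the mean-squared error mentioned above.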