# 4 Linear Methods for Classification

## 4.1 Introduction

In this chapter we revisit the classification problem and focus on linear methods for classification. Since our predictor $G(x)$ takes values in a discrete set $\mathcal{G}$, we can always divide the input space into a collection of regions labeled according to the classification. We saw in Chapter 2 that the boundaries of these regions can be rough or smooth, depending on the prediction function. For an important class of procedures, these decision boundaries are linear; this is what we will mean by linear methods for classification.

There are several different ways in which linear decision boundaries can be found. In Chapter 2 we fit linear regression models to the class indicator variables, and classify to the largest fit. Suppose there are $K$ classes, for convenience labeled $1, 2, \ldots, K$, and the fitted linear model for the $k$th indicator response variable is $\hat{f}_k(x) = \hat{\beta}_{k0} + \hat{\beta}_k^T x$. The decision boundary between class $k$ and $\ell$ is that set of points for which $\hat{f}_k(x) = \hat{f}_\ell(x)$, that is, the set $\{x : (\hat{\beta}_{k0} - \hat{\beta}_{\ell 0}) + (\hat{\beta}_k - \hat{\beta}_\ell)^T x = 0\}$, an affine set or hyperplane.¹ Since the same is true for any pair of classes, the input space is divided into regions of constant classification, with piecewise hyperplanar decision boundaries. This regression approach is a member of a class of methods that model discriminant functions $\delta_k(x)$ for each class, and then classify $x$ to the class with the largest value for its discriminant function. Methods that model the posterior probabilities $\Pr(G = k \mid X = x)$ are also in this class.

¹ Strictly speaking, a hyperplane passes through the origin, while an affine set need not. We sometimes ignore the distinction and refer in general to hyperplanes.

© Springer Science+Business Media, LLC 2009. T. Hastie et al., *The Elements of Statistical Learning*, Second Edition. DOI: 10.1007/b94608_4
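The indicator-regression scheme described above can be sketched numerically. A minimal sketch follows; the toy data, class means, and variable names are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Hypothetical 3-class toy data: 20 points per class in R^2,
# centered at well-separated (non-collinear) means.
rng = np.random.default_rng(0)
means = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
X = np.vstack([rng.normal(m, 0.5, size=(20, 2)) for m in means])
y = np.repeat([0, 1, 2], 20)

K = 3
Y = np.eye(K)[y]                              # N x K indicator response matrix
Xa = np.hstack([np.ones((len(X), 1)), X])     # prepend an intercept column

# Least-squares fit of each indicator column: Beta is (p+1) x K,
# so column k holds (beta_k0, beta_k) for f_hat_k(x) = beta_k0 + beta_k^T x.
Beta, *_ = np.linalg.lstsq(Xa, Y, rcond=None)

# Classify each x to the class with the largest fitted value f_hat_k(x).
fits = Xa @ Beta
pred = fits.argmax(axis=1)
print("training accuracy:", (pred == y).mean())  # near 1.0 on this separated data
```

Because the three class means form a triangle rather than a line, this example avoids the "masking" problem that can afflict indicator regression with three or more classes.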
Clearly, if either the $\delta_k(x)$ or $\Pr(G = k \mid X = x)$ are linear in $x$, then the decision boundaries will be linear. Actually, all we require is that some monotone transformation of $\delta_k$ or $\Pr(G = k \mid X = x)$ be linear for the decision boundaries to be linear. For example, if there are two classes, a popular model for the posterior probabilities is

$$
\Pr(G = 1 \mid X = x) = \frac{\exp(\beta_0 + \beta^T x)}{1 + \exp(\beta_0 + \beta^T x)}, \qquad
\Pr(G = 2 \mid X = x) = \frac{1}{1 + \exp(\beta_0 + \beta^T x)}. \tag{4.1}
$$

Here the monotone transformation is the *logit* transformation, $\log[p/(1-p)]$, and in fact we see that

$$
\log \frac{\Pr(G = 1 \mid X = x)}{\Pr(G = 2 \mid X = x)} = \beta_0 + \beta^T x. \tag{4.2}
$$

The decision boundary is the set of points for which the log-odds are zero, and this is a hyperplane defined by $\{x \mid \beta_0 + \beta^T x = 0\}$. We discuss two very popular but different methods that result in linear log-odds or logits: linear discriminant analysis and linear logistic regression. Although they differ in their derivation, the essential difference between them is in the way the...
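Equations (4.1) and (4.2) can be checked numerically. In this sketch the coefficients $\beta_0$ and $\beta$ are arbitrary illustrative values, not fitted or taken from the text:

```python
import numpy as np

# Arbitrary illustrative coefficients for the two-class logistic model.
beta0 = -1.0
beta = np.array([2.0, -0.5])

def posterior_class1(x):
    # Eq. (4.1): Pr(G=1 | X=x) = exp(beta0 + beta^T x) / (1 + exp(beta0 + beta^T x))
    z = beta0 + beta @ x
    return np.exp(z) / (1.0 + np.exp(z))

def log_odds(x):
    # The logit of the posterior: log[ Pr(G=1|x) / Pr(G=2|x) ]
    p1 = posterior_class1(x)
    return np.log(p1 / (1.0 - p1))

# Eq. (4.2): the log-odds equal the linear function beta0 + beta^T x.
x = np.array([1.0, 1.0])
print(log_odds(x), beta0 + beta @ x)   # the two agree (up to float rounding)

# A point on the decision boundary {x : beta0 + beta^T x = 0}
# has posterior probability exactly 1/2 for each class.
x_boundary = np.array([0.5, 0.0])      # here beta0 + beta^T x = -1 + 1 = 0
print(posterior_class1(x_boundary))
```

Passing the log-odds through the logistic function and back through the logit recovers the linear form, which is exactly why the boundary $\{x : \beta_0 + \beta^T x = 0\}$ is a hyperplane.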