# 3 Linear Methods for Regression


## 3.1 Introduction

A linear regression model assumes that the regression function $\mathrm{E}(Y \mid X)$ is linear in the inputs $X_1, \ldots, X_p$. Linear models were largely developed in the precomputer age of statistics, but even in today's computer era there are still good reasons to study and use them. They are simple and often provide an adequate and interpretable description of how the inputs affect the output. For prediction purposes they can sometimes outperform fancier nonlinear models, especially in situations with small numbers of training cases, low signal-to-noise ratio or sparse data. Finally, linear methods can be applied to transformations of the inputs and this considerably expands their scope. These generalizations are sometimes called basis-function methods, and are discussed in Chapter 5.

In this chapter we describe linear methods for regression, while in the next chapter we discuss linear methods for classification. On some topics we go into considerable detail, as it is our firm belief that an understanding of linear methods is essential for understanding nonlinear ones. In fact, many nonlinear techniques are direct generalizations of the linear methods discussed here.

© Springer Science+Business Media, LLC 2009. T. Hastie et al., *The Elements of Statistical Learning*, Second Edition. DOI: 10.1007/b94608_3

## 3.2 Linear Regression Models and Least Squares

As introduced in Chapter 2, we have an input vector $X^T = (X_1, X_2, \ldots, X_p)$, and want to predict a real-valued output $Y$. The linear regression model has the form

$$f(X) = \beta_0 + \sum_{j=1}^{p} X_j \beta_j. \tag{3.1}$$

The linear model either assumes that the regression function $\mathrm{E}(Y \mid X)$ is linear, or that the linear model is a reasonable approximation. Here the $\beta_j$'s are unknown parameters or coefficients, and the variables $X_j$ can come from different sources:

- quantitative inputs;
- transformations of quantitative inputs, such as log, square-root or square;
- basis expansions, such as $X_2 = X_1^2$, $X_3 = X_1^3$, leading to a polynomial representation;
- numeric or "dummy" coding of the levels of qualitative inputs. For example, if $G$ is a five-level factor input, we might create $X_j$, $j = 1, \ldots, 5$, such that $X_j = I(G = j)$. Together this group of $X_j$ represents the effect of $G$ by a set of level-dependent constants, since in $\sum_{j=1}^{5} X_j \beta_j$, one of the $X_j$'s is one, and the others are zero;
- interactions between variables, for example, $X_3 = X_1 \cdot X_2$.

No matter the source of the $X_j$, the model is linear in the parameters.

Typically we have a set of training data $(x_1, y_1), \ldots, (x_N, y_N)$ from which to estimate the parameters $\beta$. Each $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^T$ is a vector of feature measurements for the $i$th case. The most popular estimation method is least squares, in which we pick the coefficients $\beta = (\beta_0, \beta_1, \ldots, \beta_p)^T$ to minimize the residual sum of squares

$$\mathrm{RSS}(\beta) = \sum_{i=1}^{N} \bigl(y_i - f(x_i)\bigr)^2 = \sum_{i=1}^{N} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Bigr)^2. \tag{3.2}$$

From a statistical point of view, this criterion is reasonable if the training observations $(x_i, y_i)$ represent independent random draws from their population. Even if the $x_i$
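The least-squares procedure above can be sketched numerically: a minimal example, using synthetic data (the data and true coefficients are assumptions for illustration, not from the text), that fits model (3.1) by minimizing the RSS in (3.2). `np.linalg.lstsq` solves exactly this minimization, $\min_\beta \lVert y - \mathbf{X}\beta \rVert^2$, once an intercept column is appended so $\beta_0$ is estimated jointly with the other coefficients.

```python
import numpy as np

# Synthetic training data (illustrative assumption): N cases, p inputs,
# generated from a true linear model plus Gaussian noise.
rng = np.random.default_rng(0)
N, p = 100, 3
X = rng.normal(size=(N, p))
true_beta = np.array([2.0, -1.0, 0.5])
y = 1.5 + X @ true_beta + rng.normal(scale=0.1, size=N)

# Augment with a column of ones so the intercept beta_0 is part of beta.
X1 = np.column_stack([np.ones(N), X])

# lstsq minimizes ||y - X1 @ beta||^2, i.e. RSS(beta) from (3.2).
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Residual sum of squares at the fitted coefficients.
rss = np.sum((y - X1 @ beta_hat) ** 2)
```

With low noise and $N \gg p$, `beta_hat` recovers the generating coefficients closely; Section 3.2 goes on to derive this solution in closed form via the normal equations.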