Chapter 8
Timothy Hanson
Department of Statistics, University of South Carolina
Stat 704: Data Analysis I

8.1 Polynomial regression

Polynomial regression is used when the relationship between Y and the predictor(s) is curvilinear.

Example: we might add a quadratic term to a simple linear model to get a parabolic mean:

    Y_i = \beta_0 + \beta_1 x_{i1} + \beta_{11} x_{i1}^2 + \varepsilon_i.

We can no longer interpret \beta_1 and \beta_{11} as usual: we cannot hold x_{i1} constant and increase x_{i1}^2 by one unit, or vice versa!

Adding higher-order terms in PROC REG is a pain; it must be done in the DATA step. In PROC GLM you can specify a model such as

    model outcome=age chol age*age age*chol;

directly. (Both approaches are sketched in the SAS examples at the end of these notes.)

Higher-degree polynomials

The degree of a polynomial is the largest power to which the predictor is raised. The previous model is a second-degree polynomial, giving a quadratic-shaped mean function. Here is a third-degree (cubic) polynomial in one predictor:

    Y_i = \beta_0 + \beta_1 x_{i1} + \beta_{11} x_{i1}^2 + \beta_{111} x_{i1}^3 + \varepsilon_i.

A polynomial f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k can have up to k - 1 turning points or extrema (p. 296). A (k-1)th-degree polynomial can pass through k points (x_1, Y_1), \ldots, (x_k, Y_k) exactly!

General notes on fitting (p. 295)

Predictors can first be centered by subtracting the sample mean from each predictor, i.e. x_{ij}^* = x_{ij} - \bar{x}_j is used as a predictor instead of x_{ij}, where \bar{x}_j = n^{-1} \sum_{i=1}^n x_{ij}. This reduces multicollinearity among, for example, x_{i1}, x_{i1}^2, x_{i1}^3, etc. It isn't always necessary. (A centering sketch appears in the SAS examples below.)

Polynomials of degree 4 (quartic) and higher should rarely be used; cubic and lower is okay. High-degree polynomials behave unwieldily and can give extremely poor out-of-sample prediction; extrapolation is particularly dangerous (p. 294). A better option is to fit an additive model (discussed later); the degrees of freedom on the smoothers can mimic third- or fourth-degree polynomials while being better behaved.

Polynomial regression: more than one predictor

With multiple predictors and quadratic terms, cross-product terms should also be included, at least initially.

Example: quadratic regression with two predictors:

    Y_i = \beta_0 + \underbrace{\beta_1 x_{i1} + \beta_2 x_{i2}}_{\text{1st order}} + \underbrace{\beta_{11} x_{i1}^2 + \beta_{22} x_{i2}^2 + \beta_{12} x_{i1} x_{i2}}_{\text{2nd order}} + \varepsilon_i.

This is an example of a response surface, here a parabolic surface (Chapter 30!).

Hierarchical model building (p. 299) stipulates that a model containing a particular term should also contain all terms of lower order, including the cross-product terms. The degree of a cross-product term is obtained by summing the powers of each predictor; e.g., the degree of \beta_{1123} x_{i1}^2 x_{i2} x_{i3} is 2 + 1 + 1 = 4.

Hierarchical model building

When using a polynomial regression model as an approximation to the true regression function, statisticians will often fit a second-order or third-order model and then explore whether a lower-order model is adequate... With the hierarchical approach, if a polynomial term of a given order is retained, then all related terms of lower order are also...
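SAS sketch: building higher-order terms

To make the PROC REG versus PROC GLM contrast from 8.1 concrete, here is a minimal sketch. Only the variables outcome, age, and chol from the slide's model statement are taken as given; the data set name heart is a hypothetical stand-in.

    /* PROC REG: higher-order terms must be built by hand in the DATA step.
       The data set 'heart' is hypothetical. */
    data heart2;
       set heart;
       age2    = age*age;    /* quadratic term for age */
       agechol = age*chol;   /* age-by-cholesterol cross-product */
    run;

    proc reg data=heart2;
       model outcome = age chol age2 agechol;
    run;

    /* PROC GLM accepts the same terms directly, as on the slide. */
    proc glm data=heart;
       model outcome = age chol age*age age*chol;
    run;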
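SAS sketch: centering a predictor

A sketch of the centering idea from "General notes on fitting": compute the sample mean, subtract it, and build the polynomial terms from the centered predictor. Again the heart data set is hypothetical.

    /* Get the sample mean of age. */
    proc means data=heart noprint;
       var age;
       output out=mns mean=agebar;
    run;

    /* x*_ij = x_ij - xbar_j: center age, then form powers of the
       centered version to reduce multicollinearity among them. */
    data heart_c;
       if _n_ = 1 then set mns(keep=agebar); /* agebar carries to every row */
       set heart;
       agec  = age - agebar;
       agec2 = agec*agec;
       agec3 = agec*agec*agec;
    run;

    proc reg data=heart_c;
       model outcome = agec agec2 agec3;
    run;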
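SAS sketch: a second-order response surface

The two-predictor quadratic can be written out term by term in PROC GLM; PROC RSREG (not mentioned on the slides) is an alternative that generates the full second-order surface from the first-order terms alone. A sketch under the same hypothetical data:

    /* All five terms of the second-order model, written out. */
    proc glm data=heart;
       model outcome = age chol age*age chol*chol age*chol;
    run;

    /* PROC RSREG expands to the linear, quadratic, and cross-product
       terms automatically. */
    proc rsreg data=heart;
       model outcome = age chol;
    run;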
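SAS sketch: is a lower-order model adequate?

One way to "explore whether a lower-order model is adequate," as on the last slide, is a joint F-test that every second-order coefficient is zero; dropping all of them at once respects the hierarchical principle. This sketch uses PROC REG's TEST statement on hand-built terms, with the same hypothetical variable names as above.

    data heart2;
       set heart;
       age2    = age*age;
       chol2   = chol*chol;
       agechol = age*chol;
    run;

    proc reg data=heart2;
       model outcome = age chol age2 chol2 agechol;
       /* H0: beta_11 = beta_22 = beta_12 = 0, i.e. reduce to the
          first-order model in one joint test. */
       secondorder: test age2, chol2, agechol;
    run;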