

ESL Chapter 3 — Linear Methods for Regression
Trevor Hastie and Rob Tibshirani

[Figure: Scatterplot matrix of the Prostate Cancer data. The top variable is the response lpsa, and the top row shows its relationship with each of the others. Visible panel labels include svi and pgg45.]

> lmfit = lm(lpsa ~ ., data = lprostate)
> summary(lmfit)

Call:
lm(formula = lpsa ~ ., data = lprostate)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.180899   1.320601   0.137  0.89136
lcavol       0.564355   0.087831   6.425 6.54e-09 ***
lweight      0.622081   0.200892   3.097  0.00263 **
age         -0.021248   0.011084  -1.917  0.05848 .
lbph         0.096676   0.057915   1.669  0.09862 .
svi          0.761652   0.241173   3.158  0.00218 **
lcp         -0.106055   0.089866  -1.180  0.24112
gleason      0.049287   0.155340   0.317  0.75178
pgg45        0.004458   0.004365   1.021  0.30999

Residual standard error: 0.6995 on 88 degrees of freedom
Multiple R-squared: 0.6634, Adjusted R-squared: 0.6328
F-statistic: 21.68 on 8 and 88 DF, p-value: < 2.2e-16

The woes of (interpreting) regression coefficients
("Data Analysis and Regression", Mosteller and Tukey, 1977)

• A regression coefficient \beta_j estimates the expected change in y per unit change in x_j, with all other predictors held fixed. But predictors usually change together!

• Example: y = total amount of change in your pocket; x_1 = number of coins; x_2 = number of pennies, nickels and dimes. By itself, the regression coefficient of y on x_2 will be > 0. But how about with x_1 in the model? (A simulation sketch appears at the end of this section.)

• Example: y = number of tackles by a football player in a season; w and h are his weight and height. The fitted regression model is \hat{y} = \hat{b}_0 + 0.50\,w - 0.10\,h. How do we interpret \hat{\beta}_2 < 0?

The Bias-variance tradeoff

A good measure of the quality of an estimator \hat{f}(x) is the mean squared error. Let f_0(x) be the true value of f(x) at the point x. Then

    Mse[\hat{f}(x)] = E[\hat{f}(x) - f_0(x)]^2.

This can be written as

    Mse[\hat{f}(x)] = Var[\hat{f}(x)] + [E\hat{f}(x) - f_0(x)]^2,

which is variance plus squared bias. Typically, when bias is low, variance is high, and vice versa. Choosing estimators often involves a tradeoff between bias and variance. (A numerical check of this decomposition appears at the end of this section.)

• If the linear mo...
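The preview does not show how the data frame lprostate used in the R session above was created. One plausible reconstruction, assuming the prostate data shipped with the ElemStatLearn companion package (an assumption — the slides do not confirm this):

    library(ElemStatLearn)   # companion package to ESL; archived on CRAN, install from the archive if needed
    data(prostate)           # 97 rows: 8 predictors, the response lpsa, and a train/test indicator
    lprostate <- prostate[, names(prostate) != "train"]   # assumed: drop the train/test flag
    lmfit <- lm(lpsa ~ ., data = lprostate)
    summary(lmfit)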
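To make the pocket-change example concrete, here is a small R simulation sketch; the coin counts and their Poisson rates are invented purely for illustration. On its own, x2 predicts y positively, but once x1 (the total number of coins) is held fixed, adding a low-value coin means displacing a quarter, so the sign flips:

    set.seed(1)
    n <- 200
    pennies  <- rpois(n, 5); nickels  <- rpois(n, 3)
    dimes    <- rpois(n, 3); quarters <- rpois(n, 4)
    y  <- 0.01*pennies + 0.05*nickels + 0.10*dimes + 0.25*quarters  # total value in dollars
    x1 <- pennies + nickels + dimes + quarters   # total number of coins
    x2 <- pennies + nickels + dimes              # low-value coins only
    coef(lm(y ~ x2))        # slope on x2 is > 0
    coef(lm(y ~ x1 + x2))   # with x1 in the model, slope on x2 is < 0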

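The bias-variance decomposition above can also be checked numerically. A minimal R sketch, using an invented toy estimator (the sample mean of 20 normals, shrunk toward zero by a factor of 0.8 — a choice made only for illustration):

    set.seed(2)
    f0   <- 0.2                                          # true value f_0
    xbar <- replicate(1e5, mean(rnorm(20, mean = f0)))   # unbiased sample means
    fhat <- 0.8 * xbar                                   # shrunken, biased estimator
    mse    <- mean((fhat - f0)^2)                        # Monte Carlo estimate of Mse
    decomp <- var(fhat) + (mean(fhat) - f0)^2            # variance + squared bias
    c(mse = mse, var_plus_bias2 = decomp)                # agree up to simulation error
    mean((xbar - f0)^2)                                  # Mse of the unshrunken mean is larger here:
                                                         # the small added bias bought a bigger drop in variance

In this setup the shrunken estimator has lower Mse than the unbiased sample mean, illustrating the tradeoff the slide describes: accepting a little bias can reduce variance enough to win overall.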