R-VariableSelection

Stat 425 Variable Selection

> loc <- "http://www.stat.umn.edu/~sandy/courses/8051/data/hald.txt"
> hald <- read.table(url(loc), header = TRUE)
> summary(m1 <- lm(Y ~ X1 + X2 + X3 + X4, hald))

Call:
lm(formula = Y ~ X1 + X2 + X3 + X4, data = hald)

Residuals:
    Min      1Q  Median      3Q     Max
-3.1750 -1.6709  0.2508  1.3783  3.9254

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  62.4054    70.0710   0.891   0.3991
X1            1.5511     0.7448   2.083   0.0708
X2            0.5102     0.7238   0.705   0.5009
X3            0.1019     0.7547   0.135   0.8959
X4           -0.1441     0.7091  -0.203   0.8441
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.446 on 8 degrees of freedom
Multiple R-squared:  0.9824,	Adjusted R-squared:  0.9736
F-statistic: 111.5 on 4 and 8 DF,  p-value: 4.756e-07

> summary(m2 <- lm(Y ~ X1 + X2, hald))

Call:
lm(formula = Y ~ X1 + X2, data = hald)

Residuals:
   Min     1Q Median     3Q    Max
-2.893 -1.574 -1.302  1.362  4.048

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 52.57735    2.28617   23.00 5.46e-10 ***
X1           1.46831    0.12130   12.11 2.69e-07 ***
X2           0.66225    0.04585   14.44 5.03e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.406 on 10 degrees of freedom
Multiple R-squared:  0.9787,	Adjusted R-squared:  0.9744
F-statistic: 229.5 on 2 and 10 DF,  p-value: 4.407e-09

# Calculate AIC
> extractAIC(m1)
[1]  5.00000 26.94429
> extractAIC(m2)
[1]  3.00000 25.41999
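For a linear model, the second number returned by extractAIC() is n log(RSS/n) + 2p, where p is the first number (the equivalent degrees of freedom, here the number of estimated coefficients). As a rough check, the sketch below rebuilds both values from the rounded residual standard errors printed in the summaries, using RSS = sigma^2 * (residual df). The helper name aic_by_hand is hypothetical and not part of R, and because the inputs are rounded the results should agree with extractAIC() only to about two decimal places.

```r
# Hypothetical helper: AIC as extractAIC() computes it for an lm fit,
# rebuilt from the residual standard error and residual df.
aic_by_hand <- function(sigma, df_resid, n) {
  p   <- n - df_resid          # number of estimated coefficients (edf)
  rss <- sigma^2 * df_resid    # residual sum of squares
  n * log(rss / n) + 2 * p     # n log(RSS/n) + 2p
}

n <- 13                        # 8 residual df + 5 coefficients in m1

# Inputs are the rounded values from the summary() output, so the
# results are approximate.
aic_m1 <- aic_by_hand(2.446, 8, n)    # full model: X1 + X2 + X3 + X4
aic_m2 <- aic_by_hand(2.406, 10, n)   # reduced model: X1 + X2

print(c(m1 = aic_m1, m2 = aic_m2))
```

The smaller AIC belongs to m2, so by this criterion the two-predictor model is preferred over the full model.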

