# Predict matrix of predictors x


    # crossval() comes from the bootstrap package; theta.fit and fit are
    # assumed to be defined earlier in the notes
    theta.predict <- function(fit, x) { cbind(1, x) %*% fit$coef }

    # matrix of predictors
    X <- as.matrix(mydata[c("x1", "x2", "x3")])
    # vector of observed responses
    y <- as.matrix(mydata[c("y")])

    results <- crossval(X, y, theta.fit, theta.predict, ngroup = 10)
    cor(y, fit$fitted.values)^2  # raw R2
    cor(y, results$cv.fit)^2     # cross-validated R2

## Variable Selection

Selecting a subset of predictor variables from a larger set (e.g., stepwise selection) is a controversial topic. You can perform stepwise selection (forward, backward, or both) using the stepAIC( ) function from the MASS package. stepAIC( ) performs stepwise model selection by exact AIC.

    # Stepwise Regression
    library(MASS)
    fit <- lm(y ~ x1 + x2 + x3, data = mydata)
    step <- stepAIC(fit, direction = "both")
    step$anova  # display results

Alternatively, you can perform all-subsets regression using the regsubsets( ) function from the leaps package. In the following code, nbest indicates the number of subsets of each size to report. Here, the ten best models will be reported for each subset size (1 predictor, 2 predictors, etc.).

    # All Subsets Regression
    library(leaps)
    leaps <- regsubsets(y ~ x1 + x2 + x3 + x4, data = mydata, nbest = 10)
    # view results
    summary(leaps)
    # plot a table of models showing variables in each model;
    # models are ordered by the selection statistic
    plot(leaps, scale = "r2")
    # plot statistic by subset size
    library(car)
    subsets(leaps, statistic = "rsq")

Other options for the scale argument of plot( ) are bic, Cp, and adjr2. Other options for the statistic argument of subsets( ) are bic, cp, adjr2, and rss.

## Relative Importance

The relaimpo package provides measures of relative importance for each of the predictors in the model. See help(calc.relimp) for details on the four measures of relative importance provided.
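
Two of the measures reported by calc.relimp( ), first and last, can be approximated by hand in base R, which may help in reading the package output. The following is a minimal sketch, not the relaimpo implementation: it assumes the built-in mtcars data with mpg as the response (the notes use mydata, which is not available here), and it reports raw R-squared contributions rather than the rescaled shares you get with rela=TRUE.

```r
# Illustrative data only: mtcars ships with R; the variable choice is an assumption
predictors <- c("wt", "hp", "disp")
full_fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
full_r2 <- summary(full_fit)$r.squared

# "first": R-squared of a model that contains only that predictor
first <- sapply(predictors, function(p) {
  summary(lm(reformulate(p, response = "mpg"), data = mtcars))$r.squared
})

# "last": gain in R-squared when the predictor is added after all the others
last <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  reduced <- lm(reformulate(others, response = "mpg"), data = mtcars)
  full_r2 - summary(reduced)$r.squared
})

round(rbind(first = first, last = last), 3)
```

A predictor that scores high on first but low on last explains much of the response on its own yet adds little once the other predictors are in the model, which is typical of correlated predictors.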

    # Calculate Relative Importance for Each Predictor
    library(relaimpo)
    calc.relimp(fit, type = c("lmg", "last", "first", "pratt"), rela = TRUE)

    # Bootstrap Measures of Relative Importance (1000 samples)
    boot <- boot.relimp(fit, b = 1000, type = c("lmg", "last", "first", "pratt"),
                        rank = TRUE, diff = TRUE, rela = TRUE)
    booteval.relimp(boot)  # print result
    plot(booteval.relimp(boot, sort = TRUE))  # plot result

## Graphic Enhancements

The car package offers a wide variety of plots for regression, including added variable plots and enhanced diagnostic and scatter plots.

## Going Further

### Nonlinear Regression

The nls( ) function in the stats package fits nonlinear regression models by least squares. See John Fox's "Nonlinear Regression and Nonlinear Least Squares" for an overview. Huet and colleagues' Statistical Tools for Nonlinear Regression: A Practical Guide with S-PLUS and R Examples is a valuable reference book.

### Robust Regression

There are many functions in R to aid with robust regression. For example, you can perform robust regression with the rlm( ) function in the MASS package. John Fox's (who else?) "Robust Regression" provides a good starting overview. The UCLA Statistical Computing website has Robust Regression Examples.
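
As a starting point for rlm( ), the sketch below fits the same model by ordinary least squares and by robust regression and compares the coefficients. It assumes the built-in stackloss data purely for illustration (it is not the data used in these notes) and keeps the rlm( ) defaults.

```r
library(MASS)  # rlm() lives here; MASS ships with standard R distributions

# Illustrative data only: stackloss is a built-in dataset
ols_fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
rob_fit <- rlm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)

# Side-by-side coefficients: large gaps suggest influential observations
round(cbind(OLS = coef(ols_fit), Robust = coef(rob_fit)), 3)

# Final weights from the iterative fit; the smallest ones mark the
# observations that were downweighted most
head(sort(rob_fit$w), 5)
```

Observations with weights well below 1 are the ones the robust fit treats as unusual, so they are worth inspecting before settling on either model.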