---
title: 'STAT 420: Homework 10'
author: "Pengyu Chen, NetID:pengyu2"
output:
  html_document:
    theme: readable
    toc: yes
---

```{r setup, echo = FALSE, message = FALSE, warning = FALSE}
options(scipen = 1, digits = 4, width = 80)
```

# Assignment

## Exercise 1 (`longley` Macroeconomic data)

The data set `longley` from the `faraway` package contains macroeconomic data for predicting employment.

```{r}
library(faraway)
```

```{r, eval = FALSE}
View(longley)
?longley
```

**(a)** Find the correlation between each of the variables in the dataset.

```{r}
round(cor(longley), 2)
```

**(b)** Fit a model with `Employed` as the response and the remaining variables as predictors. Calculate the variance inflation factor for each of the predictors. What is the largest VIF? Do any of the VIFs suggest multicollinearity?

```{r}
emp_model = lm(Employed ~ ., data = longley)
vif(emp_model)
```

- The largest VIF is 1788.51348, for `GNP`.
- The VIF of every predictor except `Armed.Forces` is greater than 5, which suggests severe multicollinearity.

**(c)** What proportion of observed variation in `Population` is explained by a linear relationship with the other predictors?

```{r}
pop_model_small = lm(Population ~ . - Employed, data = longley)
summary(pop_model_small)$r.squared
```

- 99.75% of the variation in `Population` is explained by a linear relationship with the other predictors.

**(d)** Calculate the partial correlation coefficient for `Population` and `Employed` **with the effects of the other predictors removed**.

```{r}
emp_model_small = lm(Employed ~ . - Population, data = longley)
cor(resid(pop_model_small), resid(emp_model_small))
```

**(e)** Fit a new model with `Employed` as the response and the predictors from the model in **(b)** which were significant. (Use $\alpha = 0.05$.) Calculate the variance inflation factor for each of the predictors. What is the largest VIF? Do any of the VIFs suggest multicollinearity?

```{r}
summary(emp_model)
```

We notice that `Unemployed`, `Armed.Forces`, and `Year` are significant in the full model. Use them as the predictors in the new model.
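
The refit described above can be sketched as follows. This is a minimal illustration, not necessarily the code used in the original submission: the names `emp_model_sig` and `manual_vif` are hypothetical, and the VIFs are computed by hand from the definition $\text{VIF}_j = 1 / (1 - R_j^2)$, where $R_j^2$ is the $R^2$ from regressing predictor $j$ on the other predictors in the model. The `longley` data also ships with base R's `datasets` package, so the chunk runs without `faraway`.

```{r}
# Refit using only the predictors that were significant in the full model.
# `emp_model_sig` is an illustrative name, not taken from the assignment.
emp_model_sig = lm(Employed ~ Unemployed + Armed.Forces + Year, data = longley)

# Hypothetical helper: compute each VIF as 1 / (1 - R_j^2), where R_j^2 comes
# from regressing predictor j on the remaining predictors in the model.
manual_vif = function(model) {
  X = model.matrix(model)[, -1, drop = FALSE]  # drop the intercept column
  sapply(colnames(X), function(j) {
    others = X[, colnames(X) != j, drop = FALSE]
    r2 = summary(lm(X[, j] ~ others))$r.squared
    1 / (1 - r2)
  })
}

manual_vif(emp_model_sig)
```

The resulting values can be compared against the same threshold of 5 used in **(b)**. With `faraway` loaded, `vif(emp_model_sig)` should report the same quantities.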

