Chapter 11: Multicollinearity (STAT 563, Spring 2007)


Recap

• When the predictors are correlated or exhibit near-linear dependencies, we face the problem of multicollinearity.
• The primary sources of multicollinearity are:
  – The data collection method employed
  – Constraints on the model or in the population
  – Model specification
  – An overdefined model

Effects of multicollinearity

• Strong multicollinearity results in large variances and covariances for the least squares estimates.
• Recall that under unit-length scaling, the diagonal elements of C = (X'X)^{-1} can be written as

    C_{jj} = \frac{1}{1 - R_j^2}, \quad j = 1, 2, \ldots, p

  where R_j^2 is the coefficient of determination from the regression of x_j on the remaining p - 1 predictors.
• C_{jj} is also called the variance inflation factor, VIF_j.

VIF

• To see the role of the VIFs in estimation, write the squared distance from \hat{\beta} to the true parameter vector \beta as

    L_1^2 = (\hat{\beta} - \beta)'(\hat{\beta} - \beta)

• Taking the expected value,

    E(L_1^2) = E[(\hat{\beta} - \beta)'(\hat{\beta} - \beta)]
             = \sum_{j=1}^{p} E(\hat{\beta}_j - \beta_j)^2
             = \sum_{j=1}^{p} \mathrm{Var}(\hat{\beta}_j)
             = \sigma^2 \, \mathrm{Tr}[(X'X)^{-1}]
             = \sigma^2 \sum_{j=1}^{p} \mathrm{VIF}_j

Note

• E(L_1^2), which equals p\sigma^2 for an orthogonal system of predictors, is greatly magnified by large VIFs.
• Because the trace of a matrix is also the sum of its eigenvalues, we can write

    E(L_1^2) = \sigma^2 \sum_{j=1}^{p} \frac{1}{\lambda_j}

  where \lambda_j > 0, j = 1, 2, \ldots, p, are the eigenvalues of X'X.
• If multicollinearity exists, some \lambda_j will be small, making the expectation large.

Indicators of collinearity

• Rules of thumb:
  – Simple correlation r_{ij} > 0.95
  – Variance inflation factor VIF_j > 10

Body fat data

• Recall that the VIFs were (708.8, 564.3, 104.6) for the predictors SKIN, THIGH, and ARM.
• Equivalently, the R^2 values from regressing each predictor on the remaining two were (0.999, 0.998, 0.990).

More on eigenvalues

• Recall that if A is a p x p symmetric matrix with p eigenvalues \lambda_1, \lambda_2, \ldots, \lambda_p and p associated eigenvectors v_1, v_2, \ldots, v_p, then the following algebraic relationship holds:

    A v_j = \lambda_j v_j, \quad j = 1, 2, \ldots, p

• Suppose an eigenvalue of W'W (W being the unit-length-scaled predictor matrix), while not exactly zero, is very close to zero.
• Since all the elements of an eigenvector are at most 1.0 in magnitude, \lambda_j v_j \approx 0 for small eigenvalues.

Eigenvalues

• It now follows that W'W v_j = \lambda_j v_j \approx 0 for any eigenvector whose corresponding eigenvalue is nearly zero.
• Premultiplying by v_j' gives v_j' W'W v_j \approx 0.
• Now let U = W v_j, so that the above becomes U'U \approx 0 = \sum_r U_r^2.
• Since every term in the summation is non-negative, U'U \approx 0 implies U \approx 0.
• Thus if \lambda_j \approx 0 (near collinearity), then

    W v_j = \sum_{r=1}^{p} v_{rj} W_r \approx 0

  where W_r is the r-th column of W and v_{rj} is the r-th element of v_j.
• The implication is that the elements of the eigenvectors corresponding to small eigenvalues provide coefficients that define the multicollinearities among the predictor variables.
  – The larger elements of such an eigenvector indicate which variables are involved in the collinearity.

Body fat data

[Slide figure: the eigenvector corresponding to the smallest eigenvalue of W'W for the body fat data.]
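The unit-length-scaling identities above, C_jj = 1/(1 - R_j^2) and sum of VIFs = sum of reciprocal eigenvalues of W'W, can be checked numerically. Below is a minimal sketch in Python with NumPy, using synthetic data (not the course's body fat data) in which one predictor pair is deliberately made near-collinear; the variable names and the noise level 0.05 are illustrative choices, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)            # synthetic data for illustration
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)       # nearly collinear with x1
x3 = rng.normal(size=n)                   # unrelated predictor
X = np.column_stack([x1, x2, x3])

# Unit-length scaling: center each column, then scale to unit length,
# so that W'W is the correlation matrix of the predictors.
Wc = X - X.mean(axis=0)
W = Wc / np.sqrt((Wc ** 2).sum(axis=0))

C = np.linalg.inv(W.T @ W)
vif = np.diag(C)                          # C_jj = VIF_j = 1 / (1 - R_j^2)
r2 = 1 - 1 / vif                          # R_j^2 recovered from each VIF

# Direct check of R_1^2: regress w_1 on the other scaled columns
# (no intercept needed, since the columns are centered).
beta, *_ = np.linalg.lstsq(W[:, 1:], W[:, 0], rcond=None)
resid = W[:, 0] - W[:, 1:] @ beta
r2_direct = 1 - (resid ** 2).sum()        # SST = 1 because w_1 has unit length

# Trace identity: sum of VIFs equals sum of reciprocal eigenvalues of W'W.
lam = np.linalg.eigvalsh(W.T @ W)
print("VIFs:", vif)
print("sum VIF =", vif.sum(), " sum 1/lambda =", (1 / lam).sum())
```

With this construction the two near-duplicate predictors show VIFs well above the rule-of-thumb cutoff of 10, while the unrelated predictor stays near 1.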
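The eigenvector diagnostic described above, reading the near-linear dependency off the eigenvector of the smallest eigenvalue of W'W, can be sketched the same way. Here the dependency x3 ≈ x1 - x2 is built into synthetic data (an illustrative assumption, again not the body fat data), and the smallest eigenvector recovers it: all three of its entries are large, flagging all three predictors as involved.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 - x2 + 0.02 * rng.normal(size=n)   # built-in near-dependency: x3 ≈ x1 - x2
X = np.column_stack([x1, x2, x3])

# Unit-length scaling, as in the notes.
Wc = X - X.mean(axis=0)
W = Wc / np.sqrt((Wc ** 2).sum(axis=0))

# eigh returns eigenvalues in ascending order; columns of V are eigenvectors.
lam, V = np.linalg.eigh(W.T @ W)
v_small = V[:, 0]                          # eigenvector of the smallest eigenvalue

# W @ v_small ≈ 0: the entries of v_small are the coefficients of the
# near-linear dependency among the scaled predictors.  For a unit
# eigenvector, ||W v_j||^2 = lambda_j exactly.
print("smallest eigenvalue:", lam[0])
print("eigenvector:", v_small)
print("||W v||:", np.linalg.norm(W @ v_small))
```

Note that ||W v_j||^2 = v_j' W'W v_j = λ_j, so a tiny eigenvalue directly certifies that the corresponding linear combination of the scaled predictors is nearly zero.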

This note was uploaded on 03/08/2009 for the course 960 563 taught by Professor Unknown during the Spring '07 term at Rutgers.
