When $X^\top X = I$ (i.e., $X$ is orthonormal) and $n > p$, the ridge estimator has the closed form
$$ \hat\beta_\lambda = \frac{\hat\beta}{1+\lambda}, $$
where $\hat\beta$ denotes the OLS estimate. This form illustrates the essential feature of ridge regression: shrinkage towards zero. The ridge penalty introduces bias but reduces the variance of the estimate (the bias-variance tradeoff). As $\lambda \to 0$, $\hat\beta_\lambda \to \hat\beta$, whereas as $\lambda \to \infty$, $\hat\beta_\lambda \to 0$.

Theorem. If the $\epsilon_i$'s are i.i.d. with zero mean and variance $\operatorname{Var}(\epsilon_i) = \sigma^2$, then the bias of the ridge regression estimator is
$$ \operatorname{Bias}(\hat\beta_\lambda) = E[\hat\beta_\lambda] - \beta = -\lambda R_\lambda \beta $$
and the variance-covariance matrix is
$$ \operatorname{Var}(\hat\beta_\lambda) = E\big[(\hat\beta_\lambda - E[\hat\beta_\lambda])(\hat\beta_\lambda - E[\hat\beta_\lambda])^\top\big] = \sigma^2 R_\lambda (X^\top X) R_\lambda^\top, $$
where $R_\lambda = (X^\top X + \lambda I)^{-1}$.

Recall that mean squared error (MSE) = variance + (bias)$^2$, and in multiparameter problems
$$ \operatorname{MSE}(\hat\beta_\lambda) = E\big[(\hat\beta_\lambda - \beta)(\hat\beta_\lambda - \beta)^\top\big] = \operatorname{Var}(\hat\beta_\lambda) + \operatorname{Bias}(\hat\beta_\lambda)\,\big[\operatorname{Bias}(\hat\beta_\lambda)\big]^\top, $$
which also implies for the "total MSE":
$$ \sum_{i=1}^{p} \operatorname{MSE}(\hat\beta_{\lambda,i}) \;=\; \underbrace{\sum_{i=1}^{p} \operatorname{Var}(\hat\beta_{\lambda,i})}_{\downarrow\ \text{as}\ \lambda\uparrow} \;+\; \underbrace{\sum_{i=1}^{p} \big[\operatorname{Bias}(\hat\beta_{\lambda,i})\big]^2}_{\uparrow\ \text{as}\ \lambda\uparrow}. $$
There always exists a $\lambda$ such that the total MSE of $\hat\beta_\lambda$ is smaller than the total MSE of the least squares estimate $\hat\beta$.

Q. How to choose the shrinkage parameter $\lambda$?
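Before turning to that question, here is a small numerical sketch (not part of the original notes) that plugs the bias and variance formulas above into a simulated design and compares the total MSE of the ridge estimator with that of least squares over a grid of $\lambda$ values. The dimensions, noise level, and $\lambda$ grid are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (arbitrary) setup: n observations, p correlated predictors, true beta
n, p, sigma2 = 50, 10, 1.0
Sigma = 0.2 * np.eye(p) + 0.8 * np.ones((p, p))        # equicorrelated design covariance
X = rng.normal(size=(n, p)) @ np.linalg.cholesky(Sigma).T
beta = rng.normal(size=p)

XtX = X.T @ X
I_p = np.eye(p)

def total_mse(lmbda):
    """Total MSE of the ridge estimator: trace of Var plus sum of squared biases."""
    R = np.linalg.inv(XtX + lmbda * I_p)   # R_lambda = (X'X + lambda I)^{-1}
    bias = -lmbda * R @ beta               # Bias(beta_hat_lambda) = -lambda R_lambda beta
    var = sigma2 * R @ XtX @ R.T           # Var(beta_hat_lambda) = sigma^2 R_lambda X'X R_lambda'
    return np.trace(var) + bias @ bias

lambdas = np.logspace(-3, 2, 60)
mse_ridge = np.array([total_mse(lam) for lam in lambdas])
mse_ols = total_mse(0.0)                   # lambda = 0 recovers the least squares estimator

print(f"total MSE of least squares: {mse_ols:.4f}")
print(f"smallest ridge total MSE  : {mse_ridge.min():.4f} "
      f"at lambda = {lambdas[mse_ridge.argmin()]:.3f}")
# For a suitable lambda the ridge total MSE falls below the least squares total MSE,
# illustrating the existence claim above.
```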
Hoerl and Kennard (1970) suggested using ridge traces:

- Plot the components of $\hat\beta_\lambda$ against $\lambda$, or rather against DF($\lambda$), since the two are in one-to-one correspondence.
- Choose a $\lambda$ for which the coefficients are not changing rapidly and have sensible signs.

This is a heuristic approach (requiring extensive computation over different choices of $\lambda$) and can be criticized from many perspectives. Nowadays, the more standard approach is to use cross-validation (CV) or information-theoretic criteria such as the Bayesian Information Criterion (BIC).

[Figure: Ridge regression coefficient paths — coefficients for age, sex, bmi, map, tc, ldl, hdl, tch, ltg, glu plotted against DF.]
[Figure: Regression with a straight line vs. with a piecewise linear function.]

2.9 LASSO (Least Absolute Shrinkage and Selection Operator)

Another way to deal with variable selection is to use regularization (or penalization), i.e., to regularize the regression coefficients. The LASSO (Tibshirani, 1996) has been a popular technique for simultaneous linear regression estimation and variable selection. Specifically, we define $\hat\beta$ to minimize the penalized sum of squares
$$ \| Y - X\beta \|^2 + \lambda \| \beta \|_1 , $$
or, equivalently,
$$ \min_{\beta} \sum_{i=1}^{n} \big( Y_i - X_i^\top \beta \big)^2 \quad \text{subject to} \quad \|\beta\|_1 \le s . $$
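As a rough sketch of how this penalized criterion maps onto common software (not part of the original notes): scikit-learn's `Lasso` minimizes $(1/(2n))\|Y - X\beta\|^2 + \alpha \|\beta\|_1$, so its $\alpha$ plays the role of $\lambda/(2n)$ in the notation above. The simulated data below are an arbitrary illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

# Simulated data (arbitrary): only the first 3 of 10 true coefficients are nonzero
n, p = 100, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.normal(size=n)

# scikit-learn's Lasso objective is (1/(2n))||y - X beta||^2 + alpha * ||beta||_1,
# i.e. alpha corresponds to lambda/(2n) in the penalized form above.
lasso = Lasso(alpha=0.1, fit_intercept=False).fit(X, y)
print(np.round(lasso.coef_, 3))   # several coefficients come out exactly zero
```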
As before (in ridge regression), the shrinkage (penalty) parameter $\lambda$ offers a tradeoff between the regularizing constraint and the minimization of the sum of squared residuals: the bigger the $\lambda$, the greater the amount of shrinkage. With the LASSO, however, some of the estimated coefficients are shrunk all the way to exactly zero. Hence another important role of $\lambda$ is automated variable selection.
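A brief sketch (not from the notes) of this shrinkage-to-zero behaviour: fitting the LASSO over an increasing grid of penalties shows variables dropping out one by one, and the penalty can then be chosen by cross-validation, in the spirit of the CV approach mentioned for ridge regression. The data, the penalty grid, and the number of folds are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

rng = np.random.default_rng(2)

# Arbitrary simulated data: 4 relevant predictors out of 12
n, p = 200, 12
X = rng.normal(size=(n, p))
beta_true = np.concatenate([[2.0, -1.5, 1.0, 0.5], np.zeros(p - 4)])
y = X @ beta_true + rng.normal(size=n)

# Trace the number of nonzero coefficients as the penalty grows
for alpha in [0.01, 0.05, 0.1, 0.5, 1.0]:
    coef = Lasso(alpha=alpha, fit_intercept=False).fit(X, y).coef_
    print(f"alpha = {alpha:>4}: {np.sum(coef != 0)} nonzero coefficients")

# One common way to pick the penalty: K-fold cross-validation
cv_fit = LassoCV(cv=5, fit_intercept=False).fit(X, y)
print("alpha chosen by 5-fold CV:", round(cv_fit.alpha_, 4))
print("selected variables:", np.flatnonzero(cv_fit.coef_))
```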