SLDM III — © Hastie & Tibshirani — February 23, 2009

Lasso and LARS algorithm
Least Angle Regression, Forward Stagewise and the Lasso

Background
• The topic in this section is linear regression
• But the motivation comes from the area of flexible function fitting: "Boosting" — Freund & Schapire (1995)

Least Squares Boosting
Friedman, Hastie & Tibshirani — see Elements of Statistical Learning (chapter 10)
Supervised learning: response y, predictors x = (x1, x2, ..., xp).
1. Start with function F(x) = 0 and residual r = y
2. Fit a CART regression tree to r, giving f(x)
3. Set F(x) ← F(x) + f(x), r ← r − f(x) and repeat step 2 many times

[Figure: prediction error of least squares boosting against number of steps, for step sizes ε = 1 and ε = .01, compared with a single tree.]

Linear Regression
Here is a version of least squares boosting for multiple linear regression (assume the predictors are standardized):

(Incremental) Forward Stagewise
1. Start with r = y and β1, β2, ..., βp = 0.
2. Find the predictor xj most correlated with r.
3. Update βj ← βj + δj, where δj = ε · sign(⟨r, xj⟩).
4. Set r ← r − δj · xj and repeat steps 2 and 3 many times.

Taking δj = ⟨r, xj⟩ gives the usual forward stagewise; different from forward stepwise.
Analogous to least squares boosting, with trees = predictors.

Prostate Cancer Data
[Figure: coefficient paths for the Lasso (against t = Σj |βj|) and for incremental Forward Stagewise (against iteration number) on the eight predictors lcavol, lweight, svi, pgg45, lbph, gleason, age, lcp; the two sets of paths are nearly identical.]

Linear regression via the Lasso (Tibshirani, 1995)
• Assume ȳ = 0, x̄j = 0, Var(xj) = 1 for all j.
• Minimize Σi (yi − Σj xij βj)² subject to Σj |βj| ≤ s.
• With orthogonal predictors, the solutions are soft-thresholded versions of the least squares coefficients: sign(β̂j)(|β̂j| − γ)+ (γ is a function of s).
• For small values of the bound s, the Lasso does variable selection. See pictures.

More on the Lasso
• Implementations use quadratic programming to compute solutions.
• Can be applied when p > n. In that case the number of non-zero coefficients is at most n − 1 (by convex duality).
• Interesting consequences for applications, e.g. microarray data.

Diabetes Data
[Figure: coefficient paths β̂j for the Lasso and for Stagewise on the diabetes data (10 predictors), plotted against t = Σ|β̂j|; the two sets of paths are almost indistinguishable.]

Why are Forward Stagewise and Lasso so similar?
• Are they identical?
• In the orthogonal predictor case: yes.
• In the hard-to-verify case of monotone coefficient paths: yes.
• In general, almost!
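The incremental forward stagewise steps above fit in a few lines of code. This is a hedged sketch in Python (the authors' own software is in Splus/R); the two-predictor data set is a made-up standardized example, not from the slides:

```python
# Incremental forward stagewise for linear regression (steps 1-4 above).
# Pure-Python sketch; predictor columns are assumed standardized.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def forward_stagewise(X, y, eps=0.01, n_steps=2000):
    """X: list of predictor columns (each a list); y: response list."""
    p = len(X)
    beta = [0.0] * p
    r = list(y)                          # step 1: r = y, all beta_j = 0
    for _ in range(n_steps):
        # step 2: predictor most correlated with the current residual
        j = max(range(p), key=lambda k: abs(dot(r, X[k])))
        c = dot(r, X[j])
        delta = eps if c > 0 else -eps   # step 3: delta_j = eps * sign(<r, x_j>)
        beta[j] += delta
        # step 4: remove the small fitted piece from the residual
        r = [ri - delta * xij for ri, xij in zip(r, X[j])]
    return beta

# Hypothetical noise-free example with y = 2*x1 - 1*x2; the tiny steps
# creep toward the least squares coefficients (2, -1).
x1 = [1.0, -1.0, 1.0, -1.0]
x2 = [1.0, 1.0, -1.0, -1.0]
y = [2 * a - b for a, b in zip(x1, x2)]
beta = forward_stagewise([x1, x2], y, eps=0.01, n_steps=2000)
```

With ε = 0.01 the coefficients end up within a step size of the least squares fit; shrinking ε traces the smooth paths shown in the prostate cancer figure.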
• Least angle regression (LAR) provides answers to these questions, and an efficient way to compute the complete Lasso sequence of solutions.

Least Angle Regression — LAR
Like a "more democratic" version of forward stepwise regression.
1. Start with r = y and β̂1, β̂2, ..., β̂p = 0. Assume the xj are standardized.
2. Find the predictor xj most correlated with r.
3. Increase βj in the direction of sign(corr(r, xj)) until some other competitor xk has as much correlation with the current residual as does xj.
4. Move (β̂j, β̂k) in the joint least squares direction for (xj, xk) until some other competitor xℓ has as much correlation with the current residual.
5. Continue in this way until all predictors have been entered. Stop when corr(r, xj) = 0 for all j, i.e. at the OLS solution.

[Figure: two-predictor geometry. Starting from μ̂0, the fit first moves toward x1 to reach μ̂1; the LAR direction u2 at step 2 makes an equal angle with x1 and x2.]
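The "equal angle" in the figure can be checked numerically: once a second predictor ties the first in correlation, moving along the joint least squares direction keeps the two correlations with the residual tied. A small hedged illustration (hypothetical two-sample, two-predictor data, chosen only so the arithmetic is transparent):

```python
# Geometry of a LAR step: after x1 enters and x2 catches up in
# correlation, the joint least squares direction keeps <r, x1> = <r, x2>.
# Illustrative made-up example, not from the slides.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x1, x2 = [1.0, 0.0], [0.6, 0.8]          # unit-norm predictors
y = [3.0, 1.0]

# x1 is more correlated with y, so it enters first (steps 2-3 above).
# Move mu = gamma * x1 until x2 ties: <y - gamma*x1, x1> = <y - gamma*x1, x2>.
gamma = (dot(y, x1) - dot(y, x2)) / (dot(x1, x1) - dot(x1, x2))
mu1 = [gamma * a for a in x1]

# Step 4: with two predictors spanning R^2, the joint least squares fit
# is y itself, so the joint LS direction is d = y - mu1.  Along it the
# two correlations with the residual stay equal (the "equal angle").
d = [yi - mi for yi, mi in zip(y, mu1)]
for t in (0.0, 0.3, 0.7):
    r = [yi - (mi + t * di) for yi, mi, di in zip(y, mu1, d)]
    assert abs(dot(r, x1) - dot(r, x2)) < 1e-12
```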
[Figure: LARS on the diabetes data. Left panel: coefficient paths β̂j against Σ|β̂j|. Right panel: absolute current correlations |ĉkj| against step k; the maximal correlation Ĉk declines steadily, and each variable joins the tied set as it enters.]

[Figure: coefficient paths for the Lasso versus LAR on the diabetes data, plotted against t = Σ|β̂j|; the paths agree except where a coefficient path crosses zero.]

[Figure: coefficient paths for the Lasso versus Stagewise on the diabetes data, plotted against t = Σ|β̂j|; again the paths are nearly identical.]

Relationship between the 3 algorithms
• Lasso and forward stagewise can be thought of as restricted versions of LAR.
• For the Lasso: start with LAR. If a coefficient crosses zero, stop; drop that predictor, recompute the best direction and continue. This gives the Lasso path.
Proof (lengthy): use the Karush-Kuhn-Tucker theory of convex optimization. Informally,

  ∂/∂βj { ||y − Xβ||² + λ Σj |βj| } = 0  ⇔  ⟨xj, r⟩ = (λ/2) · sign(β̂j)  if β̂j ≠ 0 (active)

• For forward stagewise: start with LAR. Compute the best (equal angular) direction at each stage. If the direction for any predictor j doesn't agree in sign with corr(r, xj), project the direction into the "positive cone" and use the projected direction instead.
• In other words, forward stagewise always moves each predictor in the direction of corr(r, xj).
• The incremental forward stagewise procedure approximates these steps, one predictor at a time.
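The informal KKT condition for the Lasso can be verified numerically: at a solution, every active predictor's correlation with the residual equals λ/2 in absolute value, and every inactive predictor's is no larger. A hedged sketch using plain coordinate descent (a standard lasso solver, not the LARS algorithm of the slides) on made-up data:

```python
# Numerical check of the KKT condition quoted above:
#   <x_j, r> = (lambda/2) * sign(beta_j)  for active j,
#   |<x_j, r>| <= lambda/2                for inactive j.
# Solver: plain coordinate descent; the small data set is synthetic.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def soft(z, t):
    """Soft-thresholding operator sign(z) * (|z| - t)_+."""
    return (abs(z) - t) * (1 if z > 0 else -1) if abs(z) > t else 0.0

def lasso_cd(X, y, lam, sweeps=200):
    """Minimize ||y - X beta||^2 + lam * sum_j |beta_j| by coordinate descent."""
    p = len(X)
    beta = [0.0] * p
    r = list(y)
    for _ in range(sweeps):
        for j in range(p):
            # correlation of x_j with the partial residual (own term added back)
            rho = dot(X[j], r) + beta[j] * dot(X[j], X[j])
            new = soft(rho, lam / 2.0) / dot(X[j], X[j])
            r = [ri - (new - beta[j]) * xij for ri, xij in zip(r, X[j])]
            beta[j] = new
    return beta, r

X = [[1, 1, 1, -1, -1, -1], [1, -1, 1, -1, 1, -1], [1, 1, -1, -1, 1, -1]]
y = [3.5, 2.5, 3.5, -3.5, -2.5, -3.5]      # = 3*x1 + 0.5*x2
lam = 2.0
beta, r = lasso_cd(X, y, lam)
for j in range(3):
    g = dot(X[j], r)
    if abs(beta[j]) > 1e-8:                 # active: condition binds exactly
        assert abs(g - (lam / 2) * (1 if beta[j] > 0 else -1)) < 1e-6
    else:                                    # inactive: subgradient condition
        assert abs(g) <= lam / 2 + 1e-6
```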
As the step size ε → 0, one can show that it coincides with this modified version of LAR.

Summary
• LARS — uses least squares directions in the active set of variables.
• Lasso — uses least squares directions; if a variable crosses zero, it is removed from the active set.
• Forward stagewise — uses non-negative least squares directions in the active set.

Benefits
• Possible explanation of the benefit of "slow learning" in boosting: it is approximately fitting via an L1 (lasso) penalty.
• The new algorithm computes the entire Lasso path in the same order of computation as one full least squares fit. Splus/R software on Hastie's website: www-stat.stanford.edu/∼hastie/Papers#LARS
• Degrees of freedom formula for LAR: after k steps, the degrees of freedom of the fit equals k (with some regularity conditions).
• For the Lasso, the procedure often takes more than p steps, since predictors can drop out. Corresponding formula (conjecture): the degrees of freedom for the last model in the sequence with k predictors equals k.

Degrees of freedom
[Figure: degrees-of-freedom estimates against step k in two examples (k up to 10, and k up to 60); in both panels the estimates track the 45° line df ≈ k.]

Degrees of freedom result

  df(μ̂) ≡ Σᵢ₌₁ⁿ cov(μ̂ᵢ, yᵢ)/σ² = k

The proof is an application of Stein's unbiased risk estimate (SURE). Suppose that g : Rⁿ → Rⁿ is almost differentiable and set ∇·g = Σᵢ₌₁ⁿ ∂gᵢ/∂xᵢ. If y ∼ Nₙ(μ, σ²I), then Stein's formula states that

  Σᵢ₌₁ⁿ cov(gᵢ, yᵢ)/σ² = E[∇·g(y)].

The LHS is the degrees of freedom. Set g(·) equal to the LAR estimate. In the orthogonal case, ∂gᵢ/∂xᵢ is 1 if the predictor is in the model, 0 otherwise.
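The covariance identity df(μ̂) = Σᵢ cov(μ̂ᵢ, yᵢ)/σ² can be illustrated by Monte Carlo. A hedged sketch for the easy case discussed above: μ̂ is the least squares projection onto k = 2 orthonormal predictors, where df = k holds exactly; the design and mean vector are made up for illustration:

```python
# Monte Carlo check of df = sum_i cov(mu_hat_i, y_i) / sigma^2 = k
# for projection onto k = 2 orthonormal columns (sigma = 1).
import random

random.seed(0)
x1 = [0.5, 0.5, 0.5, 0.5]            # orthonormal predictor columns
x2 = [0.5, -0.5, 0.5, -0.5]
mu = [1.0, 2.0, 0.0, -1.0]           # arbitrary true mean vector
N = 20000                            # number of simulated responses

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

ys, fits = [], []
for _ in range(N):
    y = [m + random.gauss(0.0, 1.0) for m in mu]
    c1, c2 = dot(y, x1), dot(y, x2)
    fits.append([c1 * a + c2 * b for a, b in zip(x1, x2)])  # projection
    ys.append(y)

# estimate sum_i cov(mu_hat_i, y_i) from the simulations
df = 0.0
for i in range(4):
    g_mean = sum(f[i] for f in fits) / N
    e_mean = sum(y[i] - mu[i] for y in ys) / N
    df += sum(f[i] * (y[i] - mu[i]) for f, y in zip(fits, ys)) / N \
          - g_mean * e_mean
# df should come out close to k = 2
```

The same simulation scheme, with g replaced by the LAR estimate after k steps, is how the df ≈ k figure above was estimated.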
Hence the RHS equals the number of predictors in the model (= k). The non-orthogonal case is much harder.

Software for R and Splus
The lars() function fits all three models: lasso, lar or forward.stagewise. Methods for prediction, plotting, and cross-validation; detailed documentation provided. Visit www-stat.stanford.edu/∼hastie/Papers/#LARS
The main computations involve least squares fitting using the active set of variables. Computations are managed by updating the Cholesky R matrix (with frequent downdating for lasso and forward stagewise).
The glmpath package (with Ph.D. student MeeYoung Park) fits L1-penalized GLM and Cox model paths.

MicroArray Example
• Expression data for 38 Leukemia patients ("Golub" data).
• X matrix with 38 samples and 7129 variables (genes).
• Response Y is dichotomous: ALL (27) vs AML (11).
• LARS (lasso) took 4 seconds in R version 1.7 on a 1.8 GHz Dell workstation running Linux.
• In 70 steps, 52 variables were ever non-zero, at most 37 at a time.
[Figure: LASSO coefficient paths for the Leukemia data, standardized coefficients against ||β̂λ||₁/||β̂LS||₁; a handful of genes (e.g. 2267, 461, 2968, 6801, 2945) dominate.]

[Figure: the same LASSO paths, annotated with the step numbers (0, 1, 3, 4, 5, 8, 12, 15, 20, 26, 31, 38, 50, 65) at which variables enter.]

[Figure: 10-fold cross-validation for the Leukemia expression data (Lasso): CV error against ||β̂λ||₁/||β̂LS||₁.]

[Figure: LAR coefficient paths for the Leukemia data, with prominent genes 2267, 1882, 5039, 2534, 1241, 1817, 4328, 6895; a companion panel marks the steps (0, 2, 3, 4, 8, 13, 20, 25, 29, 32, 35, 36, 37) at which variables enter.]

[Figure: 10-fold cross-validation for the Leukemia expression data (LAR).]

[Figure: Forward Stagewise coefficient paths for the Leukemia data, with prominent genes 2267, 1882, 1834, 461, 4535, 4399, 87, 2774; a companion panel marks the steps (0, 1, 3, 4, 5, 8, 12, 15, 19, 29, 43, 54, 82, 121).]

[Figure: 10-fold cross-validation for the Leukemia expression data (Stagewise).]

Summary
• The lasso and associated methods are potentially useful for wide (p > N) data.
• In a series of papers in 2004, Donoho (Stanford) shows that the lasso gives a good approximation to the L0 solution (best subsets) if the true coefficient vector is sufficiently sparse.
• However, this raises the question: how good is either the L0 or the lasso solution when p ≫ N? Our experience is mixed.
• This is a current topic of interest.

Software
• lars package available for R and S-PLUS.
• GLMSELECT in SAS.