...
Non-null    False Neg. (FN)    True Pos. (TP)    P
Total       N*                 P*

Possible results when applying a classifier or diagnostic test to a population.

ESL Chapter 4: Linear Methods for Classification
Trevor Hastie and Rob Tibshirani

Name                Definition    Synonyms
False Pos. Rate     FP/N          Type I Error, 1−Specificity
True Pos. Rate      TP/P          1−Type II Error, Power, Sensitivity, Recall
Pos. Pred. Value    TP/P*         Precision, 1−False Discovery Proportion
Neg. Pred. Value    TN/N*

Important measures for classification and diagnostic testing.

Multiple Logistic Regression
Model is defined in terms of $J - 1$ logits $\eta_j(X) = \beta_j^T X$:

\[
\begin{aligned}
\log \frac{P(G = 1 \mid X)}{P(G = J \mid X)} &= \eta_1(X) \\
\log \frac{P(G = 2 \mid X)}{P(G = J \mid X)} &= \eta_2(X) \\
&\;\;\vdots \\
\log \frac{P(G = J - 1 \mid X)}{P(G = J \mid X)} &= \eta_{J-1}(X)
\end{aligned}
\]

\[
P(G = j \mid X) = \frac{e^{\eta_j(X)}}{1 + \sum_{\ell=1}^{J-1} e^{\eta_\ell(X)}}
\]

Fit by least squares or multinomial maximum likelihood.

Logistic Regression with p ≫ N

• Typically linear models are sufficient: $\operatorname{logit}(p_i) = \beta^T x_i$
• Models have to be regularized
  – ridge penalty: similar to SVM
    \[ \mathrm{PLL} = \sum_{i=1}^{N} \{ y_i \log p_i + (1 - y_i) \log(1 - p_i) \} - \lambda \|\beta\|^2 \]
  – lasso penalty: selects variables
    \[ \mathrm{PLL} = \sum_{i=1}^{N} \{ y_i \log p_i + (1 - y_i) \log(1 - p_i) \} - \lambda \sum_{j=1}^{p} |\beta_j| \]
• IRLS algorithm for ridge, and LARS-like algorithm for lasso

Glmnet software in R
Glmnet fits the GLM family of models by penalized maximum
likelihood. This includes (multiple) logistic regression. Glmnet
computes the entire “regularization path” for the “elastic net” penalty
family:
\[
\max_{\beta} \; \ell(\beta) - \lambda \left[ \tfrac{1}{2} (1 - \alpha) \|\beta\|_2^2 + \alpha \|\beta\|_1 \right]
\]
• The regularization path follows a complete grid of values for λ, with α fixed.
• α spans ridge (α = 0) to lasso (α = 1)
• For multiple logistic regression, the model is symmetric...
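
To make the elastic net criterion concrete, the following is a minimal Python sketch that evaluates the penalized binomial log-likelihood from these slides at a given β, assuming logit(p_i) = β^T x_i with no intercept. It is not glmnet's implementation (glmnet is R software and may scale the likelihood differently). The function names and the toy data are hypothetical, chosen only for illustration.

```python
import math


def log_likelihood(beta, X, y):
    """Binomial log-likelihood with logit(p_i) = beta^T x_i (no intercept).

    Uses y_i * eta_i - log(1 + exp(eta_i)), which equals
    y_i log p_i + (1 - y_i) log(1 - p_i) but avoids log(0).
    """
    total = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        # Stable evaluation of log(1 + exp(eta)).
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += yi * eta - softplus
    return total


def elastic_net_penalty(beta, alpha):
    """(1/2)(1 - alpha) * ||beta||_2^2 + alpha * ||beta||_1."""
    l2_squared = sum(b * b for b in beta)
    l1 = sum(abs(b) for b in beta)
    return 0.5 * (1.0 - alpha) * l2_squared + alpha * l1


def penalized_objective(beta, X, y, lam, alpha):
    """Criterion to maximize over beta: l(beta) - lam * penalty(beta)."""
    return log_likelihood(beta, X, y) - lam * elastic_net_penalty(beta, alpha)


# Hypothetical toy data: four observations, two predictors.
X = [[1.0, 0.5], [0.2, -1.0], [-0.7, 0.3], [1.5, 1.1]]
y = [1, 0, 0, 1]
beta = [0.8, -0.4]
lam = 0.5

# alpha = 0 gives a pure ridge penalty, alpha = 1 a pure lasso penalty.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, penalized_objective(beta, X, y, lam, alpha))
```

A path algorithm would maximize this criterion over β for each λ on a grid while holding α fixed. The sketch only evaluates the criterion and does no optimization.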
This document was uploaded on 03/10/2014 for the course STATS 315A at Stanford.
Spring '10
TIBSHIRANI,R
Statistics, Linear Regression