Algebraically this is expressed as dj k signdjk djk

Info iconThis preview shows page 1. Sign up to view the full content.

View Full Document Right Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: k for gene ¯ j ; the j th component of the overall centroid is xj = n xij /n. ¯ i=1 21 ESL Chapter 4 — Linear Methods for Classification Trevor Hastie and Rob Tibshirani • Let djk = (¯jk − xj )/sj , x ¯ (1) where sj is the pooled within class standard deviation for gene j : s2 = j 1 n−K (xij − xjk )2 . ¯ (2) k i∈Ck • Shrink each djk towards zero, giving dj k and new shrunken centroids or prototypes xj k = xj + sj dj k ¯ ¯ (3) 22 ESL Chapter 4 — Linear Methods for Classification Trevor Hastie and Rob Tibshirani ∆ (0,0) • The shrinkage is soft-thresholding: each djk is reduced by an amount ∆ in absolute value, and is set to zero if its absolute value is less than zero. Algebraically, this is expressed as dj k = sign(djk )(|djk | − ∆)+ (4) where + means positive part (t+ = t if t > 0, and zero otherwise). • Choose ∆ by cross-validation. 23 ESL Chapter 4 — Linear Methods for Classification Trevor Hastie and Rob Tibshirani Connection to Lasso Exercise 18.12 in ESL: Consider a (naive Bayes) Gaussian model for classification in which the features j = 1, 2, . . . , p are assumed to be independent within each class k = 1, 2, . . . , K . With observations i = 1, 2, . . . , N and Ck equal to the set of indices of the Nk observations in class k , we observe 2 xij ∼ N (µj + µjk , σj ) for i ∈ Ck with K µjk = 0. k=1 Set σj = s2 , the pooled within-class variance for feature j , and consider ˆ2 j the lasso-style minimization problem p K 1 p K 2 (xij − µj − µjk ) µjk | + λ Nk . | min 2 2 sj sj {µj ,µjk } j =1 j =1 k=1 i∈Ck k=1 The solution is equivalent to the nearest shrunken centroid classifier. 24 ESL Chapter 4 — Linear Methods for Classification Trevor Hastie and Rob Tibshirani Advantages • Simple, includes nearest centroid classifier as a special case. • Thresholding denoises large effects, and sets small ones to zerothereby selecting genes • with more than two classes, method can select different g...
View Full Document

This document was uploaded on 03/10/2014 for the course STATS 315A at Stanford.

Ask a homework question - tutors are online