


enes, and different numbers of genes for each class.

ESL Chapter 4 — Linear Methods for Classification
Trevor Hastie and Rob Tibshirani

Class probabilities

• For a test sample $x^* = (x_1^*, x_2^*, \ldots, x_p^*)$, we define the discriminant score for class $k$

$$\delta_k(x^*) = \sum_{j=1}^{p} \frac{(x_j^* - \bar{x}_{jk})^2}{s_j^2} - 2 \log \pi_k \qquad (5)$$

• The classification rule is then

$$C(x^*) = \ell \quad \text{if} \quad \delta_\ell(x^*) = \min_k \delta_k(x^*) \qquad (6)$$

• Estimates of the class probabilities, by analogy to Gaussian linear discriminant analysis, are

$$\hat{p}_k(x^*) = \frac{e^{-\frac{1}{2}\delta_k(x^*)}}{\sum_{\ell=1}^{K} e^{-\frac{1}{2}\delta_\ell(x^*)}} \qquad (7)$$

Results on Khan data

At optimal point, there are 43 active genes

[Figure: Error (0.0 to 0.8) against Amount of Shrinkage Delta (0 to 6), with curves labeled tr, cv, and te. A second axis, Size, has ticks 1, 5, 8, 10, 15, 22, 34, 52, 81, 133, 206, 339, 598, 1020, 1668, 2188, 2308.]

[Figure, two panels: Training Data (Sample 0 to 60) and Test Data (Sample 5 to 25). Each panel plots Probability (0.0 to 1.0) against Sample, with class labels BL, EWS, NB, RMS.]

The genes that matter

[Figure: one panel per class (BL, EWS, NB, RMS), with genes labeled by ID.]

Gene IDs: 813841 859359 207274 296448 898219 784224 796258 244618 789253 298062 461425 1409509 42558 769716 25725 44563 325182 812105 41591 810057 52076 866702 814260 43733 357031 1435862 770394 377461 1473131 295985 241412 80109 183337 233721 897788 563673...
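The discriminant score (5), classification rule (6), and class probability estimates (7) from the "Class probabilities" slide can be traced through a small numerical example. The following is a minimal sketch in plain Python, not the authors' code: the function names, the two-class toy centroids, the per-gene scales `s`, and the priors are all hypothetical values chosen for illustration, and the centroids are simply taken as given.

```python
import math


def discriminant_scores(x, centroids, s, priors):
    """Eq. (5): delta_k(x) = sum_j (x_j - xbar_jk)^2 / s_j^2 - 2 log(pi_k)."""
    scores = []
    for centroid, prior in zip(centroids, priors):
        # Standardized squared distance from x to the class-k centroid.
        dist = sum((xj - cj) ** 2 / sj ** 2 for xj, cj, sj in zip(x, centroid, s))
        scores.append(dist - 2.0 * math.log(prior))
    return scores


def classify(scores):
    """Eq. (6): return the index of the class with the smallest score."""
    return min(range(len(scores)), key=scores.__getitem__)


def class_probabilities(scores):
    """Eq. (7): p_k = exp(-delta_k / 2) / sum_l exp(-delta_l / 2)."""
    # Subtracting the minimum score avoids underflow; it cancels in the ratio.
    m = min(scores)
    weights = [math.exp(-0.5 * (d - m)) for d in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy inputs: 2 classes, 2 genes (not the Khan data).
centroids = [[0.0, 0.0], [2.0, 2.0]]  # one centroid per class
s = [1.0, 1.0]                        # per-gene standard deviations s_j
priors = [0.5, 0.5]                   # class priors pi_k
x_star = [0.0, 0.0]                   # test sample x*

scores = discriminant_scores(x_star, centroids, s, priors)
label = classify(scores)
probs = class_probabilities(scores)
print(label, probs)
```

With equal priors the `- 2 log(pi_k)` term is the same for every class, so the rule reduces to picking the nearest centroid in standardized distance; unequal priors shift the scores in favor of the more common class.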

This document was uploaded on 03/10/2014 for the course STATS 315A at Stanford.
