
STA 414/2104 Mar 16, 2010

Notes

- No class on Thursday, Mar 18
- Takehome MT due Mar 25
- Paper "Kernels and Ensembles" by M. Zhu is posted under March 9
- "Quick R": http://www.statmethods.net/index.html
- CRAN Task Views: http://cran.r-project.org/web/views/MachineLearning.html

Classification and Regression Trees

South African heart data

[Figure: plot of the fitted classification tree. Split labels: age< 50.5, age< 30.5, typea< 68.5, famhist=a, tobacco< 7.605, ldl< 4.99, adiposity>=27.98, tobacco< 4.15]

... heart data

> library(rpart)          # provides rpart(), printcp(), post()
> library(ElemStatLearn)  # assumption: SAheart comes from this package; not shown on the original slide
> data(SAheart)
> names(SAheart)
 [1] "sbp"       "tobacco"   "ldl"       "adiposity" "famhist"
 [6] "typea"     "obesity"   "alcohol"   "age"       "chd"
> (heartree = rpart(chd ~ ., data = SAheart, method = "class"))
## output follows ##
n= 462

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 462 160 0 (0.653680 0.346320)
   2) age< 50.5 290 64 0 (0.779310 0.220690)
     4) age< 30.5 108 8 0 (0.925926 0.074074) *
     5) age>=30.5 182 56 0 (0.692308 0.307692)
      10) typea< 68.5 170 46 0 (0.729412 0.270588) *
      11) typea>=68.5 12 2 1 (0.166667 0.833333) *
   3) age>=50.5 172 76 1 (0.441860 0.558140)
     6) famhist=Absent 82 33 0 (0.597561 0.402439)
      12) tobacco< 7.605 58 16 0 (0.724138 0.275862) *
      13) tobacco>=7.605 24 7 1 (0.291667 0.708333) *
     7) famhist=Present 90 27 1 (0.300000 0.700000)
      14) ldl< 4.99 39 18 1 (0.461538 0.538462)
        28) adiposity>=27.985 20 7 0 (0.650000 0.350000)
          56) tobacco< 4.15 10 1 0 (0.900000 0.100000) *
          57) tobacco>=4.15 10 4 1 (0.400000 0.600000) *
        29) adiposity< 27.985 19 5 1 (0.263158 0.736842) *
      15) ldl>=4.99 51 9 1 (0.176471 0.823529) *

... heart data

> plot(heartree, margin = .10)
> text(heartree)
# depth of branches proportional to reduction in error
> plot(heartree, margin = .10, compress = TRUE, uniform = TRUE, branch = 0.4)
> text(heartree, use.n = TRUE)
# depth of branches is uniform
> post(heartree)
# makes a file called heartree.ps in the local directory
> printcp(heartree)

Classification tree:
rpart(formula = chd ~ ., data = SAheart, method = "class")

Variables actually used in tree construction:
[1] adiposity age       famhist   ldl       tobacco   typea

Root node error: 160/462 = 0.346

n= 462

      CP nsplit rel error xerror   xstd
1 0.1250      0     1.000  1.000 0.0639
2 0.1000      1     0.875  1.056 0.0647
3 0.0625      2...
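The numbers in the printed tree and in the first rows of the printcp table can be cross-checked by hand. The sketch below is not part of the original slides; it only redoes the arithmetic in R, using node counts copied from the tree output above. The variable names (leaves, root_loss, and so on) are made up for illustration. It assumes that "rel error" is the training misclassification count divided by the root node's count, and that the first CP value is the drop in rel error achieved by the first split, which is consistent with the values shown.

```r
# Terminal nodes (marked * in the printed tree): node number, n, loss.
leaves <- data.frame(
  node = c(4, 10, 11, 12, 13, 56, 57, 29, 15),
  n    = c(108, 170, 12, 58, 24, 10, 10, 19, 51),
  loss = c(8, 46, 2, 16, 7, 1, 4, 5, 9)
)

root_n    <- 462   # observations at the root
root_loss <- 160   # misclassified at the root (everyone predicted chd = 0)

# Root node error, as reported by printcp: 160/462 = 0.346
root_error <- root_loss / root_n

# After the first split (age< 50.5), the losses are 64 (node 2) and 76 (node 3).
loss_one_split <- 64 + 76
rel_one_split  <- loss_one_split / root_loss               # 0.875, row 2 of the table
cp_first       <- (root_loss - loss_one_split) / root_loss # 0.125, CP in row 1

# yprob in a node is the class share; for node 2 the minority share is loss/n.
yprob_node2 <- 64 / 290                                     # 0.220690

# Full printed tree: total training loss over the terminal nodes.
full_loss <- sum(leaves$loss)        # 98
full_rel  <- full_loss / root_loss   # 0.6125
full_err  <- full_loss / root_n      # about 0.212

print(round(c(root_error = root_error, rel_one_split = rel_one_split,
              cp_first = cp_first, full_rel = full_rel,
              full_err = full_err), 4))
```

The terminal node sizes add up to 462, so every observation falls in exactly one leaf. The rel error column is on the training data and can only decrease as splits are added, whereas xerror is cross-validated and need not: in the table above it rises from 1.000 to 1.056 after the first split.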
This document was uploaded on 08/12/2010.
