# mar16 - STA 414/2104 Notes

STA 414/2104 Mar 16, 2010

## Notes

- No class on Thursday, Mar 18
- Takehome MT due Mar 25
- Paper "Kernels and Ensembles" by M. Zhu is posted under March 9
- "Quick R": http://www.statmethods.net/index.html
- CRAN Task Views: http://cran.r-project.org/web/views/MachineLearning.html

## Classification and Regression Trees

South African heart data

[Figure: classification tree for the South African heart data. Splits shown: age < 50.5, age < 30.5, typea < 68.5, famhist = a, tobacco < 7.605, ldl < 4.99, adiposity >= 27.98, tobacco < 4.15. Leaves are labelled 0 or 1.]
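A fitted classification tree can be read as nested if/else rules. The following is a minimal sketch of that reading, not part of the slides: the split points and leaf labels are copied from the printed `rpart` output for this same fit (which gives the adiposity split as 27.985), and `classify_chd` is a hypothetical helper name, not an `rpart` function.

```r
# Hand-coded version of the heart-data tree.
# Assumption: split points and leaf labels are copied from the printed
# rpart output; classify_chd is a hypothetical helper, not part of rpart.
classify_chd <- function(age, typea, famhist, tobacco, ldl, adiposity) {
  if (age < 50.5) {
    if (age < 30.5) return(0)
    # 30.5 <= age < 50.5: split on type-A behaviour score
    if (typea < 68.5) return(0) else return(1)
  }
  # age >= 50.5: split on family history
  if (famhist == "Absent") {
    if (tobacco < 7.605) return(0) else return(1)
  }
  # famhist == "Present"
  if (ldl >= 4.99) return(1)
  if (adiposity < 27.985) return(1)
  # ldl < 4.99 and adiposity >= 27.985: final split on tobacco
  if (tobacco < 4.15) return(0) else return(1)
}

# Usage: a 25-year-old falls into the leftmost leaf regardless of other values
classify_chd(age = 25, typea = 70, famhist = "Present",
             tobacco = 10, ldl = 6, adiposity = 30)
```

In practice `predict(heartree, newdata, type = "class")` does this traversal; writing it out by hand is only meant to show what the plotted tree encodes.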
## ... heart data

    > library(rpart)   # provides rpart(); this line is not shown on the slide
    > data(SAheart)    # assumes a package providing SAheart (e.g. ElemStatLearn) is attached
    > names(SAheart)
     [1] "sbp"       "tobacco"   "ldl"       "adiposity" "famhist"
     [6] "typea"     "obesity"   "alcohol"   "age"       "chd"
    > (heartree = rpart(chd ~ ., data = SAheart, method = "class"))
    ## output follows ##
    n= 462

    node), split, n, loss, yval, (yprob)
          * denotes terminal node

     1) root 462 160 0 (0.653680 0.346320)
       2) age< 50.5 290 64 0 (0.779310 0.220690)
         4) age< 30.5 108 8 0 (0.925926 0.074074) *
         5) age>=30.5 182 56 0 (0.692308 0.307692)
          10) typea< 68.5 170 46 0 (0.729412 0.270588) *
          11) typea>=68.5 12 2 1 (0.166667 0.833333) *
       3) age>=50.5 172 76 1 (0.441860 0.558140)
         6) famhist=Absent 82 33 0 (0.597561 0.402439)
          12) tobacco< 7.605 58 16 0 (0.724138 0.275862) *
          13) tobacco>=7.605 24 7 1 (0.291667 0.708333) *
         7) famhist=Present 90 27 1 (0.300000 0.700000)
          14) ldl< 4.99 39 18 1 (0.461538 0.538462)
            28) adiposity>=27.985 20 7 0 (0.650000 0.350000)
              56) tobacco< 4.15 10 1 0 (0.900000 0.100000) *
              57) tobacco>=4.15 10 4 1 (0.400000 0.600000) *
            29) adiposity< 27.985 19 5 1 (0.263158 0.736842) *
          15) ldl>=4.99 51 9 1 (0.176471 0.823529) *
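Each row of the printed tree gives `n` (observations in the node), `loss` (observations not in the predicted class `yval`) and `yprob` (class proportions). As a hedged illustration, the sketch below recomputes `yprob` for three nodes from the `n` and `loss` values printed in the output; the data frame and its column names are made up for this example.

```r
# (n, loss, yval) copied from the printed tree for the root and its two children.
# The data frame layout and column names are illustrative assumptions.
nodes <- data.frame(
  node = c("root", "age< 50.5", "age>=50.5"),
  n    = c(462, 290, 172),
  loss = c(160, 64, 76),
  yval = c(0, 0, 1)
)

# loss counts the observations that do NOT belong to the predicted class,
# so the number of chd = 1 cases depends on which class is predicted.
nodes$n_class1 <- ifelse(nodes$yval == 0, nodes$loss, nodes$n - nodes$loss)
nodes$p_class0 <- 1 - nodes$n_class1 / nodes$n
nodes$p_class1 <- nodes$n_class1 / nodes$n

print(round(nodes[, c("p_class0", "p_class1")], 6))

# Sanity check: the children partition the root, so counts must add up.
stopifnot(sum(nodes$n[2:3]) == nodes$n[1])
stopifnot(sum(nodes$n_class1[2:3]) == nodes$n_class1[1])
```

The rounded proportions agree with the `yprob` columns in the printed output, for example 160/462 for the proportion of chd = 1 at the root.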

## ... heart data

    > plot(heartree, margin = .10)
    > text(heartree)   # depth of branches proportional to reduction in error
    > plot(heartree, margin = .10, compress = T, uniform = T, branch = 0.4)
    > text(heartree, use.n = T)   # depth of branches is uniform
    > post(heartree)   # makes a file called heartree.ps in the local directory
    > printcp(heartree)

    Classification tree:
    rpart(formula = chd ~ ., data = SAheart, method = "class")

    Variables actually used in tree construction:
    [1] adiposity age       famhist   ldl       tobacco   typea

    Root node error: 160/462 = 0.346

    n= 462

          CP nsplit rel error xerror   xstd
    1 0.1250      0     1.000  1.000 0.0639
    2 0.1000      1     0.875  1.056 0.0647
    3 0.0625      2     0.775  1.000 0.0639
    4 0.0250      3     0.713  0.863 0.0615
    5 0.0188      5     0.663  0.831 0.0608
    6 0.0125      7     0.625  0.875 0.0617
    7 0.0100      8     0.613  0.931 0.0628
    > table(actual=SAheart$chd,predicted=predict(heartree,type="class"))
          predicted
    actual   0   1
         0 275  27
         1  71  89
    > 1-sum(diag(.Last.value))/sum(.Last.value)
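The last command computes the training (resubstitution) misclassification rate from the confusion matrix; its output is not shown. The sketch below recomputes it from the printed counts and checks it against the `rel error` column, which is scaled by the root node error. The final part applies the "one standard error" convention for picking a tree size from `xerror` and `xstd`; that rule is an assumption added here for illustration and is not stated on the slide. Because `xerror` comes from cross-validation, a rerun of `rpart` would generally give different values.

```r
# Confusion matrix copied from the table() output (filled column by column).
conf <- matrix(c(275, 71, 27, 89), nrow = 2,
               dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

# Training misclassification rate: off-diagonal counts over the total.
resub_error <- 1 - sum(diag(conf)) / sum(conf)   # 98/462, about 0.212

# rel error in printcp is the error relative to the root node error.
root_error <- 160 / 462
rel_error  <- resub_error / root_error           # 98/160 = 0.6125, shown as 0.613

# cp table copied from the printcp() output.
cp_table <- data.frame(
  CP        = c(0.1250, 0.1000, 0.0625, 0.0250, 0.0188, 0.0125, 0.0100),
  nsplit    = c(0, 1, 2, 3, 5, 7, 8),
  rel_error = c(1.000, 0.875, 0.775, 0.713, 0.663, 0.625, 0.613),
  xerror    = c(1.000, 1.056, 1.000, 0.863, 0.831, 0.875, 0.931),
  xstd      = c(0.0639, 0.0647, 0.0639, 0.0615, 0.0608, 0.0617, 0.0628)
)

# Row with the smallest cross-validated error.
best <- cp_table[which.min(cp_table$xerror), ]

# One-SE convention (an assumption, not from the slide): take the smallest
# tree whose xerror is within one xstd of the minimum.
threshold  <- best$xerror + best$xstd
candidates <- cp_table[cp_table$xerror <= threshold, ]
one_se     <- candidates[which.min(candidates$nsplit), ]

print(c(resub_error = resub_error, rel_error = rel_error))
print(best)
print(one_se)
```

Under these numbers, a tree chosen this way could be obtained with `prune(heartree, cp = one_se$CP)`, though the slides do not say which tree size was ultimately used.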
## Forensic glass data(fgl)
