Rpart_TechReport61

# As iid Bernoulli, P{x_i = 1} = .5, and are independent of y


A sample of size 200 was generated accordingly and the procedure applied using the gini index (see 3.2.1) to build the tree. The S-plus code to compute the simulated data and the fit are shown below.

    n <- 200
    y <- rep(0:9, length=200)
    temp <- c(1,1,1,0,1,1,1,
              0,0,1,0,0,1,0,
              1,0,1,1,1,0,1,
              1,0,1,1,0,1,1,
              0,1,1,1,0,1,0,
              1,1,0,1,0,1,1,
              0,1,0,1,1,1,1,
              1,0,1,0,0,1,0,
              1,1,1,1,1,1,1,
              1,1,1,1,0,1,0)
    lights <- matrix(temp, 10, 7, byrow=T)        # The true light pattern 0-9
    temp1 <- matrix(rbinom(n*7, 1, .9), n, 7)     # Noisy lights
    temp1 <- ifelse(lights[y+1, ]==1, temp1, 1-temp1)
    temp2 <- matrix(rbinom(n*17, 1, .5), n, 17)   # Random lights
    x <- cbind(temp1, temp2)                      # x is the matrix of predictors

Figure 4: Optimally pruned tree for the stochastic digit recognition data

The particular data set of this example can be replicated by setting .Random.seed to c(21, 14, 49, 32, 43, 1, 32, 22, 36, 23, 28, 3) before the call to rbinom.
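The 12-integer `.Random.seed` above is in the S-plus format; as far as we are aware, current versions of R store `.Random.seed` in a different layout, so that vector cannot simply be assigned there. The sketch below is a self-contained R version of the simulation. It uses `set.seed(1)`, an arbitrary seed chosen only for illustration, so it will not reproduce the data set of this example. The names `ok`, `signal`, `noise`, and `agree` are introduced here and do not appear in the original code.

```r
# Assumption: set.seed(1) is an arbitrary seed for this sketch; it does NOT
# reproduce the particular data set discussed in the text.
set.seed(1)

n <- 200
y <- rep(0:9, length.out = n)   # true digit for each observation

# True seven-segment pattern for the digits 0-9, one row per digit.
lights <- matrix(c(1,1,1,0,1,1,1,
                   0,0,1,0,0,1,0,
                   1,0,1,1,1,0,1,
                   1,0,1,1,0,1,1,
                   0,1,1,1,0,1,0,
                   1,1,0,1,0,1,1,
                   0,1,0,1,1,1,1,
                   1,0,1,0,0,1,0,
                   1,1,1,1,1,1,1,
                   1,1,1,1,0,1,0), 10, 7, byrow = TRUE)

# ok[i, j] == 1 means segment j of observation i shows its true state
# (probability .9); ok[i, j] == 0 means the segment is flipped.
ok     <- matrix(rbinom(n * 7, 1, 0.9), n, 7)
signal <- ifelse(lights[y + 1, ] == 1, ok, 1 - ok)

# 17 pure-noise predictors, independent of y.
noise <- matrix(rbinom(n * 17, 1, 0.5), n, 17)
x     <- cbind(signal, noise)

# Sanity checks: 24 predictors, 20 observations per digit, and roughly
# 90% of the observed segments agreeing with the true pattern.
dim(x)
table(y)
agree <- mean(signal == lights[y + 1, ])
agree
```

Because a segment matches the true pattern exactly when the corresponding entry of `ok` is 1, `agree` equals `mean(ok)` and should be close to 0.9 for any seed.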
Now we fit the model:

    temp3 <- rpart.control(xval=10, minbucket=2, minsplit=4, cp=0)
    dfit <- rpart(y ~ x, method='class', control=temp3)
    printcp(dfit)

    Classification tree:
    rpart(formula = y ~ x, method = "class", control = temp3)

    Variables actually used in tree construction:
     [1] x.1  x.10 x.12 x.13 x.15 x.19 x.2  x.20 x.22 x.3  x.4  x.5  x.6  x.7  x.8

    Root node error: 180/200 = 0.9

              CP nsplit rel error  xerror      xstd
    1  0.1055556      0   1.00000 1.09444 0.0095501
    2  0.0888889      2   0.79444 1.01667 0.0219110
    3  0.0777778      3   0.70556 0.90556 0.0305075
    4  0.0666667      5   0.55556 0.75000 0.0367990
    5  0.0555556      8   0.36111 0.56111 0.0392817
    6  0.0166667      9   0.30556 0.36111 0.0367990
    7  0.0111111     11   0.27222 0.37778 0.0372181
    8  0.0083333     12   0.26111 0.36111 0.0367990
    9  0.0055556     16   0.22778 0.35556 0.0366498
    10 0.0027778     27   0.16667 0.34444 0.0363369
    11 0.0013889     31   0.15556 0.36667 0.0369434
    12 0.0000000     35   0.15000 0.36667 0.0369434

    fit9 <- prune(dfit, cp=.02)
    plot(fit9, branch=.3, compress=T)
    text(fit9)

The cp table differs from that in section 3.5 of [1] in several ways, the last two of which are somewhat important.

• The actual values are different, of course, because of different random number generators in the two runs.

• The table is printed from the smallest tree (no splits) to the largest one (35 splits). We find it easier to compare one tree to another when they start at the same place.

• The number of splits is listed, rather than the number of nodes. The number of terminal nodes is always 1 + the number of splits.

• For easier reading, the error columns have been scaled so that the first node has an error of 1. Since in this example the model with no splits must make 180/200 misclassifications, multiply columns 3-5 by 180 to get a result in terms of absolute error. (Computations are done on the absolute error scale, and printed on relative scale.)

• The complexity parameter column (cp) has been similarly scaled.

Looking at the cp table, we see that the best tree has 10 terminal nodes (9 splits), based on cros...
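As a worked illustration of the scaling described above, the sketch below re-enters the printed cp table by hand and converts the relative errors back to absolute misclassification counts by multiplying by 180. It then applies the one-standard-error rule, which is an assumption on our part: the text is cut off before it states the selection criterion, and we use the rule only because it is a common way to read such a table. The names `cp_table`, `abs_error`, `threshold`, and `chosen` are introduced for this sketch.

```r
# The cp table as printed by printcp(dfit), typed in by hand.
cp_table <- data.frame(
  CP        = c(0.1055556, 0.0888889, 0.0777778, 0.0666667, 0.0555556,
                0.0166667, 0.0111111, 0.0083333, 0.0055556, 0.0027778,
                0.0013889, 0.0000000),
  nsplit    = c(0, 2, 3, 5, 8, 9, 11, 12, 16, 27, 31, 35),
  rel_error = c(1.00000, 0.79444, 0.70556, 0.55556, 0.36111, 0.30556,
                0.27222, 0.26111, 0.22778, 0.16667, 0.15556, 0.15000),
  xerror    = c(1.09444, 1.01667, 0.90556, 0.75000, 0.56111, 0.36111,
                0.37778, 0.36111, 0.35556, 0.34444, 0.36667, 0.36667),
  xstd      = c(0.0095501, 0.0219110, 0.0305075, 0.0367990, 0.0392817,
                0.0367990, 0.0372181, 0.0367990, 0.0366498, 0.0363369,
                0.0369434, 0.0369434)
)

# The root node misclassifies 180 of 200 observations, so multiplying a
# relative error by 180 recovers the absolute number of misclassifications.
root_errors <- 180
abs_error   <- round(cp_table$rel_error * root_errors)
abs_error[c(1, 12)]   # 180 for the root, 27 for the largest tree

# Assumed selection rule (1-SE): take the smallest tree whose cross-validated
# error is within one standard error of the minimum cross-validated error.
best      <- which.min(cp_table$xerror)
threshold <- cp_table$xerror[best] + cp_table$xstd[best]
chosen    <- min(which(cp_table$xerror <= threshold))

cp_table$nsplit[chosen]       # 9 splits
cp_table$nsplit[chosen] + 1   # 10 terminal nodes
```

Under this rule the selected row has 9 splits, which agrees with the 10 terminal nodes quoted in the text. The value cp=.02 passed to prune lies between the cp values of the 8-split and 9-split rows (0.0555556 and 0.0166667), which is consistent with fit9 being the 9-split tree.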