Assignment 3_answers_DSC441.docx


Problem 1:

1. Best split and stopping conditions for the decision tree, built in SPSS with cross-validation:
   Maximum tree depth: 20
   Minimum number of cases for a parent node: 10
   Minimum number of cases for a child node: 5
   Growing method: CRT
   CRT impurity measure: Gini (the default), which chooses splits that maximize the homogeneity of the child nodes with respect to the value of the target variable
2. Number of nodes in the final tree: 11
   Number of terminal nodes: 6
   Depth: 3
   Overall classification percentage: 95.7%
3. The three most important lupus features in building the tree are S10 (symptom 10), S1 (symptom 1), and S11 (symptom 11). They are the first variables split on at the top of the tree, which is where the tree performs its variable selection.
4. If you increase the minimum number of cases for each parent and child node, the complexity of the tree decreases. Larger minimums force larger nodes that each hold more data, so the tree does less variable selection and less data partitioning. For example, when I increased the Parent to
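
As a rough illustration of the Gini criterion in item 1 and the effect described in item 4, here is a minimal Python sketch. It is not SPSS's CRT procedure (it has no pruning, surrogate splits, or cross-validation), and the data are a made-up toy set, not the lupus data. The function names gini, best_split, and count_nodes are hypothetical helpers introduced only for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, min_child):
    """Return (weighted_gini, feature, threshold) of the best binary split,
    or None if no split leaves at least min_child cases in both children."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if len(left) < min_child or len(right) < min_child:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def count_nodes(rows, labels, min_parent, min_child, max_depth, depth=0):
    """Grow a tree greedily and return its total number of nodes."""
    parent_impurity = gini(labels)
    # Stopping conditions: node too small, too deep, or already pure.
    if len(rows) < min_parent or depth >= max_depth or parent_impurity == 0.0:
        return 1
    split = best_split(rows, labels, min_child)
    # Also stop if no allowed split reduces the impurity.
    if split is None or split[0] >= parent_impurity:
        return 1
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        1
        + count_nodes([r for r, _ in left], [y for _, y in left],
                      min_parent, min_child, max_depth, depth + 1)
        + count_nodes([r for r, _ in right], [y for _, y in right],
                      min_parent, min_child, max_depth, depth + 1)
    )


# Toy data (an assumption for illustration): one feature, 20 cases,
# class labels arranged in four blocks of five.
rows = [[i] for i in range(20)]
labels = [0] * 5 + [1] * 5 + [0] * 5 + [1] * 5

# Same minimums as in item 1 (parent 10, child 5), then doubled minimums.
small_minimums = count_nodes(rows, labels, min_parent=10, min_child=5, max_depth=20)
large_minimums = count_nodes(rows, labels, min_parent=20, min_child=10, max_depth=20)
print(small_minimums, large_minimums)
```

On this toy set, the larger minimums rule out every split that would isolate a five-case block, so the tree stops growing earlier and ends up with fewer nodes. That is the same direction of effect item 4 describes.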
