DSC441 Assignment 3_1

In this document I post the screenshots first and then answer the questions afterwards. For some screenshots I also write about them. I use the figure number to indicate which screenshot I am talking about.

1. Screenshots: Figure 1, Figure 2, Figure 3, Figure 4, Figure 5
1.1. Growing Method: CRT
Validation: Holdout partitioning with a 50% training and 50% testing split
Maximum tree depth: 10
Minimum cases for parent node: 20
Minimum cases for child node: 10
Impurity measure: Gini index

1.2. The final tree has 5 nodes, 3 of which are terminal nodes.

1.3. Looking at Figure 5, we can see that V10 is arguably the most important feature, with V1 and V11 following behind it. On both importance and normalized importance, V10, V1, and V11 score clearly higher than the other variables.

1.4. After increasing the maximum tree depth to 20, the minimum cases for a parent node to 30, and the minimum cases for a child node to 15, I saw the training risk increase from .072 to .099 and the test risk increase from .081 to .087. In the Classification table, for training, the overall percentage for category 1 increased from 44.1 to 54.3, while the overall percentage for category 2 decreased from 55.9 to 45.7. For test, the overall percentage for category 1 decreased from 52.7 to 51, while the overall percentage for category 2 increased from 47.3 to 49. The independent
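
To make the settings in 1.1 and 1.4 easier to compare, here is a minimal sketch in Python using scikit-learn. It is not the tool that produced the screenshots, so the numbers will not match Figures 1 to 5. I am assuming that scikit-learn's `DecisionTreeClassifier` (a CART-style tree) is a reasonable stand-in for the CRT growing method, that `min_samples_split` and `min_samples_leaf` correspond to the minimum cases for parent and child nodes, and that "risk" means the misclassification rate. The data set, the `V1` to `V11` column labels, and the helper names `fit_tree` and `risk` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_tree(X_train, y_train, max_depth, min_parent, min_child):
    """Fit a Gini-based classification tree with the given size limits."""
    tree = DecisionTreeClassifier(
        criterion="gini",              # impurity measure: Gini index
        max_depth=max_depth,           # maximum tree depth
        min_samples_split=min_parent,  # assumed analogue of "minimum cases for parent node"
        min_samples_leaf=min_child,    # assumed analogue of "minimum cases for child node"
        random_state=0,
    )
    return tree.fit(X_train, y_train)


def risk(tree, X, y):
    """Misclassification rate: the share of cases the tree labels wrongly."""
    return 1.0 - tree.score(X, y)


# Synthetic stand-in data (hypothetical): 11 features, two classes.
X, y = make_classification(
    n_samples=1000, n_features=11, n_informative=4, random_state=0
)

# Holdout validation with a 50% training and 50% testing split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# (max depth, min parent cases, min child cases) from 1.1 and 1.4.
settings = {"original": (10, 20, 10), "modified": (20, 30, 15)}

results = {}
for name, (depth, parent, child) in settings.items():
    tree = fit_tree(X_train, y_train, depth, parent, child)
    importance = tree.feature_importances_
    # Indices of the three features with the largest importance.
    top3 = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)[:3]
    results[name] = {
        "train_risk": risk(tree, X_train, y_train),
        "test_risk": risk(tree, X_test, y_test),
        "nodes": tree.tree_.node_count,
        "leaves": tree.get_n_leaves(),
        "depth": tree.get_depth(),
        "top_features": [f"V{i + 1}" for i in top3],
    }
    print(name, results[name])
```

Running both settings on the same split keeps the comparison fair: any change in training or test risk then comes from the tree-size limits and not from a different partition.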
