Assignment 3 - Alexandre Girault -441.docx - Alexandre...

Alexandre Girault 10/16/2018
Assignment 3 – DSC 441

Problem 1:
1) To build the best decision tree, we first create a subset of the dataset so that the risks can be compared on an equal basis. To do so, the variable filter_$ was created, selecting approximately 70% of the cases; this subset is used as the training set. Using the criteria above, the model's overall percentage correct reaches 93.2%.
2) The final tree has 13 nodes, 7 of which are terminal nodes (cf. classification tree table).
3) The three most important features in building the tree are, in order, V10, V11 and V9.
4) When we increase the minimum number of cases allowed in parent and child nodes, the complexity of the tree, measured by its number of nodes, decreases. Raising these minimums makes the tree more tolerant of mixed nodes: a node with too few cases is no longer split, so the tree ends up with fewer nodes.
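The effect of the minimum parent and child sizes on tree complexity can be illustrated with a small sketch. This is not the tool or the data used in the assignment: it is a minimal, hand-written Gini-based tree on a made-up one-feature dataset, and the names `min_parent`, `min_child`, `best_split` and `count_nodes` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, min_child):
    """Return (threshold, left, right) with the largest impurity reduction.

    rows is a list of (x, label) pairs. A split is only considered when both
    children keep at least min_child cases. Returns None if no admissible
    split reduces the impurity.
    """
    parent = gini([y for _, y in rows])
    best, best_gain = None, 1e-12
    for t in sorted({x for x, _ in rows})[1:]:
        left = [r for r in rows if r[0] < t]
        right = [r for r in rows if r[0] >= t]
        if len(left) < min_child or len(right) < min_child:
            continue
        weighted = (len(left) * gini([y for _, y in left])
                    + len(right) * gini([y for _, y in right])) / len(rows)
        gain = parent - weighted
        if gain > best_gain:
            best, best_gain = (t, left, right), gain
    return best


def count_nodes(rows, min_parent, min_child):
    """Grow the tree recursively and return its total number of nodes."""
    if len(rows) < min_parent:
        return 1  # too few cases to be a parent: terminal node
    split = best_split(rows, min_child)
    if split is None:
        return 1  # pure node, or no admissible split: terminal node
    _, left, right = split
    return (1 + count_nodes(left, min_parent, min_child)
              + count_nodes(right, min_parent, min_child))


# Made-up data (an assumption, not the assignment's dataset):
# 24 cases whose class alternates in blocks of four along x.
rows = [(x, (x // 4) % 2) for x in range(24)]

loose = count_nodes(rows, min_parent=2, min_child=1)
strict = count_nodes(rows, min_parent=12, min_child=6)
stump = count_nodes(rows, min_parent=25, min_child=1)

print("nodes with small minimums:", loose)
print("nodes with larger minimums:", strict)
print("nodes when no node may be split:", stump)
```

With the larger minimums the tree cannot isolate each small block of cases, so it stops earlier and has fewer nodes, which is the behaviour described in answer 4.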
Problem 2:

1. There are 6 different classes, from 3 to 8. The distribution has a very strong central tendency.
2.