
Model Comparison 6. a. Example Regression problem (Apr 3, 2020)

Instead of splitting the data into training and testing sets, we shall use 10-fold cross validation for each of the models under consideration. Let us compare the Ordinary Least Squares (OLS) model with regularization techniques such as the Ridge and Lasso models. We shall compare the RMSE and R squared values across folds to identify the best model.
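The comparison described above can be sketched as follows. This is a minimal from-scratch illustration with NumPy, not the code behind the slides: the data are synthetic, the penalty strengths (1.0 for Ridge, 0.1 for Lasso) are arbitrary assumptions, and the helper names `fit_ridge`, `fit_lasso` and `cross_validate` are hypothetical. In practice a modelling library would normally do this work.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Closed-form ridge fit with intercept; alpha=0 reduces to OLS."""
    Xb = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0                          # leave the intercept unpenalized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def fit_lasso(X, y, alpha, n_iter=200):
    """Lasso fit by cyclic coordinate descent with soft-thresholding."""
    Xb = np.column_stack([np.ones(len(X)), X])
    n, p = Xb.shape
    beta = np.zeros(p)
    z = (Xb ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual with coordinate j's own contribution added back.
            residual = y - Xb @ beta + Xb[:, j] * beta[j]
            rho = Xb[:, j] @ residual / n
            if j == 0:
                beta[j] = rho / z[j]             # intercept: no shrinkage
            else:
                beta[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / z[j]
    return beta

def cross_validate(fit, X, y, k=10, seed=0):
    """Return per-fold RMSE and R squared; a fixed seed fixes the folds."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
    rmse, r2 = [], []
    for i in range(k):
        test = folds[i]
        train = np.concatenate(folds[:i] + folds[i + 1:])
        beta = fit(X[train], y[train])
        pred = np.column_stack([np.ones(len(test)), X[test]]) @ beta
        err = y[test] - pred
        rmse.append(np.sqrt(np.mean(err ** 2)))
        r2.append(1 - np.sum(err ** 2) / np.sum((y[test] - y[test].mean()) ** 2))
    return np.array(rmse), np.array(r2)

# Synthetic regression data (an assumption; the slides' data set is not shown).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 0.0, 1.5, 0.0]) + rng.normal(size=200)

models = {
    "OLS": lambda X, y: fit_ridge(X, y, alpha=0.0),
    "Ridge": lambda X, y: fit_ridge(X, y, alpha=1.0),
    "Lasso": lambda X, y: fit_lasso(X, y, alpha=0.1),
}
# Every model is scored on the same 10 folds because the seed is shared.
results = {name: cross_validate(fit, X, y) for name, fit in models.items()}
for name, (rmse, r2) in results.items():
    print(f"{name}: median RMSE={np.median(rmse):.3f}, "
          f"mean R2={r2.mean():.3f}, median R2={np.median(r2):.3f}")
```

Reporting the median as well as the mean across folds, as the slides do, makes the comparison less sensitive to a single unusual fold.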
Inference

We observe that the OLS model
a. has the lowest median RMSE (10.94) across the 10 folds when compared to the regularization models, Ridge and Lasso;
b. has the highest mean R squared (57.10%) and median R squared (57.25%) when compared to the Lasso model. The Ridge model gives the same mean and median R squared as OLS.
Model Comparison 6. b. Example Classification problem

We shall use the Pima Indians diabetes data set, which summarizes a collection of medical reports and indicates whether each patient developed diabetes within five years. The models constructed and tuned are CART, Random Forest, Naïve Bayes and K Nearest Neighbor. Each model is automatically tuned and evaluated using three repeats of 10-fold cross validation. To ensure that each algorithm gets the same data partitions and repeats, the random number seed is set to the same value before running each algorithm. Models are trained and an optimal parameter configuration is found for each model. We collect accuracy results from each of the best models. The distributions are summarized in terms of percentiles, and boxplots are drawn to visualize the distributions.
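The role of the seed can be illustrated with a small sketch, using only the Python standard library. It is not the code behind the slides: the data are two synthetic Gaussian clusters rather than the Pima data, the two classifiers (a 5-nearest-neighbour rule and a majority-class baseline) are stand-ins chosen for brevity, no tuning is performed, and all function names are hypothetical.

```python
import random
import statistics

def repeated_kfold(n, k=10, repeats=3, seed=7):
    """Yield (train, test) index lists; the same seed always gives the same splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        for f in range(k):
            test = idx[f::k]                      # every k-th shuffled index
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def majority_predict(train_X, train_y, x):
    """Baseline: always predict the most common training label."""
    return max(set(train_y), key=train_y.count)

def cv_accuracy(predict, X, y, seed=7):
    """Accuracy per fold over 3 repeats of 10-fold cross validation."""
    scores = []
    for train, test in repeated_kfold(len(y), seed=seed):
        tX = [X[i] for i in train]
        ty = [y[i] for i in train]
        hits = sum(predict(tX, ty, X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return scores

# Synthetic two-class data (an assumption standing in for the real data set).
data_rng = random.Random(0)
X, y = [], []
for label in (0, 1):
    for _ in range(60):
        X.append([data_rng.gauss(2 * label, 1.0), data_rng.gauss(2 * label, 1.0)])
        y.append(label)

# Both models use seed=7, so they are scored on identical partitions.
results = {
    "kNN": cv_accuracy(knn_predict, X, y),
    "Majority": cv_accuracy(majority_predict, X, y),
}
for name, scores in results.items():
    q1, med, q3 = statistics.quantiles(scores, n=4)
    print(f"{name}: Q1={q1:.3f}, median={med:.3f}, Q3={q3:.3f}")
```

Because both models see identical folds, differences in the accuracy distributions reflect the models rather than the luck of the split. The 30 accuracy values per model are what the percentiles and boxplots summarize.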
We observe that Logistic Regression gives the best sensitivity and specificity when compared to CART, Naive Bayes and KNN. With regard to AUROC, the Naive Bayes model gives the best prediction.
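One model can lead on sensitivity and specificity while another leads on AUROC because the first two metrics depend on a single classification threshold, whereas AUROC summarizes ranking quality over all thresholds. The sketch below computes all three on a small made-up example. The labels, scores and the 0.5 threshold are illustrative assumptions, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = diabetic) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [int(s >= 0.5) for s in scores]   # threshold of 0.5 is an assumption

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, AUROC={auroc(y_true, scores):.4f}")
# prints: sensitivity=0.75, specificity=0.75, AUROC=0.9375
```

Moving the threshold changes sensitivity and specificity but leaves AUROC untouched, which is why the two kinds of metric can rank models differently.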
Visual comparison
Model Comparison 7. Reference

1. Business Statistics – a first course – David M. Levine, Kathryn A. Szabat, David F Stephan and Dr P K Viswanathan, Chapter 12
2. Business Analytics – The Science of Data Driven Decision Making – U Dinesh Kumar