Model Comparison 3. Principle of Model validation - continued Apr 3, 2020 89
A good regression model predicts values of the response variable that are very close to the observed response values. The difference between the predicted value and the observed value of the response variable is called the prediction error. In an Ordinary Least Squares (OLS) regression model, four statistics are used to evaluate the fit of the model:
1. R squared and Adjusted R squared
2. F test
3. Root Mean Square Error (RMSE)
4. Mean Absolute Percentage Error (MAPE)

Model Comparison 4. Prediction Error Apr 3, 2020
Regression problems
The above statistics are based on the Total Sum of Squares (SST) and the Error Sum of Squares (SSE). SST measures how far the data are from the mean, while SSE measures how far the data are from the model's predicted values.
i. R squared is obtained by dividing the difference between SST and SSE by SST, that is, R squared = (SST - SSE) / SST. It is interpreted as the proportion of total variance that is explained by the model. Adjusted R squared incorporates the model's degrees of freedom, so it does not rise automatically when predictors are added.
ii. The F test evaluates the null hypothesis that all regression coefficients (other than the intercept) are equal to zero against the alternative that at least one is not. It determines whether the proposed relationship between the dependent variable and the set of independent variables is statistically reliable.
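The quantities above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: the data values and function names are hypothetical, `p` is assumed to be the number of predictors excluding the intercept, and the fitted values are assumed to come from an OLS fit with an intercept (here the line y = 2.2 + 0.6x on x = 1..5).

```python
# Hypothetical observed responses and OLS fitted values (y = 2.2 + 0.6 * x).
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]


def sums_of_squares(observed, predicted):
    """Return (SST, SSE) for observed values and model predictions."""
    mean_y = sum(observed) / len(observed)
    sst = sum((y - mean_y) ** 2 for y in observed)  # distance from the mean
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))  # distance from the model
    return sst, sse


def r_squared(sst, sse):
    """Proportion of total variation explained by the model."""
    return (sst - sse) / sst


def adjusted_r_squared(sst, sse, n, p):
    """R squared corrected for degrees of freedom (n observations, p predictors)."""
    return 1 - (sse / (n - p - 1)) / (sst / (n - 1))


def f_statistic(sst, sse, n, p):
    """Overall F statistic: explained mean square over residual mean square."""
    return ((sst - sse) / p) / (sse / (n - p - 1))


n, p = len(observed), 1  # one predictor, assumed for this example
sst, sse = sums_of_squares(observed, predicted)
r2 = r_squared(sst, sse)
adj_r2 = adjusted_r_squared(sst, sse, n, p)
f_stat = f_statistic(sst, sse, n, p)
print(round(r2, 4), round(adj_r2, 4), round(f_stat, 4))
```

For this small data set SST is 6 and SSE is 2.4, so R squared is about 0.6 while Adjusted R squared is lower (about 0.47), which shows the penalty for the degrees of freedom used. The F statistic would still need to be compared with an F distribution with p and n - p - 1 degrees of freedom to obtain a p-value.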
iii. RMSE is the square root of the variance of the residuals. It is an absolute measure of how well the model fits the data: it can be interpreted as the standard deviation of the unexplained variation, and it has the same unit of measure as the response variable. RMSE is a good measure of how accurately the model predicts the response.
iv. Mean Absolute Percentage Error (MAPE) is a measure of the prediction accuracy of a forecasting method, for example in trend estimation. Mean Absolute Error (MAE) is the mean of the absolute errors; because it is expressed in the units of the response, it is difficult to tell from MAE alone whether an error is big or small. MAPE expresses the mean absolute error in percentage terms, which allows forecasts of data with different units of measure, produced by different models, to be compared.
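A short sketch of the three error measures follows. The data and function names are hypothetical, and two conventions are assumed: RMSE divides the squared errors by n (some texts divide by the residual degrees of freedom instead), and MAPE divides each absolute error by the observed value, so it is undefined when an observed value is zero.

```python
import math

# Hypothetical observed responses and model predictions.
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]


def rmse(observed, predicted):
    """Root mean square error, in the units of the response."""
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error, in the units of the response."""
    absolute = [abs(y - y_hat) for y, y_hat in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def mape(observed, predicted):
    """Mean absolute percentage error; requires non-zero observed values."""
    ratios = [abs((y - y_hat) / y) for y, y_hat in zip(observed, predicted)]
    return 100 * sum(ratios) / len(ratios)


rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
mape_value = mape(observed, predicted)
print(round(rmse_value, 4), round(mae_value, 4), round(mape_value, 2))
```

RMSE and MAE come out in the units of the response (about 0.69 and 0.64 here), whereas MAPE is unit-free (about 18.8 percent), which is what makes it usable for comparing forecasts of series measured on different scales.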
Classification problems
In classification problems, we measure how well the model classifies the sample data into the correct categories. Here, the objective is to find a classifier, such as Logistic Regression, CART or Random Forest, that performs well in predicting classes for new data for which the response is not known. In classification models, the error (or residual) is the count of misclassified observations.
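As a minimal sketch of that error count, the snippet below compares actual and predicted class labels. The labels and function names are hypothetical, and the misclassification rate (count divided by sample size) is an added convenience that the text above does not define.

```python
# Hypothetical actual and predicted class labels for six observations.
actual = ["yes", "no", "yes", "yes", "no", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]


def misclassified_count(actual, predicted):
    """Number of observations whose predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(1 for a, p in zip(actual, predicted) if a != p)


def misclassification_rate(actual, predicted):
    """Share of observations that were misclassified."""
    return misclassified_count(actual, predicted) / len(actual)


errors = misclassified_count(actual, predicted)
error_rate = misclassification_rate(actual, predicted)
print(errors, round(error_rate, 4))
```

To judge performance on new data, as the text requires, the same count would be taken on observations that were held out from model fitting rather than on the training sample.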
