SAS GF paper on RPM

Selected based on validation data statistics figure 7


...ws the model fit statistics for the selected model. Sample statistics include the misclassification rate, root average squared error, Kolmogorov-Smirnov statistic, Gini coefficient, and lift at various depths of file. Fit statistics are included for both the training and validation data. The SAS Rapid Predictive Modeler also outputs a model fit comparison report when the advanced model is selected. In this analysis, the SAS Enterprise Miner AutoNeural algorithm was selected as the best model because it maximized cumulative lift on the validation data. The Reg2 model is a forward logistic regression from the basic modeling methodology; it selected no terms from the input variables and so reduced to a baseline model. This can happen when there is a weak signal or a signal that depends on combinations of multiple terms.

SAS Global Forum 2010 Customer Intelligence

Figure 7. Goodness of Model Fit and Comparison Statistics

The most common report feature for a class target variable model is the Model Gains Chart. For this chart the customer cases are sorted from left to right, starting with the individuals most likely to have churned as predicted by the selected model. The cumulative captured response measures the share of all class target events identified up to each percentile. Figure 8 shows that about 41% of the events have been identified in the first 5% of cases as ranked by the predicted values. At the twentieth percentile, just over three-fourths of the target event cases have been identified. Lift is the ratio of target events identified by the model to target events found by random selection at the same depth. Most business users know how to use a gains chart. This plot is available only for models of class target variables.

Figure 8. Cumulative Model Gains Chart

The Receiver Operating Characteristic (ROC) plot is adapted from the field of engineering.
This plot shows the maximum predictive power for a model for the entire sample rather than for a single decile. The data are plotted as sensitivity versus one minus specificity. The maximum separation between the model curve and the diagonal, which represents a random selection model, is termed the Kolmogorov-Smirnov (KS) value. A higher KS value represents a more powerful model. Figure 9 shows the ROC plot for the model.

Figure 9. Receiver Operating Characteristic Plot

The SAS Rapid Predictive Modeler always outputs a scorecard to interpret the model’s characteristics for business purposes. Each interval variable is binned into distinct ranges of values. Each variable is ranked by importance in the model and scaled to a maximum of 1,000 points. Each distinct value of each variable then receives a portion of the scaled point total. Scorecards provide a quick view into the behavior of the model. Figure 10 displays the churn scorecard developed using the SAS Rapid Predictive Modeler task within the SAS Add-In for Microsoft...
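
The gains chart and KS statistics described above can be illustrated with a small sketch. It is written in plain Python rather than SAS and is not the SAS Rapid Predictive Modeler's own computation. The function names `gains_table` and `ks_statistic`, the depth values, and the convention that the target event is coded 1 are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def gains_table(
    y_true: Sequence[int],
    scores: Sequence[float],
    depths: Sequence[float] = (0.05, 0.10, 0.20, 0.50, 1.0),
) -> List[Tuple[float, float, float]]:
    """Return (depth, cumulative captured response, cumulative lift) rows.

    y_true holds 1 for a target event (for example, churn) and 0 otherwise.
    scores holds the model's predicted event probability for each case.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    total_events = sum(y_true)
    if total_events == 0:
        raise ValueError("at least one target event is required")

    # Rank cases from most to least likely to be an event.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [y_true[i] for i in order]
    n = len(ranked)

    rows = []
    for depth in depths:
        k = max(1, round(depth * n))            # number of cases in this depth
        captured = sum(ranked[:k]) / total_events
        lift = captured / (k / n)               # random selection captures k/n
        rows.append((depth, captured, lift))
    return rows


def ks_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Largest gap between the cumulative event and non-event rates.

    This equals the maximum vertical distance between the ROC curve
    (sensitivity versus one minus specificity) and the diagonal.
    """
    pairs = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)
    events = sum(y for _, y in pairs)
    non_events = len(pairs) - events
    if events == 0 or non_events == 0:
        raise ValueError("both events and non-events are required")

    true_pos = false_pos = 0
    best = 0.0
    for i, (score, y) in enumerate(pairs):
        if y == 1:
            true_pos += 1
        else:
            false_pos += 1
        # Only compare the curves once a group of tied scores is complete.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            gap = abs(true_pos / events - false_pos / non_events)
            best = max(best, gap)
    return best
```

Under these definitions, a captured response of 41% at a depth of 5%, as read from Figure 8, corresponds to a cumulative lift of 0.41 / 0.05 = 8.2 at that depth.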