
# AUC 0.97: evaluating probability models, log likelihood


AUC = 0.97

## Evaluating probability models: log likelihood

- The log likelihood is the logarithm of the product of the probabilities the model assigned to each example.
- It is the log likelihood the model assigns to the test data; the model is better as the value gets closer to 0.
- The log likelihood can be rescaled by the number of data points to give a rough average surprise per data point.
- The spam model assigns a log likelihood of -134.9478, which is much better than the null model's -306.8952.

## Evaluating probability models: deviance

- The deviance is defined as `-2*(logLikelihood - S)`, where S is a technical constant called "the log likelihood of the saturated model."
- The lower the residual deviance, the better the model.
- We're most concerned with differences of deviance, such as the difference between the null deviance and the model deviance.
- In our case, this difference is `-2*(-306.8952 - S) - (-2*(-134.9478 - S)) = 343.9`. The constant S cancels in the difference.

## Evaluating probability models: Akaike information criterion (AIC)

- `AIC = deviance + 2*numberOfParameters`
- AIC is deviance penalized for model complexity.
- Useful for comparing models with different measures of complexity and variables with differing numbers of levels.

## Evaluating probability models: entropy

- Entropy is a technical measure of information or surprise, measured in a unit called bits.
- Conditional entropy is a measure that gives an indication of how good the prediction is on different categories, tempered by how often it predicts different categories.
- The initial entropy is 0.97 bits per example, a lot of surprise.
- The conditional entropy is only 0.39 bits per example.

## Evaluating ranking models

- Ranking models: given a set of examples, sort the rows or assign ranks to the rows.
- Often trained by converting groups of examples into many pair-wise decisions.
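The probability-model measures described earlier in this section (log likelihood, deviance, AIC, entropy) can be checked numerically. The following is a minimal Python sketch, not the course's own code: the function names and the toy labels and probabilities are hypothetical, and the saturated-model constant S is assumed to be 0 unless supplied. Only the two log likelihood values (-134.9478 and -306.8952) come from the slides.

```python
import math

def log_likelihood(y_true, p_pred):
    """Sum of the logs of the probabilities assigned to the observed outcomes.

    y_true: iterable of booleans (True = positive class).
    p_pred: predicted probability of the positive class, strictly between 0 and 1.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total += math.log(p if y else 1.0 - p)
    return total

def deviance(log_lik, saturated_log_lik=0.0):
    """Deviance = -2 * (logLikelihood - S); S defaults to 0 (an assumption)."""
    return -2.0 * (log_lik - saturated_log_lik)

def aic(log_lik, n_params, saturated_log_lik=0.0):
    """AIC = deviance + 2 * numberOfParameters."""
    return deviance(log_lik, saturated_log_lik) + 2 * n_params

def entropy_bits(probs):
    """Entropy in bits of a discrete distribution given as probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical toy data: four examples with their predicted probabilities.
y = [True, False, True, True]
p = [0.9, 0.2, 0.7, 0.6]
ll = log_likelihood(y, p)
ll_per_point = ll / len(y)  # rough average surprise per data point

# Difference between null and model deviance, using the slide's values.
# S cancels in the difference, so its value does not matter here.
deviance_drop = deviance(-306.8952) - deviance(-134.9478)

print(f"log likelihood: {ll:.4f}, per point: {ll_per_point:.4f}")
print(f"deviance difference: {deviance_drop:.1f}")
print(f"entropy of a fair coin: {entropy_bits([0.5, 0.5]):.2f} bits")
```

Because the log likelihood is a sum of logs of probabilities, it is always at most 0, which is why values closer to 0 indicate a better model.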
Evaluation methods for ranking models:

- Methods used for evaluating classifiers
- Spearman's rank correlation coefficient
- The data mining concept of lift

## Evaluating clustering models

- Hard to evaluate because clustering is unsupervised.
- Create a 2-dimensional cluster.
- Distance
- Distance metrics are good for checking the performance of clustering, but not always good for the business need.
- Treat clusters as classifications or scores for model evaluation.

## Validating models: overfitting

- An overfit model looks great on training data and performs poorly on new data.
- Its generalization error is significantly greater than its training error.
- Avoid it by preferring simpler models.

## KDD Cup example

- The dataset has 230 facts about 50,000 credit card accounts, and it is about customer relationship management.
- From these features, the goal was to predict account cancellation (called churn), the innate tendency to use new products and services (called appetency), and willingness to respond favorably to marketing pitches (called upselling).
- Build single-variable models: use only one variable at a time before scaling up to general modeling.
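Spearman's rank correlation coefficient, listed above as an evaluation method for ranking models, is the Pearson correlation computed on ranks rather than raw values. The sketch below is one way to compute it in plain Python, assuming tied values receive the average of their ranks; the function names and the example scores are hypothetical and do not come from the slides.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical example: model scores versus true relevance grades.
model_scores = [0.9, 0.1, 0.5, 0.7]
true_relevance = [3, 0, 2, 1]
rho = spearman(model_scores, true_relevance)
print(f"Spearman rho: {rho:.2f}")
```

A value near 1 means the model orders the rows almost exactly as the true ranking does, and a value near -1 means the order is reversed.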
