Evaluating probability models

[Figure: ROC plot, AUC = 0.97]
Evaluating probability models: Log Likelihood
- The logarithm of the product of the probabilities the model assigned to each example.
- The log likelihood the model assigns to the test data; the closer to 0, the better the model.
- The log likelihood can be rescaled by the number of data points to give a rough average surprise per data point (see the sketch below).
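A minimal sketch of this computation in Python (NumPy-based; the probabilities and labels are made up for illustration):

```python
import numpy as np

# Hypothetical predicted probabilities for the positive class,
# and the true 0/1 labels for a small test set.
pred = np.array([0.9, 0.8, 0.2, 0.95, 0.1])
actual = np.array([1, 1, 0, 1, 0])

# Probability the model assigned to the outcome that actually occurred.
p_of_observed = np.where(actual == 1, pred, 1 - pred)

# Log likelihood: the log of the product is the sum of the logs.
log_likelihood = np.sum(np.log(p_of_observed))

# Rescaled by the number of data points: average "surprise" per example.
avg_log_likelihood = log_likelihood / len(actual)

print(log_likelihood, avg_log_likelihood)
```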
Evaluating probability models: Log Likelihood
- The spam model assigns a log likelihood of -134.9478 to the test data, which is much better than the null model's -306.8952.
Evaluating probability models: Deviance
- The deviance is defined as -2*(logLikelihood - S), where S is a technical constant called "the log likelihood of the saturated model."
- The lower the residual deviance, the better the model.
- We are most concerned with differences of deviance, such as the difference between the null deviance and the model deviance.
- In our case, this difference is -2*(-306.8952 - S) - (-2*(-134.9478 - S)) = 343.9; the constant S cancels (verified in the sketch below).
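Since the saturated-model constant S cancels, the difference of deviances reduces to twice the difference in log likelihoods; a quick check in Python, using the slide's numbers:

```python
# Log likelihoods reported on the slide.
loglik_null = -306.8952
loglik_model = -134.9478

# Difference of deviances: the saturated-model constant S cancels out,
# leaving 2 * (loglik_model - loglik_null).
deviance_diff = -2 * loglik_null - (-2 * loglik_model)
print(round(deviance_diff, 1))  # 343.9
```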
Evaluating probability models: Akaike Information Criterion (AIC)
- AIC = deviance + 2*numberOfParameters.
- AIC is the deviance penalized for model complexity.
- Useful for comparing models with different measures of complexity, or variables with differing numbers of levels (see the sketch below).
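A minimal sketch of the comparison, assuming we already know each candidate model's deviance and parameter count (the numbers are invented for illustration):

```python
def aic(deviance, n_params):
    # AIC = deviance + 2 * number of parameters.
    return deviance + 2 * n_params

# Hypothetical candidate models: (deviance, number of parameters).
models = {"small": (280.0, 3), "large": (269.9, 25)}

for name, (dev, k) in models.items():
    print(name, aic(dev, k))
# The lower AIC wins: the larger model must reduce deviance by more
# than 2 per extra parameter to justify its added complexity.
```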
Evaluating probability models: Entropy
- Entropy is a technical measure of information or surprise, and is measured in a unit called bits.
- Conditional entropy indicates how good the prediction is on the different categories, tempered by how often the model predicts each category.
- In our example, the initial entropy is 0.97 bits per example (a lot of surprise), while the conditional entropy is only 0.39 bits per example (both quantities are computed in the sketch below).
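A minimal sketch of both quantities for a two-class problem, using a hypothetical joint table of predicted category versus true class (the counts are made up):

```python
import numpy as np

def entropy(probs):
    # Shannon entropy in bits; skip zero-probability cells.
    probs = np.asarray(probs, dtype=float)
    probs = probs[probs > 0]
    return -np.sum(probs * np.log2(probs))

# Hypothetical joint counts: rows = predicted category, cols = true class.
joint = np.array([[40.0, 5.0],
                  [10.0, 45.0]])
p_joint = joint / joint.sum()

# Initial entropy: entropy of the true class distribution alone.
p_class = p_joint.sum(axis=0)
h_initial = entropy(p_class)

# Conditional entropy H(class | prediction): per-prediction entropy,
# weighted by how often each category is predicted.
p_pred = p_joint.sum(axis=1)
h_cond = sum(p_pred[i] * entropy(p_joint[i] / p_pred[i])
             for i in range(len(p_pred)))

print(h_initial, h_cond)  # conditional entropy is lower: predictions help
```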
Evaluating ranking models
- Ranking models: given a set of examples, sort the rows or assign ranks to the rows.
- Often trained by converting groups of examples into many pair-wise decisions.
- Evaluation methods:
  - the methods used for evaluating classifiers
  - Spearman's rank correlation coefficient (see the sketch below)
  - the data-mining concept of lift
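A minimal sketch of Spearman's rank correlation using SciPy (the true and predicted scores are invented):

```python
from scipy.stats import spearmanr

# Hypothetical true relevance scores and model-predicted scores.
true_scores = [3.0, 1.0, 4.0, 1.5, 5.0]
pred_scores = [2.8, 1.2, 3.5, 1.9, 4.7]

# Spearman's rho is the correlation of the two rank vectors;
# rho = 1.0 means the model reproduces the true ordering exactly.
rho, p_value = spearmanr(true_scores, pred_scores)
print(rho, p_value)
```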
Evaluating clustering models
- Clustering is hard to evaluate because it is unsupervised.
- Example: create clusters on 2-dimensional data.
Evaluating clustering models: Distance
Evaluating clustering models
- Distance metrics are good for checking the performance of clustering, but not always a good match for the business need.
- Alternatively, treat the clusters as a classification or as scores and reuse those model-evaluation methods (a distance-based check is sketched below).
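A minimal sketch of a distance-based check, here the average silhouette score from scikit-learn on synthetic 2-dimensional clusters (the data and parameter choices are illustrative):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 2-dimensional clustered data.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0,
                  random_state=42)

# Fit k-means and score the clustering by average silhouette:
# high values mean points sit close to their own cluster and
# far from the other clusters.
labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
print(silhouette_score(X, labels))
```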
Validating Models
Overfitting
- An overfit model looks great on the training data but performs poorly on new data.
- Its generalization error is significantly greater than its training error.
- Avoid overfitting by preferring simpler models (illustrated in the sketch below).
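A minimal sketch of spotting overfitting by comparing training and hold-out accuracy; the dataset and the tree models are illustrative, not from the lecture:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0)

# An unconstrained tree can memorize the training data.
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# A depth-limited (simpler) tree generalizes better.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

for name, m in [("deep", deep), ("shallow", shallow)]:
    # A large gap between training and test accuracy signals overfitting.
    print(name, m.score(X_tr, y_tr), m.score(X_te, y_te))
```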
KDD Cup Example
- The dataset contains 230 facts about 50,000 credit card accounts, and the task concerns customer relationship management.
- From these features, the goal was to predict account cancellation (called churn), the innate tendency to use new products and services (called appetency), and the willingness to respond favorably to marketing pitches (called upselling).
KDD Cup Example: Build single-variable models
- Use only one variable at a time before scaling up to general modeling (one common approach is sketched below).
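One common way to build a single-variable model from a categorical feature is to predict the outcome rate seen at each level in the training data; a minimal pandas sketch with made-up column names (the lecture's actual method may differ):

```python
import pandas as pd

# Hypothetical training data: one categorical feature and a 0/1 churn label.
train = pd.DataFrame({
    "Var218": ["a", "a", "b", "b", "b", "c"],
    "churn":  [1,   0,   0,   0,   1,   1],
})

# Single-variable model: churn rate per level, falling back to the
# overall rate for levels never seen in training.
overall_rate = train["churn"].mean()
rate_by_level = train.groupby("Var218")["churn"].mean()

def predict(values):
    return values.map(rate_by_level).fillna(overall_rate)

test = pd.Series(["a", "c", "d"])  # "d" was never seen in training
print(predict(test).tolist())
```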