Evaluation of Classifiers
Eco 6352 Applied Econometrics, Spring 2017
Professor Tom Fomby, Department of Economics, SMU


Note: In the discussion that follows we take the data partition sequence to be the Training Data Set, then the Validation Data Set, and, finally, the Test Data Set (the SAS EM and XLMINER convention). Also note that the terms “threshold”, “cut-off probability”, and “cut point” are used interchangeably in the literature to describe the point used for classifying a subject as a “positive” or a “negative”. If the probability of a positive for a subject is above the threshold, you classify the subject as a positive; otherwise you classify the subject as a negative. Usually a positive is labeled as “1” while a negative is labeled as “0”.
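The threshold rule in the note can be sketched in a few lines of Python. This is only an illustration: the function name classify, the default threshold of 0.5, and the example probabilities are assumptions made here, not part of the original slides.

```python
def classify(probabilities, threshold=0.5):
    """Label a subject 1 (positive) if its predicted probability of a
    positive is strictly above the threshold, otherwise 0 (negative)."""
    return [1 if p > threshold else 0 for p in probabilities]


# Hypothetical predicted probabilities for four subjects.
scores = [0.10, 0.45, 0.50, 0.80]

labels_default = classify(scores)               # threshold = 0.5
labels_low = classify(scores, threshold=0.3)    # a lower cut-off probability
print(labels_default, labels_low)
```

Lowering the threshold can only move subjects from the negative class to the positive class, which is why the choice of cut point matters for the classification matrix.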
Evaluation Methods are Dependent on Available Payoff Information and the Nature of the Problem at Hand

Here we assume no specific purpose for the classifier (like application to a Target Marketing Problem).

Case I: Base the Performance of the Classifier on the Scoring of an Entire Hold-out Sample with no knowledge of the “Payoff” Matrix. (This Power Point Presentation)

Case II: At Least Some Information is known about the “Payoff” Matrix and the Classifier is chosen to maximize Payoff. (Next Power Point Presentation)

Classification Matrix

                      Predicted Value
                      1                                0
Actual Value    1     True Positive                    False Negative (Type I Error)
                0     False Positive (Type II Error)   True Negative
Classification Matrix with Outcomes on a Hold-out Sample

                      Predicted Value
                      1        0
Actual Value    1     n11      n10
                0     n01      n00
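As a rough sketch, the four cell counts can be tallied from a scored hold-out sample as follows. The function name classification_matrix and the small hold-out sample are hypothetical, chosen only for illustration; the first subscript is read as the actual value and the second as the predicted value, following the layout of the matrix.

```python
def classification_matrix(actual, predicted):
    """Count hold-out outcomes by (actual, predicted) pair.

    The key n{a}{p} follows the layout of the matrix: first digit is the
    actual value, second digit is the predicted value.
    """
    counts = {"n11": 0, "n10": 0, "n01": 0, "n00": 0}
    for y, y_hat in zip(actual, predicted):
        counts[f"n{y}{y_hat}"] += 1
    return counts


# Hypothetical hold-out sample of eight subjects (1 = positive, 0 = negative).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

matrix = classification_matrix(actual, predicted)
print(matrix)
```

The four counts always sum to the size of the hold-out sample, which is a quick check that every subject was scored exactly once.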
The Naïve Classifier

It is based on a random classifier whose rating of cases is uninformative. That is, f(r >= t | y=1) = f(r >= t | y=0): the probability that the rating (r) of a “positive=1” subject exceeds the chosen threshold t is equal to the probability that the rating of a “negative=0” subject exceeds the chosen threshold t.
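The equality above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the construction used in the slides: ratings are drawn from a uniform distribution independently of the actual class, and the sample size, threshold, share of positives, and seed are arbitrary choices made here.

```python
import random


def exceed_rates(n=20000, threshold=0.7, prevalence=0.3, seed=0):
    """Simulate an uninformative classifier.

    Returns the share of actual positives and the share of actual negatives
    whose rating r is at or above the threshold t.
    """
    rng = random.Random(seed)
    exceed = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    for _ in range(n):
        y = 1 if rng.random() < prevalence else 0   # actual class
        r = rng.random()                            # rating drawn independently of y
        total[y] += 1
        if r >= threshold:
            exceed[y] += 1
    return exceed[1] / total[1], exceed[0] / total[0]


rate_pos, rate_neg = exceed_rates()
print(round(rate_pos, 3), round(rate_neg, 3))
```

Because the rating carries no information about y, both shares should be close to 1 - t (here about 0.3), whichever threshold is chosen.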