Stats 202 - Lecture 7

Slide 1: Statistics 202: Statistical Aspects of Data Mining
Professor Rajan Patel
Lecture 7: Start Chapter 4

Agenda:
1) Assign Homework 3
2) Start lecturing over Chapter 4

Slide 2: Introduction to Data Mining by Tan, Steinbach, Kumar
Chapter 4: Classification: Basic Concepts, Decision Trees, and Model Evaluation

Slide 3: Illustration of the Classification Task

Training Set:

Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test Set:

Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?

Induction: Training Set -> Learning Algorithm -> Learn Model -> Model
Deduction: Model -> Apply Model -> Test Set
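The induction and deduction steps in the figure can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's method: the function names `learn_model` and `apply_model` are assumptions, and the learner is a deliberately trivial placeholder that ignores the attributes and remembers only the most common class in the training set.

```python
from collections import Counter

# Training set from the slide: (Tid, Attrib1, Attrib2, Attrib3 in $K, Class).
training_set = [
    (1, "Yes", "Large", 125, "No"),
    (2, "No", "Medium", 100, "No"),
    (3, "No", "Small", 70, "No"),
    (4, "Yes", "Medium", 120, "No"),
    (5, "No", "Large", 95, "Yes"),
    (6, "No", "Medium", 60, "No"),
    (7, "Yes", "Large", 220, "No"),
    (8, "No", "Small", 85, "Yes"),
    (9, "No", "Medium", 75, "No"),
    (10, "No", "Small", 90, "Yes"),
]

# Test set from the slide: the class is unknown ("?"), so it is left out.
test_set = [
    (11, "No", "Small", 55),
    (12, "Yes", "Medium", 80),
    (13, "Yes", "Large", 110),
    (14, "No", "Small", 95),
    (15, "No", "Large", 67),
]


def learn_model(records):
    """Induction: build a model from labeled records.

    Placeholder learner (an assumption, not from the slide): it ignores the
    attributes and always predicts the most common class it has seen.
    """
    majority_class = Counter(r[-1] for r in records).most_common(1)[0][0]
    return lambda record: majority_class


def apply_model(model, records):
    """Deduction: use the learned model to label records with unknown class."""
    return {r[0]: model(r) for r in records}


model = learn_model(training_set)
predictions = apply_model(model, test_set)
print(predictions)
```

Any real learning algorithm, such as a decision tree learner, would slot into `learn_model` without changing the overall flow.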

Slide 4: Classification: Definition
• Given a collection of records (training set)
  Each record contains a set of attributes (x), with one additional attribute which is the class (y).
• Find a model to predict the class as a function of the values of the other attributes.
• Goal: previously unseen records should be assigned a class as accurately as possible.
  A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
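As a rough sketch of how a data set is divided and how accuracy on the test set is measured: accuracy is the number of test records whose predicted class equals the true class, divided by the number of test records. Everything named here (`split_records`, `learn_threshold`, `accuracy`, the seed, and the ten records) is hypothetical and chosen only for illustration; the threshold learner is a stand-in, not a technique from the lecture.

```python
import random


def split_records(records, n_test, seed=0):
    """Shuffle a copy of the labeled records and hold out n_test of them."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)


def learn_threshold(training_set):
    """Stand-in learner: place a cut midway between the two classes' x values."""
    highest_low = max(x for x, y in training_set if y == "low")
    lowest_high = min(x for x, y in training_set if y == "high")
    threshold = (highest_low + lowest_high) / 2
    return lambda x: "high" if x >= threshold else "low"


def accuracy(model, test_set):
    """Fraction of test records whose predicted class equals the true class."""
    correct = sum(1 for x, y in test_set if model(x) == y)
    return correct / len(test_set)


# Hypothetical records (x, y): one numeric attribute x and a class y.
records = [(x, "high" if x >= 50 else "low") for x in range(0, 100, 10)]

training_set, test_set = split_records(records, n_test=3)
model = learn_threshold(training_set)      # built from the training set only
test_accuracy = accuracy(model, test_set)  # measured on unseen records only
print(f"test accuracy: {test_accuracy:.2f}")
```

Keeping the test records out of `learn_threshold` is the point of the split: the reported accuracy then reflects how the model handles records it has not seen.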
Slide 5: Classification Examples
• Classifying credit card transactions as legitimate or fraudulent
• Classifying secondary structures of protein as alpha-helix, beta-sheet, or random coil
• Categorizing news stories as finance, weather, entertainment, sports, etc.
• Predicting tumor cells as benign or malignant

Slide 6: Classification Techniques
• There are many techniques/algorithms for carrying out classification
• In this chapter we will study only decision trees
• In Chapter 5 we will study other techniques, including some very modern and effective techniques
Slide 7: An Example of a Decision Tree

Training Data:

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Attribute types: Refund (categorical), Marital Status (categorical), Taxable Income (continuous), Cheat (class)

Model: Decision Tree
Splitting Attributes: Refund, MarSt, TaxInc

Refund
  Yes: NO
  No: MarSt
    Married: NO
    Single, Divorced: TaxInc
      < 80K: NO
      > 80K: YES
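One way to read the tree is as a chain of nested tests, one per splitting attribute. The sketch below writes it as a plain Python function and checks it against the ten training records. The function name `classify` is an assumption, as is the choice to send an income of exactly 80K to the NO leaf, since the slide labels only the < 80K and > 80K branches.

```python
# Training data from the slide:
# (Tid, Refund, Marital Status, Taxable Income in $K, Cheat).
training_data = [
    (1, "Yes", "Single", 125, "No"),
    (2, "No", "Married", 100, "No"),
    (3, "No", "Single", 70, "No"),
    (4, "Yes", "Married", 120, "No"),
    (5, "No", "Divorced", 95, "Yes"),
    (6, "No", "Married", 60, "No"),
    (7, "Yes", "Divorced", 220, "No"),
    (8, "No", "Single", 85, "Yes"),
    (9, "No", "Married", 75, "No"),
    (10, "No", "Single", 90, "Yes"),
]


def classify(refund, marital_status, taxable_income):
    """Follow the slide's tree: Refund first, then MarSt, then TaxInc."""
    if refund == "Yes":
        return "No"
    if marital_status == "Married":
        return "No"
    # Single or Divorced: split on taxable income.
    # Assumption: exactly 80K falls into the "No" leaf.
    return "Yes" if taxable_income > 80 else "No"


# Collect the Tids of training records the tree would label incorrectly.
mismatches = [
    tid
    for tid, refund, status, income, cheat in training_data
    if classify(refund, status, income) != cheat
]
print(mismatches)
```

Every training record reaches a leaf that matches its Cheat label, so the list of mismatches comes out empty.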

Slide 8: Applying the Tree Model to Predict the Class for a New Observation

Refund
  Yes: NO
  No: MarSt
    Married: NO
    Single, Divorced: TaxInc
      < 80K: NO
      > 80K: YES

New observation:

Refund  Marital Status  Taxable Income  Cheat
No      Married         80K             ?
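Applying the tree means starting at the root and following the branch that matches the observation at each node until a leaf is reached. The sketch below stores the tree as nested dictionaries and records the path taken. The dictionary layout, the key names, and the `predict` helper are assumptions made for illustration, as is routing an income of exactly 80K to the NO leaf.

```python
# Numeric split shared by the Single and Divorced branches.
# Assumption: exactly 80K goes to the "No" leaf; the slide labels only
# the < 80K and > 80K branches.
tax_inc_node = {
    "attribute": "TaxInc",
    "threshold": 80,
    "at_or_below": "No",
    "above": "Yes",
}

# Internal nodes are dictionaries; leaves are plain class labels.
tree = {
    "attribute": "Refund",
    "branches": {
        "Yes": "No",
        "No": {
            "attribute": "MarSt",
            "branches": {
                "Married": "No",
                "Single": tax_inc_node,
                "Divorced": tax_inc_node,
            },
        },
    },
}


def predict(node, record):
    """Walk from the root to a leaf, returning the label and the path taken."""
    path = []
    while isinstance(node, dict):
        name = node["attribute"]
        value = record[name]
        path.append(f"{name}={value}")
        if "threshold" in node:
            node = node["above"] if value > node["threshold"] else node["at_or_below"]
        else:
            node = node["branches"][value]
    return node, path


# The new observation from the slide (Taxable Income in $K).
new_record = {"Refund": "No", "MarSt": "Married", "TaxInc": 80}
label, path = predict(tree, new_record)
print(" -> ".join(path), "=>", label)  # Refund=No -> MarSt=Married => No
```

For this record, Refund = No leads to the MarSt node, and Married is a NO leaf, so the income test is never reached.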