lecture1

Data Mining CS57300 Purdue University August 26, 2010
Real-world example: 659,000 brokers, 171,000 branches, 5,100 firms, 400,000 disclosures
[Figure not reproduced: model learned from the broker data, built from feature tests with counts. Feature tests shown:]
• Broker Age>27
• Current CoWorker Count>8
• Current Firm Avg(Size)>12
• Current Branch Mode(Location)=NY
• Broker Years In Industry>1
• Disclosure Count(Yr<1995)>0
• Past Firm Avg(Size)>90
• Current Regulator Mode(Status)=Reg
• Past CoWorker Count>35
• Disclosure Count(Type=CC)>0
• Disclosure Count>5
• Past Firm Max(Size)>1000
• Current Branch Mode(Location)=AZ
• Past CoWorker Count(Gender=M)>15
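Each feature test in the figure combines an aggregate (Count, Avg, Mode, Max) over records related to a broker with a threshold. A minimal sketch of how such tests might be evaluated for one broker follows. The record layout, field names, and values are hypothetical toy data, not taken from the lecture or the NASD data set.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy record for one broker and its related objects.
broker = {
    "age": 41,
    "disclosures": [{"year": 1993, "type": "CC"}, {"year": 2001, "type": "RG"}],
    "current_firms": [{"size": 10}, {"size": 20}],
    "current_branches": [{"location": "NY"}, {"location": "NY"}, {"location": "AZ"}],
}

def count(items, pred=lambda item: True):
    """Count the related records that satisfy an optional condition."""
    return sum(1 for item in items if pred(item))

def mode(items, key):
    """Most common value of one attribute among the related records."""
    return Counter(item[key] for item in items).most_common(1)[0][0]

# Each entry mirrors the form of a test in the figure: aggregate, then threshold.
features = {
    "Broker Age>27": broker["age"] > 27,
    "Disclosure Count>5": count(broker["disclosures"]) > 5,
    "Disclosure Count(Yr<1995)>0":
        count(broker["disclosures"], lambda d: d["year"] < 1995) > 0,
    "Current Firm Avg(Size)>12":
        mean(f["size"] for f in broker["current_firms"]) > 12,
    "Current Branch Mode(Location)=NY":
        mode(broker["current_branches"], "location") == "NY",
}

for name, value in features.items():
    print(f"{name}: {value}")
```

The aggregation step is what turns a variable number of related records (disclosures, firms, branches) into a fixed set of boolean tests per broker.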
Performance of NASD models
[Figure not reproduced: chart with categories Neither, Both, NASD Rules, Relational Models]
"One broker I was highly confident in ranking as 5… Not only did I have the pleasure of meeting him at a shady warehouse location, I also negotiated his bar from the industry... This person actually used investors' funds to pay for personal expenses including his trip to attend a NASD compliance conference! …If the model predicted this person, it would be right on target."
Elements of Data Mining Algorithms
[Figure not reproduced: the knowledge discovery process. Data → Selection → Target Data → Preprocessing → Preprocessed Data → Data Mining → Patterns → Interpretation/Evaluation → Knowledge]
Adapted from: U. Fayyad, et al. (1995), “From Knowledge Discovery to Data Mining: An Overview,” Advances in Knowledge Discovery and Data Mining, U. Fayyad et al. (Eds.), AAAI/MIT Press
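The stages in the process figure can be read as a chain of functions, each consuming the previous stage's output. The sketch below is only an illustration under that reading. The function names, the toy records, and the trivial "pattern" are all assumptions, not part of the lecture.

```python
# Hypothetical raw records (toy data, not from the lecture).
raw = [
    {"id": 1, "age": 45, "disclosures": 6, "region": "NY"},
    {"id": 2, "age": None, "disclosures": 0, "region": "AZ"},
    {"id": 3, "age": 30, "disclosures": 1, "region": "NY"},
    {"id": 4, "age": 52, "disclosures": 7, "region": "NY"},
]

def select(data):
    """Selection: keep only the fields relevant to the task (target data)."""
    return [{"age": r["age"], "disclosures": r["disclosures"]} for r in data]

def preprocess(target):
    """Preprocessing: drop records with missing values."""
    return [r for r in target if None not in r.values()]

def mine(clean):
    """Data mining: summarize disclosures for older vs. younger brokers."""
    older = [r["disclosures"] for r in clean if r["age"] > 40]
    younger = [r["disclosures"] for r in clean if r["age"] <= 40]
    return {
        "older_mean": sum(older) / len(older),
        "younger_mean": sum(younger) / len(younger),
    }

def interpret(pattern):
    """Interpretation/evaluation: turn the pattern into a readable statement."""
    gap = pattern["older_mean"] - pattern["younger_mean"]
    return f"Brokers over 40 average {gap:.1f} more disclosures in this toy sample."

pattern = mine(preprocess(select(raw)))
knowledge = interpret(pattern)
print(knowledge)
```

In practice the process is iterative: what is learned at the interpretation stage often sends the analyst back to selection or preprocessing.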
Overview
• Task specification
• Data representation
• Knowledge representation
• Learning technique
  • Search + scoring
• Inference and/or interpretation
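One way to see how these elements fit together is a deliberately tiny learner. In the sketch below, the data representation is a list of (disclosure count, label) pairs, the knowledge representation is a single threshold rule, the search is an exhaustive scan over thresholds, and the score is training accuracy. The data and every name here are hypothetical choices made for illustration.

```python
# Data representation: (disclosure_count, committed_fraud) pairs. Toy values only.
data = [(0, 0), (1, 0), (2, 0), (6, 1), (7, 1), (3, 0), (9, 1)]

def predict(threshold, count):
    """Knowledge representation: the rule 'fraud if count > threshold'."""
    return 1 if count > threshold else 0

def score(threshold):
    """Scoring: fraction of training examples the rule classifies correctly."""
    correct = sum(1 for count, label in data if predict(threshold, count) == label)
    return correct / len(data)

# Search: try every candidate threshold and keep the best-scoring one.
# max() returns the first candidate among ties.
candidates = range(10)
best_threshold = max(candidates, key=score)

print(best_threshold, score(best_threshold))

# Inference: apply the learned rule to a new, unseen broker.
print(predict(best_threshold, 8))
```

Real algorithms differ mainly in how rich the model space is and in how cleverly they search it, since exhaustive enumeration is rarely feasible.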
Task specification
• Description of the characteristics of the analysis and desired result
• Examples:
  • From a set of labeled examples, devise an understandable model that will accurately predict whether a stockbroker will commit fraud in the near future.
  • From a set of unlabeled examples, cluster stockbrokers into a set of homogeneous groups based on their demographic information.
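The second example task uses no labels at all: the algorithm only groups similar examples. As a rough illustration, the sketch below clusters brokers by a single demographic attribute (age) with a simplified two-cluster k-means. The ages are made up, and the choice of k-means with min/max initialization is an assumption, not something the lecture specifies.

```python
def two_means(values, max_iters=10):
    """Simplified 1-D k-means with k=2, initialized at the min and max values."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(max_iters):
        # Assignment step: put each value in the group of its nearest center.
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group.
        new_centers = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return groups, centers

# Unlabeled examples: hypothetical broker ages.
ages = [25, 27, 30, 52, 55, 60]
groups, centers = two_means(ages)
print(groups)
```

The first example task, by contrast, would need a fraud label for each broker and would be judged on predictive accuracy rather than on group homogeneity.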
Exploratory data analysis
• Goal
  • Interact with data without clear objective
• Techniques
  • Visualization, ad hoc modeling
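A first exploratory pass often amounts to a few summary statistics and a quick plot. The sketch below uses only the Python standard library and prints a text histogram in place of a real visualization. The values (years in industry for ten brokers) are hypothetical.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical values: years in industry for ten brokers.
years = [1, 2, 2, 3, 5, 8, 8, 8, 13, 21]

# Quick summary statistics.
summary = {
    "n": len(years),
    "min": min(years),
    "max": max(years),
    "mean": mean(years),
    "median": median(years),
}
print(summary)

# Text histogram with 5-year bins, keyed by each bin's lower edge.
bin_width = 5
hist = Counter((y // bin_width) * bin_width for y in years)
for low in sorted(hist):
    print(f"{low:>2}-{low + bin_width - 1:<2} {'#' * hist[low]}")
```

Even this crude view shows the skew toward short tenures, the kind of observation that can suggest what to model next.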
Descriptive modeling • Goal