LT-2 - CHAPTER NO.2 Topics to be covered in this chapter...

CHAPTER NO.2

Topics to be covered in this chapter
- Data vs. information
- Data mining and machine learning
- Structural descriptions
  - Rules: classification and association
  - Decision trees
- Datasets
  - Weather, contact lens, CPU performance, labour negotiation data, soybean classification
- Fielded applications
  - Loan applications, screening images, market basket analysis
- Generalization as search
- Data mining and ethics
Data vs. information
- Society produces huge amounts of data
  - Sources: business, science, medicine, economics, geography, environment, sports, …
- Potentially valuable resource
- Raw data is useless: need techniques to automatically extract information from it
  - Data: recorded facts
  - Information: patterns underlying the data

Data mining
- Extracting previously unknown, potentially useful information from data
- Needed: programs that detect patterns and regularities in the data
- Strong patterns ⇒ good predictions
  - Problem 1: most patterns are not interesting
  - Problem 2: patterns may be inexact (or spurious)
  - Problem 3: data may be garbled or missing
The weather problem

Outlook   Temp  Humidity  Windy  Class
Overcast  72    90        TRUE   Play
Overcast  83    78        FALSE  Play
Overcast  64    65        TRUE   Play
Overcast  81    75        FALSE  Play
Rain      71    80        TRUE   Don’t play
Rain      65    70        TRUE   Don’t play
Rain      75    80        FALSE  Play
Rain      68    80        FALSE  Play
Rain      70    96        FALSE  Play
Sunny     75    70        TRUE   Play
Sunny     80    90        TRUE   Don’t play
Sunny     85    85        FALSE  Don’t play
Sunny     72    95        FALSE  Don’t play
Sunny     69    70        FALSE  Play

Ross Quinlan
- Machine learning researcher from the 1970s
- University of Sydney, Australia
- 1986: “Induction of decision trees”, Machine Learning journal
- 1993: C4.5: Programs for Machine Learning, published by Morgan Kaufmann
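
As a rough illustration of what a structural description of the weather table could look like, the Python sketch below encodes the 14 rows and a small hand-written decision tree. The tree, its humidity cut-off of 75, and the names WEATHER, classify and accuracy are assumptions made for this example; they are not taken from the slides.

```python
# The 14 rows of the weather table: (outlook, temp, humidity, windy, class).
WEATHER = [
    ("overcast", 72, 90, True,  "Play"),
    ("overcast", 83, 78, False, "Play"),
    ("overcast", 64, 65, True,  "Play"),
    ("overcast", 81, 75, False, "Play"),
    ("rain",     71, 80, True,  "Don't play"),
    ("rain",     65, 70, True,  "Don't play"),
    ("rain",     75, 80, False, "Play"),
    ("rain",     68, 80, False, "Play"),
    ("rain",     70, 96, False, "Play"),
    ("sunny",    75, 70, True,  "Play"),
    ("sunny",    80, 90, True,  "Don't play"),
    ("sunny",    85, 85, False, "Don't play"),
    ("sunny",    72, 95, False, "Don't play"),
    ("sunny",    69, 70, False, "Play"),
]


def classify(outlook, humidity, windy):
    """Hand-written decision tree (an illustrative assumption).

    Temperature is not used by this particular tree.
    """
    if outlook == "overcast":
        return "Play"
    if outlook == "sunny":
        # Assumed cut-off: humidity of 75 or less counts as acceptable.
        return "Play" if humidity <= 75 else "Don't play"
    # Remaining outlook value in the table: rain.
    return "Don't play" if windy else "Play"


def accuracy(rows):
    """Fraction of rows whose class label the tree reproduces."""
    correct = sum(
        classify(outlook, humidity, windy) == label
        for outlook, _temp, humidity, windy, label in rows
    )
    return correct / len(rows)


if __name__ == "__main__":
    print(f"Accuracy on the table: {accuracy(WEATHER):.2f}")
```

The tree is evaluated only on the rows it was written for, so its accuracy here says nothing about how well it would predict unseen days.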
Classification vs. association rules
- Classification rule: predicts value of a given attribute (the classification of an example)
- Association rule: predicts value of arbitrary attribute (or combination)

  If outlook = sunny and humidity = high then play = no
  If temperature = cool then humidity = normal
  If humidity = normal and windy = false then play = yes
  If outlook = sunny and play = no then humidity = high
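
The difference between the two rule types can be sketched in Python: both are an "if" part and a "then" part, and only the attribute being predicted differs. The example below counts how many weather rows each rule covers and how many of those it gets right. The rules are written with nominal humidity while the weather table records numbers, so the sketch assumes that humidity above 75 means "high"; that cut-off and the names ROWS, to_record and evaluate are hypothetical choices for this illustration. The temperature rule is left out because it would need a second assumed cut-off.

```python
HIGH_HUMIDITY = 75  # assumed cut-off: humidity above this counts as "high"

# Weather table rows reduced to (outlook, humidity, windy, play).
ROWS = [
    ("overcast", 90, True,  "yes"), ("overcast", 78, False, "yes"),
    ("overcast", 65, True,  "yes"), ("overcast", 75, False, "yes"),
    ("rain",     80, True,  "no"),  ("rain",     70, True,  "no"),
    ("rain",     80, False, "yes"), ("rain",     80, False, "yes"),
    ("rain",     96, False, "yes"), ("sunny",    70, True,  "yes"),
    ("sunny",    90, True,  "no"),  ("sunny",    85, False, "no"),
    ("sunny",    95, False, "no"),  ("sunny",    70, False, "yes"),
]


def to_record(outlook, humidity, windy, play):
    """Turn one row into a dict with nominal humidity (assumed cut-off)."""
    return {
        "outlook": outlook,
        "humidity": "high" if humidity > HIGH_HUMIDITY else "normal",
        "windy": windy,
        "play": play,
    }


RECORDS = [to_record(*row) for row in ROWS]


def evaluate(rule_if, rule_then, records):
    """Return (covered, correct) for the rule 'if rule_if then rule_then'.

    covered: records matching every condition in rule_if.
    correct: covered records that also match everything in rule_then.
    """
    covered = [r for r in records
               if all(r[k] == v for k, v in rule_if.items())]
    correct = [r for r in covered
               if all(r[k] == v for k, v in rule_then.items())]
    return len(covered), len(correct)


if __name__ == "__main__":
    # Classification rule: the "then" part is the class attribute, play.
    print(evaluate({"outlook": "sunny", "humidity": "high"},
                   {"play": "no"}, RECORDS))
    # Association rule: the "then" part is an arbitrary attribute, humidity.
    print(evaluate({"outlook": "sunny", "play": "no"},
                   {"humidity": "high"}, RECORDS))
```

Because the same evaluate function handles both cases, the sketch shows that a classification rule is the special case in which the predicted attribute is fixed in advance to be the class.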