lecture02-classification


Lecture Note 2
Classification – Introduction & Decision Tree
By Gabriel Fung, PhD
CSE 572/CBS 572: Data Mining

Outline
- Introduction
- Classification Process
- Classification Model: Decision Tree
Introduction

Classification
A simple classification problem: I know there are Salmon in this river. When I pick up a fish from this river, can you tell me whether it is a Salmon? Assume that you do not know what a Salmon looks like. How, then, would you solve this problem?
Classification
Since you know nothing about Salmon and Tuna, the first thing you need to do is, of course, LEARNING!
Two types of learning:
1. Passive learning
2. Active learning

Different Kinds of Learning
Passive learning
1. Find an expert.
2. The expert tells you all the characteristics of Salmon.
3. You simply memorize and apply what you have learned.
Active learning
1. Find an expert.
2. The expert catches a lot of fish.
3. The expert only tells you which of them are Salmon, but does not tell you their characteristics.
4. You need to identify those characteristics yourself by observing the features of the fish.
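The contrast between the two kinds of learning can be sketched in Python. This is only an illustration: the fish records, the single `color` feature, and the expert's "pink means Salmon" rule are assumptions made up for the example, not something stated in the lecture.

```python
# Passive learning: the expert hands over the rule; we only apply it.
def expert_rule(fish):
    # Hypothetical rule told to us by the expert (an assumption).
    return fish["color"] == "pink"


# Active learning: the expert only labels the catch; we have to work out
# which feature values go with Salmon ourselves.
def learn_salmon_colors(labelled_catch):
    salmon = {fish["color"] for fish, is_salmon in labelled_catch if is_salmon}
    others = {fish["color"] for fish, is_salmon in labelled_catch if not is_salmon}
    # Keep only colors seen on Salmon and never on other fish.
    return salmon - others


# Made-up labelled catch: (fish, is_salmon) pairs.
catch = [
    ({"color": "pink"}, True),
    ({"color": "green"}, False),
    ({"color": "pink"}, True),
]

learned_colors = learn_salmon_colors(catch)


def learned_rule(fish):
    # Apply what was inferred from the labelled catch.
    return fish["color"] in learned_colors


new_fish = {"color": "pink"}
print(expert_rule(new_fish), learned_rule(new_fish))
```

In the passive case the knowledge sits in the hand-written function; in the active case it is derived from the labelled catch, so a different catch could yield a different rule.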
Classification in Data Mining
In data mining, we are always interested in active learning:
1. You are the expert.
2. You catch a lot of fish.
3. You only tell the computer which of them are Salmon, but do not tell the computer their characteristics.
4. The computer tries to learn everything itself via a model.
We will discuss what a model is shortly.

Always Remember…
From the data mining point of view:
Classification = Prediction = Forecasting
This is because the techniques are the same.
Classification is also known as “Supervised Learning”: there must be an “expert” (you) to “supervise” the computer.
In contrast, Clustering is known as “Unsupervised Learning”. Clustering will be discussed in later lectures.
Classification Process

Terminologies
Recall:
1. You catch a lot of fish.
2. You tell the computer which of them are Salmon.
3. The computer identifies the characteristics of Salmon.
Terminologies:
- Examples – The fish that you have caught.
- Class – Salmon and Not Salmon.
- Positive examples – Fish that belong to the class Salmon.
- Negative examples – Fish that do not belong to the class Salmon.
- Model – What the computer has learned. The accuracy of the model depends on the learning algorithm.
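These terms map onto simple data structures, as the following Python sketch suggests. The three rows reuse the values visible in the lecture's training-data table (treating row "N" as a third row is an assumption); the field names, the hand-written `model` function, and its 25 cm threshold are hypothetical stand-ins, since the lecture has not yet said what a model looks like.

```python
# Examples: the fish that were caught, each with its class label.
examples = [
    {"color": "Pink", "size_cm": 20, "label": "Salmon"},
    {"color": "Green", "size_cm": 30, "label": "Not Salmon"},
    {"color": "Pink", "size_cm": 18, "label": "Salmon"},
]

# Positive examples belong to the class Salmon; negative examples do not.
positives = [e for e in examples if e["label"] == "Salmon"]
negatives = [e for e in examples if e["label"] != "Salmon"]


# Model: a hypothetical stand-in for "what the computer has learned".
# The 25 cm threshold is an assumption chosen for illustration.
def model(fish):
    return "Salmon" if fish["size_cm"] <= 25 else "Not Salmon"


def accuracy(model, examples):
    # Fraction of examples whose predicted class matches the given label.
    correct = sum(model(e) == e["label"] for e in examples)
    return correct / len(examples)


print(len(positives), len(negatives), accuracy(model, examples))
```

Accuracy is measured here on the same examples the threshold was chosen for, which only illustrates the definition; it says nothing about how the model would do on new fish.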
Learning and Operation
Learning:
1. Archive training data
2. Choose a learning algorithm
Result: Model

Training data:
ID   Color   Size   Label
1    Pink    20cm   Salmon
2    Green   30cm   Not Salmon
:    :       :      :
N    Pink    18cm   Salmon

Operation:
A new fish -> Model -> Yes (Salmon) / No (Not a Salmon)
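The two phases can be sketched as a pair of functions. The learning algorithm below is a deliberately simple stand-in, a lookup on a single feature (a one-level decision rule), and is not necessarily the algorithm the lecture goes on to present. The rows are the three visible in the training data above (row "N" is used as a third row); the function and field names are assumptions.

```python
from collections import Counter

# Step 1: archived training data (sizes in cm).
training_data = [
    {"color": "Pink", "size_cm": 20, "label": "Salmon"},
    {"color": "Green", "size_cm": 30, "label": "Not Salmon"},
    {"color": "Pink", "size_cm": 18, "label": "Salmon"},
]


def learn(rows, feature):
    """Learning: map each value of `feature` to its most common label."""
    labels_by_value = {}
    for row in rows:
        labels_by_value.setdefault(row[feature], Counter())[row["label"]] += 1
    table = {
        value: counts.most_common(1)[0][0]
        for value, counts in labels_by_value.items()
    }
    # Fall back to the overall majority label for unseen feature values.
    default = Counter(row["label"] for row in rows).most_common(1)[0][0]
    return {"feature": feature, "table": table, "default": default}


def classify(model, fish):
    """Operation: look up the new fish's feature value in the learned table."""
    return model["table"].get(fish[model["feature"]], model["default"])


# Step 2: the chosen learning algorithm produces the model.
model = learn(training_data, "color")

# Operation: classify new fish with the learned model.
print(classify(model, {"color": "Pink", "size_cm": 22}))
print(classify(model, {"color": "Green", "size_cm": 25}))
```

Learning runs once over the archived data; operation only consults the resulting model, so new fish can be classified without revisiting the training rows.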

Binary-Class vs. Multi-Class
Binary-Class Classification
- Only two classes exist.

This note was uploaded on 04/08/2010 for the course CS 420 taught by Professor Dawsonengler during the Spring '02 term at San Jose State University.
