# lecture02-classification - CSE 572/CBS 572 Data Mining...

Lecture Note 2: Classification – Introduction & Decision Tree
By Gabriel Fung, PhD
CSE 572/CBS 572: Data Mining

## Outline

- Introduction
- Classification Process
- Classification Model: Decision Tree

## Introduction

### Classification

A simple classification problem:

- I know there are salmon in this river.
- When I pick up a fish from this river, can you tell me whether it is a salmon?
- Assume that you do not know what a salmon looks like.

How, then, can you solve this problem?

### Classification

Since you know nothing about salmon or tuna, the first thing you need to do is, of course, LEARNING!

Two types of learning:

1. Passive learning
2. Active learning

### Different Kinds of Learning

Passive learning:

1. Find an expert.
2. The expert tells you all the characteristics of salmon.
3. You simply memorize and apply what you have learned.

Active learning:

1. Find an expert.
2. The expert catches a lot of fish.
3. The expert tells you only which of them are salmon, not what their characteristics are.
4. You need to identify those characteristics yourself by observing the features of the fish.

### Classification in Data Mining

In data mining, we are always interested in active learning:

1. You are the expert.
2. You catch a lot of fish.
3. You tell the computer only which of them are salmon, not what their characteristics are.
4. The computer tries to learn everything itself via a model.

We will discuss what a model is shortly.

### Always Remember…

From the data mining point of view, Classification = Prediction = Forecasting, because the techniques are the same.

- Classification is also known as “supervised learning”: there must be an “expert” (you) to “supervise” the computer.
- In contrast, clustering is known as “unsupervised learning”. Clustering will be discussed in later lectures.

## Classification Process

### Terminology

Recall:

1. You catch a lot of fish.
2. You tell the computer which of them are salmon.
3. The computer identifies the characteristics of salmon.

Terms:

- Examples: the fish that you have caught.
- Class: Salmon and Not Salmon.
- Positive examples: fish that belong to the class Salmon.
- Negative examples: fish that do not belong to the class Salmon.
- Model: what the computer has learned. The accuracy of the model depends on the learning algorithm.
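
To make these terms concrete, here is a minimal Python sketch. The `Fish` record, its `color` and `size_cm` fields, and the three sample rows are illustrative assumptions chosen for this example; they do not come from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fish:
    """One example: a caught fish plus the label the expert assigned."""
    color: str
    size_cm: float
    label: str  # the class: "Salmon" or "Not Salmon"


# Examples: the fish that were caught (illustrative values).
examples = [
    Fish("Pink", 20, "Salmon"),
    Fish("Green", 30, "Not Salmon"),
    Fish("Pink", 18, "Salmon"),
]

# Positive examples belong to the class Salmon; negative examples do not.
positives = [f for f in examples if f.label == "Salmon"]
negatives = [f for f in examples if f.label != "Salmon"]

print(len(positives), "positive,", len(negatives), "negative")
```

The model itself is missing here on purpose: it is whatever a learning algorithm produces from `examples`.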

### Learning and Operation

Learning:

1. Archive training data (see the table).
2. Choose a learning algorithm, which produces the model.

| ID | Color | Size | Label |
|----|-------|------|------------|
| 1 | Pink | 20cm | Salmon |
| 2 | Green | 30cm | Not Salmon |
| … | … | … | … |
| N | Pink | 18cm | Salmon |

Operation:

A new fish → Model → Yes (Salmon) or No (Not a Salmon)
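
The two phases can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the decision tree method named in the lecture title: the "learning algorithm" here simply remembers the most common label for each color, and the function names `learn` and `classify` are hypothetical.

```python
from collections import Counter, defaultdict

# Training data: (color, size_cm, label) rows taken from the slide's table.
training_data = [
    ("Pink", 20, "Salmon"),
    ("Green", 30, "Not Salmon"),
    ("Pink", 18, "Salmon"),
]


def learn(rows):
    """Learning: build a model mapping each color to its most common label."""
    by_color = defaultdict(Counter)
    overall = Counter()
    for color, _size, label in rows:
        by_color[color][label] += 1
        overall[label] += 1
    rules = {c: counts.most_common(1)[0][0] for c, counts in by_color.items()}
    default = overall.most_common(1)[0][0]  # fallback for colors never seen
    return {"rules": rules, "default": default}


def classify(model, color):
    """Operation: apply the learned model to a new fish."""
    return model["rules"].get(color, model["default"])


model = learn(training_data)
print(classify(model, "Pink"))
print(classify(model, "Green"))
```

Learning happens once, on the archived training data; operation then reuses the resulting model for every new fish. This sketch ignores size entirely, which is one reason the choice of learning algorithm affects the accuracy of the model.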

### Binary-Class vs. Multi-Class

Binary-class classification: only two classes exist.