Classifier_KNN_DT

# Classifier_KNN_DT - Classification: Supervised learning,...


4/14/11

Classification: Supervised Learning and Model Evaluation

Classifier KNN, DT

Feb 2011, Tommy W. S. Chow

K Nearest Neighbors (KNN)

- Advantages
  - Nonparametric architecture
  - Simple
  - Powerful
  - Requires no training time
- Disadvantages
  - Memory intensive
  - Classification/estimation is slow

K-NN classifier schematic

For a test instance:

1. Calculate its distances from the training points.
2. Find the K nearest neighbours (say, K = 3).
3. Assign the class label held by the majority of those neighbours.

Figure: classifying whether the "blue" point belongs to the "green" class or the "red" class.
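
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function name `knn_classify`, the toy points, and the use of Euclidean distance via `math.dist` (Python 3.8+) are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs, where each point is a tuple.
    """
    # 1) Distance from the query to every training point.
    distances = [(math.dist(point, query), label) for point, label in train]
    # 2) Keep the k nearest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # 3) Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two "red" points and two "green" points.
train = [((1.0, 1.0), "red"), ((1.2, 0.8), "red"),
         ((4.0, 4.0), "green"), ((4.2, 3.9), "green")]

print(knn_classify(train, (1.1, 1.0), k=3))
```

An odd K is commonly chosen for two-class problems so that the vote cannot tie; with more classes a tie is still possible, and this sketch then simply returns the tied label that appears first among the nearest neighbours.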


K-NN classifier schematic

- The red points are one class; the green points are another class.
- The points marked with black circles are the three nearest neighbours of the grey point.
- Because two of the three nearest neighbours are red, the grey point is classified as red.

KNN

- Data: numerical data, ordinal data (non-numerical, but with a distance in some sense), and categorical data (non-numerical, with no distance in any sense, e.g., color: red, black; shape: round, square).
- How can distances between values of categorical attributes be determined?
- Alternatives:
  - Use a Boolean distance (0 if the values are the same, 1 if they differ).
  - Introduce differential grading (e.g., for weather, 'drizzling' and 'rainy' are closer than 'rainy' and 'sunny').
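
The two alternatives can be illustrated as follows. This sketch treats a match as distance 0 and a mismatch as distance 1, so that identical values are the closest. The grading values in `WEATHER_DISTANCE` are hypothetical numbers chosen only to respect the ordering in the weather example; in practice they would come from domain knowledge.

```python
def boolean_distance(a, b):
    """Distance 0 if the two categorical values match, 1 otherwise."""
    return 0 if a == b else 1

# Hypothetical grades: smaller numbers mean the two values are more alike.
WEATHER_DISTANCE = {
    frozenset(["drizzling", "rainy"]): 0.3,
    frozenset(["drizzling", "sunny"]): 0.8,
    frozenset(["rainy", "sunny"]): 1.0,
}

def graded_distance(a, b, table=WEATHER_DISTANCE):
    """Look up a hand-assigned distance between two categorical values."""
    if a == b:
        return 0.0
    # frozenset makes the lookup symmetric: (a, b) and (b, a) share one entry.
    return table[frozenset([a, b])]

print(boolean_distance("red", "black"))
print(graded_distance("rainy", "drizzling"), graded_distance("rainy", "sunny"))
```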


How to determine the value of "K"

- This is another practical issue that cannot be resolved easily.
- But K can be determined experimentally: use the K that gives the minimum error on a test set.

How good is KNN?

- Normally works well on simple, clean data sets
- But suffers from noisy data
- Computationally demanding, as it needs to calculate and compare many distances
- First, use around 60% of the data set as the training set
- Verify the KNN using the remaining 40% as the test set
- If the accuracy is acceptable, use it in a real application
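
The 60/40 evaluation and the experimental choice of K can be sketched together. Everything specific here is an assumption for illustration: the synthetic three-cluster data from `make_toy_data`, the candidate values 1, 3, 5, 7, and the helper names. The error rate is the fraction of held-out points whose predicted label differs from the true label.

```python
import math
import random
from collections import Counter

def knn_classify(train, query, k):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def error_rate(train, test, k):
    """Fraction of test points that the classifier labels incorrectly."""
    wrong = sum(1 for point, label in test
                if knn_classify(train, point, k) != label)
    return wrong / len(test)

def make_toy_data(n_per_class=50, seed=0):
    """Hypothetical data: three Gaussian clusters, one per class."""
    rng = random.Random(seed)
    centres = {1: (2.0, 1.5), 2: (0.5, 4.5), 3: (4.5, 4.5)}
    data = []
    for label, (cx, cy) in centres.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)), label))
    rng.shuffle(data)  # shuffle so the split mixes all classes
    return data

data = make_toy_data()
split = len(data) * 6 // 10          # first 60% for training
train, test = data[:split], data[split:]

# Try several K values and keep the one with the lowest held-out error.
errors = {k: error_rate(train, test, k) for k in (1, 3, 5, 7)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

Note that an error rate used to pick K tends to look slightly better than the error on genuinely unseen data, which is one reason a separate validation split or cross-validation is often preferred for the selection step.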


K Nearest Neighbors

- The key issues involved in training this model include setting:
  - the variable K
    - Validation techniques (e.g., cross-validation)
  - the type of distance metric
    - Euclidean distance measure: measure the distance between the test point Y and all other data points of different classes, X.
- Find the K shortest distances, e.g., K = 3, 5, or 7.
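
For reference, the Euclidean distance between a test point Y and a data point X is usually written as below. The component notation $x_i$, $y_i$ and the number of attributes $n$ are introduced here for illustration and do not appear on the slide.

$$
d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
$$

For data with two attributes this reduces to $d(X, Y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$.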

Training Set

| Train data No | Attrib | Class label | Train data No | Attrib | Class label | Train data No | Attrib | Class label |
|---|---|---|---|---|---|---|---|---|
| 1 | (2.3, 1.6) | 1 | 11 | (0.7, 4.8) | 2 | 21 | (4.4, 4.2) | 3 |
| 2 | (2.1, 1.2) | 1 | 12 | (0.8, 4.2) | 2 | 22 | (4.9, 4.2) | 3 |
| 3 | (2.3, 1.3) | 1 | 13 | (0.2, 4.7) | 2 | 23 | (4.5, 4.6) | 3 |
| 4 | (2.2, 1.2) | 1 | 14 | (0.2, 4.8) | 2 | 24 | (4.1, 4.0) | 3 |
| 5 | (2.7, 1.0) | 1 | 15 | (0.3, 4.4) | 2 | 25 | (4.7, 4.3) | 3 |
| 6 | (2.1, 1.4) | 1 | 16 | (0.7, 4.5) | 2 | 26 | (4.5, 4.4) | 3 |
| 7 | (2.0, 1.6) | 1 | 17 | (1.0, 4.6) | 2 | 27 | (4.6, 4.5) | 3 |
| 8 | (2.4, 1.1) | 1 | 18 | (0.6, 4.6) | 2 | 28 | (4.7, 4.1) | 3 |
| 9 | (2.7, 1.6) | 1 | 19 | (0.5, 4.3) | 2 | 29 | (4.1, 4.4) | 3 |
| 10 | (2.6, 1.9) | 1 | 20 | (0.8, 4.3) | 2 | 30 | (4.5, 4.8) | 3 |


Test Set

| Test data | Attrib | Test data | Attrib | Test data | Attrib |
|---|---|---|---|---|---|
| 31 | (2.4, 3.6) | 36 | (3.7, 3.8) | 41 | (2.9, 3.2) |
| 32 | (2.8, 2.9) | 37 | (3.8, 3.5) | 42 | (2.6, 3.2) |
| 33 | (2.3, 3.0) | | | | |
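
As a hedged worked example, the sketch below applies KNN to the training set above and to the test points that are visible here (the rest of the test set is not available). The choice K = 3, the Euclidean metric, and the names `TRAIN`, `TEST`, and `knn_classify` are assumptions for illustration, not necessarily what the course uses.

```python
import math
from collections import Counter

# Training set from the table above: (attributes, class label).
TRAIN = [
    ((2.3, 1.6), 1), ((2.1, 1.2), 1), ((2.3, 1.3), 1), ((2.2, 1.2), 1),
    ((2.7, 1.0), 1), ((2.1, 1.4), 1), ((2.0, 1.6), 1), ((2.4, 1.1), 1),
    ((2.7, 1.6), 1), ((2.6, 1.9), 1),
    ((0.7, 4.8), 2), ((0.8, 4.2), 2), ((0.2, 4.7), 2), ((0.2, 4.8), 2),
    ((0.3, 4.4), 2), ((0.7, 4.5), 2), ((1.0, 4.6), 2), ((0.6, 4.6), 2),
    ((0.5, 4.3), 2), ((0.8, 4.3), 2),
    ((4.4, 4.2), 3), ((4.9, 4.2), 3), ((4.5, 4.6), 3), ((4.1, 4.0), 3),
    ((4.7, 4.3), 3), ((4.5, 4.4), 3), ((4.6, 4.5), 3), ((4.7, 4.1), 3),
    ((4.1, 4.4), 3), ((4.5, 4.8), 3),
]

# Only the test points that are listed above, keyed by their number.
TEST = {
    31: (2.4, 3.6), 32: (2.8, 2.9), 33: (2.3, 3.0),
    36: (3.7, 3.8), 37: (3.8, 3.5),
    41: (2.9, 3.2), 42: (2.6, 3.2),
}

def knn_classify(train, query, k=3):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

predictions = {no: knn_classify(TRAIN, point) for no, point in TEST.items()}
for no, label in predictions.items():
    print(no, label)
```

Several of these test points sit between the class clusters, so their predicted labels can change with K; rerunning with `k=5` or `k=7` is a quick way to see how sensitive the result is.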

## This note was uploaded on 04/14/2011 for the course EE 4146 taught by Professor Tommy Chow during the Spring '11 term at City University of Hong Kong.
