Classifier_KNN_DT - Classification: Supervised learning,...

4/14/11
Classification: Supervised learning, and Model Evaluation
Classifier KNN, DT
Feb 2011
Tommy W. S. Chow

K Nearest Neighbors
- K Nearest Neighbors (KNN)
- Advantage
  - Nonparametric architecture
  - Simple
  - Powerful
  - Requires no training time
- Disadvantage
  - Memory intensive
  - Classification/estimation is slow

K-NN classifier schematic
For a test instance:
1) Calculate distances from the training points.
2) Find the K nearest neighbours (say, K = 3).
3) Assign the class label based on the majority.
Classifying whether the “blue” point belongs to the class of “green” or “red”.
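
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `knn_predict`, the toy points, and the choice of Euclidean distance are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs; points are tuples of numbers.
    """
    # 1) Distance from the query to every training point (Euclidean, assumed).
    dists = [(math.dist(point, query), label) for point, label in train]
    # 2) Keep the k nearest neighbours.
    nearest = sorted(dists, key=lambda d: d[0])[:k]
    # 3) Majority vote; if counts tie, the label encountered first wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two red points and two green points.
train = [((1.0, 1.0), "red"), ((1.2, 0.8), "red"),
         ((1.6, 1.6), "green"), ((3.0, 3.0), "green")]
print(knn_predict(train, (1.5, 1.5), k=3))
```

In this toy data the single closest point to the query is green, but two of the three nearest neighbours are red, so with K = 3 the query is labelled red.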

K-NN classifier schematic
The red points are one class. The green points are another class. The points with black circles are the three nearest neighbours of the grey point. Because two of the three nearest neighbours are red, the grey point is classified as red.

KNN
- Data: numerical data, ordinal data (non-numerical but ordered, so it has a distance in some sense), and categorical data (non-numerical with no distance in any sense, e.g. colour: red, black; shape: round, square)
- How to determine distances between values of categorical attributes?
- Alternatives:
  - Use a Boolean measure (1 if the same, 0 if different); when it is used as a distance, invert it so that identical values are 0 apart
  - Introduce differential grading (e.g. weather: ‘drizzling’ and ‘rainy’ are closer than ‘rainy’ and ‘sunny’)
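
The two alternatives for non-numerical attributes might look like the sketch below. The Boolean measure is written here as a distance, so identical values are 0 apart and different values are 1 apart. The grade values in `WEATHER_GRADE` are hypothetical numbers chosen only so that 'drizzling' and 'rainy' end up closer than 'rainy' and 'sunny'; they are not taken from the lecture.

```python
def boolean_distance(a, b):
    """Mismatch distance: 0 if the two values are the same, 1 otherwise."""
    return 0 if a == b else 1

# Hypothetical grades placing the weather values on a scale;
# the numbers are illustrative only.
WEATHER_GRADE = {"sunny": 0, "drizzling": 2, "rainy": 3}

def graded_distance(a, b, grade=WEATHER_GRADE):
    """Distance between two graded values: the gap between their grades."""
    return abs(grade[a] - grade[b])

print(boolean_distance("red", "black"))       # 1
print(graded_distance("drizzling", "rainy"))  # 1
print(graded_distance("rainy", "sunny"))      # 3
```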

How to determine the value of “K”
- This is another practical issue that cannot be resolved easily.
- But we can determine K experimentally: use the K that gives the minimum error on a test set.
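
Choosing K experimentally can be sketched as a small search over candidate values. The helper names (`error_rate`, `choose_k`), the candidate list, and the toy data are assumptions for illustration. In practice the held-out set used for this search is often a separate validation set or a cross-validation fold, so that the final test set stays untouched.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    # Majority label among the k training points closest to `query`.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def error_rate(train, held_out, k):
    # Fraction of held-out points whose predicted label is wrong.
    wrong = sum(knn_predict(train, x, k) != y for x, y in held_out)
    return wrong / len(held_out)

def choose_k(train, held_out, candidates=(1, 3, 5, 7)):
    # Candidate K with the lowest held-out error;
    # min() keeps the first (smallest) candidate when errors tie.
    return min(candidates, key=lambda k: error_rate(train, held_out, k))

# Hypothetical data: two clusters, plus one mislabelled ("noisy") point
# sitting inside the first cluster.
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"), ((1, 1), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B"), ((6, 6), "B"),
         ((0.5, 0.5), "B")]
held_out = [((0.5, 0.6), "A"), ((5.5, 5.5), "B")]
print(choose_k(train, held_out))
```

With K = 1 the noisy point misleads the classifier on the first held-out point, while K = 3 outvotes it, so the search settles on 3 for this toy data.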

How good is KNN?
- Normally works well for simple, clean data sets
- But suffers from noisy data
- Computationally demanding, as it needs to calculate and compare many distances
- First use around 60% of the data set as the training set
- Verify the KNN using the remaining 40% as the test set
- If the accuracy is acceptable, use it in the real application
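
The 60%/40% procedure above could be sketched as follows. The helper names (`split_60_40`, `accuracy`), the fixed shuffle seed, and the synthetic data are assumptions for illustration only.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k=3):
    # Majority label among the k training points closest to `query`.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def split_60_40(data, seed=0):
    # Shuffle a copy, then cut it 60% / 40% (seed fixed for repeatability).
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 6 // 10
    return shuffled[:cut], shuffled[cut:]

def accuracy(train, test, k=3):
    # Fraction of test points whose predicted label matches the true label.
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

# Hypothetical, well-separated data: 10 points per class.
data = [((i * 0.1, 0.0), "A") for i in range(10)] + \
       [((10 + i * 0.1, 10.0), "B") for i in range(10)]
train, test = split_60_40(data)
print(len(train), len(test), accuracy(train, test))
```

Because the two synthetic classes are far apart, the accuracy here is perfect; real data would normally give a lower figure, and that figure is what decides whether the classifier is good enough to deploy.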

K Nearest Neighbors
- The key issues involved in training this model include
  - setting the variable K
    - Validation techniques (e.g. cross validation)
  - the type of distance metric
    - Euclidean distance measure: measure the distance between the test point Y and all other data points of different classes, X.
- Find the K shortest distances (e.g. K = 3, 5 or 7)
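
For reference, the Euclidean distance named above has the standard form below. The notation is an assumption for illustration: n is the number of attributes, and y_j and x_j are the j-th attributes of the test point Y and of a training point X.

```latex
d(Y, X) = \sqrt{\sum_{j=1}^{n} (y_j - x_j)^2}
% with two attributes (n = 2):
d(Y, X) = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2}
```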

Training Set
No  Attrib     Class label  No  Attrib     Class label  No  Attrib     Class label
1   (2.3,1.6)  1            11  (0.7,4.8)  2            21  (4.4,4.2)  3
2   (2.1,1.2)  1            12  (0.8,4.2)  2            22  (4.9,4.2)  3
3   (2.3,1.3)  1            13  (0.2,4.7)  2            23  (4.5,4.6)  3
4   (2.2,1.2)  1            14  (0.2,4.8)  2            24  (4.1,4.0)  3
5   (2.7,1.0)  1            15  (0.3,4.4)  2            25  (4.7,4.3)  3
6   (2.1,1.4)  1            16  (0.7,4.5)  2            26  (4.5,4.4)  3
7   (2.0,1.6)  1            17  (1.0,4.6)  2            27  (4.6,4.5)  3
8   (2.4,1.1)  1            18  (0.6,4.6)  2            28  (4.7,4.1)  3
9   (2.7,1.6)  1            19  (0.5,4.3)  2            29  (4.1,4.4)  3
10  (2.6,1.9)  1            20  (0.8,4.3)  2            30  (4.5,4.8)  3

Test Set
Test data  Attrib     Test data  Attrib     Test data  Attrib
31         (2.4,3.6)  36         (3.7,3.8)  41         (2.9,3.2)
32         (2.8,2.9)  37         (3.8,3.5)  42         (2.6,3.2)
33         (2.3,3.0)
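
As a worked illustration, the visible test points can be run against the 30 training points. This sketch assumes K = 3 and Euclidean distance, as suggested earlier in the slides; it is not the lecture's own solution, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter

# Training set from the slides: (attributes, class label).
TRAIN = [
    ((2.3, 1.6), 1), ((2.1, 1.2), 1), ((2.3, 1.3), 1), ((2.2, 1.2), 1),
    ((2.7, 1.0), 1), ((2.1, 1.4), 1), ((2.0, 1.6), 1), ((2.4, 1.1), 1),
    ((2.7, 1.6), 1), ((2.6, 1.9), 1),
    ((0.7, 4.8), 2), ((0.8, 4.2), 2), ((0.2, 4.7), 2), ((0.2, 4.8), 2),
    ((0.3, 4.4), 2), ((0.7, 4.5), 2), ((1.0, 4.6), 2), ((0.6, 4.6), 2),
    ((0.5, 4.3), 2), ((0.8, 4.3), 2),
    ((4.4, 4.2), 3), ((4.9, 4.2), 3), ((4.5, 4.6), 3), ((4.1, 4.0), 3),
    ((4.7, 4.3), 3), ((4.5, 4.4), 3), ((4.6, 4.5), 3), ((4.7, 4.1), 3),
    ((4.1, 4.4), 3), ((4.5, 4.8), 3),
]

# Test points visible in the slides, keyed by their number in the test table.
TEST = {31: (2.4, 3.6), 32: (2.8, 2.9), 33: (2.3, 3.0),
        36: (3.7, 3.8), 37: (3.8, 3.5), 41: (2.9, 3.2), 42: (2.6, 3.2)}

def knn_predict(train, query, k=3):
    # Majority label among the k training points closest to `query`.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

for number, point in sorted(TEST.items()):
    print(number, point, "->", knn_predict(TRAIN, point, k=3))
```

Points 36 and 37 lie close to the class 3 cluster, so their three nearest neighbours all carry label 3. Point 31, by contrast, is roughly equidistant from all three clusters, so its label is much more sensitive to the choice of K.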

This note was uploaded on 04/14/2011 for the course EE 4146 taught by Professor Tommy Chow during the Spring '11 term at City University of Hong Kong.
