Classifier_KNN_DT

# Classifier_KNN_DT - Classification Supervised learning and...


4/14/11 Classification: Supervised learning, and Model Evaluation Classifier KNN, DT Feb 2011 Tommy W. S. Chow

K Nearest Neighbors (KNN)

• Advantages
  • Nonparametric architecture
  • Simple
  • Powerful
  • Requires no training time
• Disadvantages
  • Memory intensive
  • Classification/estimation is slow
K-NN classifier schematic

For a test instance:

1) Calculate its distance from every training point.
2) Find the K nearest neighbours (say, K = 3).
3) Assign the class label held by the majority of those neighbours.

Figure: classifying whether the “blue” point belongs to the “green” class or the “red” class.

K-NN classifier schematic

The red points belong to one class and the green points to another. The points marked with a black circle are the three nearest neighbours of the grey point. Because two of those three neighbours are red, the grey point is assigned to the red class.
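The three steps of the schematic can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's implementation: the function name `knn_classify`, the Euclidean metric, and the toy points are all assumptions chosen so that two of the three nearest neighbours are red, as in the figure described above.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # 1) Distance from the query to every training point (Euclidean, by assumption).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # 2) Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # 3) Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two red points and one green point lie near the query.
points = [(1.0, 1.0), (1.2, 0.8), (3.0, 3.0), (3.2, 2.9), (0.9, 1.4)]
labels = ["red", "red", "green", "green", "green"]

print(knn_classify(points, labels, query=(1.0, 1.1), k=3))  # prints "red"
```

With K = 5 the same query would be voted green (three green points against two red), which shows how strongly the result can depend on K.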
KNN

• Data: numerical data, ordinal data (non-numerical, but ordered, so values have a distance in some sense), and categorical data (non-numerical, with no inherent distance, e.g. colour: red, black; shape: round, square).
• How do we determine distances between values of categorical attributes?
• Alternatives:
  • Use a Boolean distance (0 if the values are the same, 1 if they differ).
  • Introduce differential grading (e.g. for weather, ‘drizzling’ and ‘rainy’ are closer than ‘rainy’ and ‘sunny’).
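Both alternatives can be written as small distance functions. The sketch below is only an illustration: it uses the convention that identical values are at distance 0, and the grading table `WEATHER_GRADE` (including the value "cloudy" and all the numeric grades) is a hypothetical assumption, not something given in the slides.

```python
def boolean_distance(a, b):
    """Distance 0 when the two categorical values match, 1 otherwise."""
    return 0 if a == b else 1

# Hypothetical grading: position on a dry-to-wet scale (assumed values).
WEATHER_GRADE = {"sunny": 0, "cloudy": 1, "drizzling": 2, "rainy": 3}

def graded_distance(a, b, grades=WEATHER_GRADE):
    """Difference of grades, scaled to [0, 1] by the widest possible gap."""
    span = max(grades.values()) - min(grades.values())
    return abs(grades[a] - grades[b]) / span

print(boolean_distance("red", "black"))        # 1
print(graded_distance("drizzling", "rainy"))   # about 0.33
print(graded_distance("rainy", "sunny"))       # 1.0
```

Scaling the graded distance to [0, 1] keeps it comparable with the Boolean distance when both kinds of attribute appear in the same data set.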

How to determine the value of “K”

• This is another practical issue that cannot be resolved easily.
• But we can determine K experimentally: use the K that gives the minimum error on a test set.
How good is KNN?

• Normally works well on simple, clean data sets.
• But suffers on noisy data.
• Computationally demanding, as it must calculate and compare a large number of distances.
• First use roughly 60% of the data set as the training set.
• Verify the KNN classifier on the remaining 40%, used as the test set.
• If the accuracy is acceptable, use it in the real application.
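The 60/40 procedure, combined with picking K by minimum test error, might look like the sketch below. Everything specific in it is an assumption made for illustration: the synthetic two-cluster data, the random seed, the helper names, and the candidate list of odd K values (odd values avoid tied votes when there are two classes).

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training items closest to `query`.

    `train` is a list of ((x, y), label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def error_rate(train, test, k):
    """Fraction of test items whose predicted label is wrong."""
    wrong = sum(knn_predict(train, point, k) != label for point, label in test)
    return wrong / len(test)

# Hypothetical synthetic data: two overlapping, noisy 2-D clusters.
rng = random.Random(0)

def make_cluster(cx, cy, label, n):
    return [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label) for _ in range(n)]

data = make_cluster(0.0, 0.0, "A", 50) + make_cluster(2.0, 2.0, "B", 50)
rng.shuffle(data)

# Roughly 60% for training, the remaining 40% held out as the test set.
split = int(0.6 * len(data))
train, test = data[:split], data[split:]

# Try several odd values of K and keep the one with the lowest test error.
candidate_ks = [1, 3, 5, 7, 9]
errors = {k: error_rate(train, test, k) for k in candidate_ks}
best_k = min(errors, key=errors.get)

print("test error per K:", errors)
print("chosen K:", best_k, "accuracy:", 1 - errors[best_k])
```

One caveat: when the same test set is used both to choose K and to report accuracy, the reported accuracy tends to be optimistic. A separate validation split, or cross-validation, is a common way around this.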

K Nearest Neighbors

• The key issues involved in training this model include setting:
  • the variable K
  • the validation technique (e.g. cross-validation)
  • the type of distance metric
• Euclidean distance measure: measure the distance between the test point Y and every other data point X, whatever its class.
• Find the K shortest distances, with K being, e.g., 3, 5 or 7.
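For reference, the Euclidean distance between a data point X and the test point Y can be written as follows. The component notation (x_i, y_i) and the dimension n are assumptions introduced here; the slides name only X and Y.

$$
d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
$$

As a worked illustration with two attributes (n = 2), taking X = (2.3, 1.6) and Y = (2.4, 3.6):

$$
d(X, Y) = \sqrt{(2.3 - 2.4)^2 + (1.6 - 3.6)^2} = \sqrt{0.01 + 4.00} = \sqrt{4.01} \approx 2.0025
$$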
Training Set

| Train data No | Attrib | Class label | Train data No | Attrib | Class label | Train data No | Attrib | Class label |
|---|---|---|---|---|---|---|---|---|
| 1 | (2.3,1.6) | 1 | 11 | (0.7,4.8) | 2 | 21 | (4.4,4.2) | 3 |
| 2 | (2.1,1.2) | 1 | 12 | (0.8,4.2) | 2 | 22 | (4.9,4.2) | 3 |
| 3 | (2.3,1.3) | 1 | 13 | (0.2,4.7) | 2 | 23 | (4.5,4.6) | 3 |
| 4 | (2.2,1.2) | 1 | 14 | (0.2,4.8) | 2 | 24 | (4.1,4.0) | 3 |
| 5 | (2.7,1.0) | 1 | 15 | (0.3,4.4) | 2 | 25 | (4.7,4.3) | 3 |
| 6 | (2.1,1.4) | 1 | 16 | (0.7,4.5) | 2 | 26 | (4.5,4.4) | 3 |
| 7 | (2.0,1.6) | 1 | 17 | (1.0,4.6) | 2 | 27 | (4.6,4.5) | 3 |
| 8 | (2.4,1.1) | 1 | 18 | (0.6,4.6) | 2 | 28 | (4.7,4.1) | 3 |
| 9 | (2.7,1.6) | 1 | 19 | (0.5,4.3) | 2 | 29 | (4.1,4.4) | 3 |
| 10 | (2.6,1.9) | 1 | 20 | (0.8,4.3) | 2 | 30 | (4.5,4.8) | 3 |

Test Set

| Test data | Attrib | Test data | Attrib | Test data | Attrib |
|---|---|---|---|---|---|
| 31 | (2.4,3.6) | 36 | (3.7,3.8) | 41 | (2.9,3.2) |
| 32 | (2.8,2.9) | 37 | (3.8,3.5) | 42 | (2.6,3.2) |
| 33 | (2.3,3.0) | 38 | (3.3,3.7) | 43 | (2.0,3.6) |
| 34 | (2.2,3.2) | 39 | (3.2,3.8) | 44 | (2.1,3.1) |
| 35 | (2.7,3.7) | 40 | (3.3,3.4) | 45 | (2.4,3.3) |
KNN classification example (here, K = 5)
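The K = 5 example can be reproduced on the training and test points listed in the slides. The sketch below is an assumption-laden illustration rather than the lecturer's code: it uses Euclidean distance, a hypothetical helper `knn_predict`, and breaks tied votes in favour of the label met first among the nearest neighbours.

```python
import math
from collections import Counter

# Training set from the slides: points 1-10 are class 1, 11-20 class 2, 21-30 class 3.
TRAIN = [
    ((2.3, 1.6), 1), ((2.1, 1.2), 1), ((2.3, 1.3), 1), ((2.2, 1.2), 1), ((2.7, 1.0), 1),
    ((2.1, 1.4), 1), ((2.0, 1.6), 1), ((2.4, 1.1), 1), ((2.7, 1.6), 1), ((2.6, 1.9), 1),
    ((0.7, 4.8), 2), ((0.8, 4.2), 2), ((0.2, 4.7), 2), ((0.2, 4.8), 2), ((0.3, 4.4), 2),
    ((0.7, 4.5), 2), ((1.0, 4.6), 2), ((0.6, 4.6), 2), ((0.5, 4.3), 2), ((0.8, 4.3), 2),
    ((4.4, 4.2), 3), ((4.9, 4.2), 3), ((4.5, 4.6), 3), ((4.1, 4.0), 3), ((4.7, 4.3), 3),
    ((4.5, 4.4), 3), ((4.6, 4.5), 3), ((4.7, 4.1), 3), ((4.1, 4.4), 3), ((4.5, 4.8), 3),
]

# Test set from the slides, keyed by test data number.
TEST = {
    31: (2.4, 3.6), 32: (2.8, 2.9), 33: (2.3, 3.0), 34: (2.2, 3.2), 35: (2.7, 3.7),
    36: (3.7, 3.8), 37: (3.8, 3.5), 38: (3.3, 3.7), 39: (3.2, 3.8), 40: (3.3, 3.4),
    41: (2.9, 3.2), 42: (2.6, 3.2), 43: (2.0, 3.6), 44: (2.1, 3.1), 45: (2.4, 3.3),
}

def knn_predict(train, query, k=5):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # On a tied vote, most_common keeps the label encountered first,
    # i.e. the tied label with the nearest neighbour (an assumed tie rule).
    return Counter(label for _, label in nearest).most_common(1)[0][0]

predictions = {number: knn_predict(TRAIN, point, k=5) for number, point in TEST.items()}
for number, label in predictions.items():
    print(number, TEST[number], "->", label)
```

Many of the test points sit between the three clusters, so some of them can receive mixed votes; those are the cases where the choice of K and of the tie rule matters most.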

KNN when K = 15

• The decision boundary can be irregular.