Classifier_KNN_DT - Classification Supervised learning and...

Classification: Supervised Learning and Model Evaluation
Classifier: KNN, DT
Feb 2011, Tommy W. S. Chow
4/14/11

K Nearest Neighbors

  • K Nearest Neighbors (KNN)
  • Advantages
      • Nonparametric architecture
      • Simple
      • Powerful
      • Requires no training time
  • Disadvantages
      • Memory intensive
      • Classification/estimation is slow
K-NN classifier schematic

For a test instance:
  1) Calculate its distance from every training point.
  2) Find the K nearest neighbours (say, K = 3).
  3) Assign the class label held by the majority of those neighbours.

Figure: classifying whether the “blue” point belongs to the “green” class or the “red” class.
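The three steps above can be sketched in a few lines of Python. This is only an illustration, not code from the slides: the function name `knn_predict`, the toy points and the colour labels are all made up for the example, and Euclidean distance is assumed.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # 1) Distance from the query to every training point (Euclidean, assumed).
    distances = [(math.dist(p, query), label)
                 for p, label in zip(train_points, train_labels)]
    # 2) Keep the k smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # 3) Majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data, invented for illustration only.
train_points = [(1.0, 1.0), (1.2, 0.8), (3.0, 3.0), (3.2, 2.9), (2.9, 3.3)]
train_labels = ["red", "red", "green", "green", "green"]

print(knn_predict(train_points, train_labels, (1.1, 1.1), k=3))
```

With K = 3 the query point has two red neighbours and one green neighbour, so the vote goes to red. An odd K is a common choice for two classes because it rules out tied votes.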

K-NN classifier schematic

The red points are one class and the green points are another. The points marked with black circles are the three nearest neighbours of the grey point. Because two of the three nearest neighbours are red, the grey point is classified as red.
KNN

  • Data: numerical data, ordinal data (non-numerical, but with a distance in some sense because the values are ordered) and categorical data (non-numerical, with no distance in any sense, e.g. colour: red, black; shape: round, square).
  • How do we determine distances between values of categorical attributes?
  • Alternatives:
      • Use a Boolean distance (0 if the values are the same, 1 if they are different).
      • Introduce differential grading (e.g. for weather, ‘drizzling’ and ‘rainy’ are closer than ‘rainy’ and ‘sunny’).
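Both alternatives can be sketched as small distance functions. The Boolean measure is written here as a distance (0 for a match, 1 for a mismatch). The grading table `WEATHER_GRADE`, including the extra value ‘cloudy’ and the numeric grades, is a hypothetical choice made for this example; the slides do not specify one.

```python
def boolean_distance(a, b):
    """0 if the two categorical values match, 1 otherwise."""
    return 0 if a == b else 1

# Hypothetical grading: values that are 'closer' get nearer grades.
WEATHER_GRADE = {"sunny": 0, "cloudy": 1, "drizzling": 2, "rainy": 3}

def graded_distance(a, b, grades=WEATHER_GRADE):
    """Distance scaled to [0, 1] from the difference in assigned grades."""
    span = max(grades.values()) - min(grades.values())
    return abs(grades[a] - grades[b]) / span

print(boolean_distance("red", "black"))        # mismatch
print(graded_distance("drizzling", "rainy"))   # neighbouring grades
print(graded_distance("rainy", "sunny"))       # opposite ends of the scale
```

Scaling the graded distance to [0, 1] keeps it comparable with the Boolean distance when both kinds of attribute are combined in one overall distance.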

How to determine the value of “K”

  • This is another practical issue that cannot be resolved easily.
  • But we can determine K experimentally: use the K that gives the minimum error on a test set.
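A minimal sketch of that experiment: try several candidate values of K and keep the one with the lowest error on the test set. The helper names (`error_rate`, `choose_k`), the candidate list and the toy data are assumptions made for illustration; one training point is deliberately mislabelled to mimic noise.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """`train` is a list of ((x, y), label) pairs; returns the majority label."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def error_rate(train, test, k):
    """Fraction of test points that are misclassified for this K."""
    wrong = sum(knn_predict(train, point, k) != label for point, label in test)
    return wrong / len(test)

def choose_k(train, test, candidates=(1, 3, 5, 7)):
    """Return the candidate K with the lowest test error, plus all errors."""
    errors = {k: error_rate(train, test, k) for k in candidates}
    best_k = min(errors, key=errors.get)  # on ties, the earliest candidate wins
    return best_k, errors

# Toy data, invented for illustration; the last training point is a noisy "a".
train = [((1.0, 1.0), "a"), ((1.2, 0.9), "a"), ((0.8, 1.1), "a"),
         ((3.0, 3.0), "b"), ((3.1, 2.8), "b"), ((2.9, 3.2), "b"),
         ((2.8, 2.9), "a")]
test = [((1.1, 1.0), "a"), ((3.0, 3.1), "b"), ((2.9, 2.9), "b")]

best_k, errors = choose_k(train, test)
print(best_k, errors)
```

In this toy case a very small K is misled by the noisy point, while a K as large as the whole training set always predicts the larger class, so an intermediate K gives the lowest error.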
How good is KNN?

  • Normally works well for simple, clean data sets.
  • But suffers from noisy data.
  • Computationally demanding, as it needs to calculate and compare many distances.
  • First, use around 60% of the data set as the training set.
  • Verify the KNN classifier on the remaining 40%, used as the test set.
  • If the accuracy is acceptable, use it in the real application.
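The 60%/40% procedure can be sketched as follows. The data here are synthetic (two Gaussian clusters generated with a fixed seed), and the helper names `split_60_40` and `accuracy` are assumptions for this example rather than anything defined in the slides.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """`train` is a list of ((x, y), label) pairs; returns the majority label."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def split_60_40(data, seed=0):
    """Shuffle a copy of the data, then cut it into 60% train and 40% test."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def accuracy(train, test, k=3):
    """Fraction of held-out test points that are classified correctly."""
    correct = sum(knn_predict(train, p, k) == label for p, label in test)
    return correct / len(test)

# Synthetic two-cluster data, invented for illustration only.
rng = random.Random(42)
data = ([((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), "A") for _ in range(25)]
        + [((rng.gauss(3, 0.5), rng.gauss(3, 0.5)), "B") for _ in range(25)])

train, test = split_60_40(data)
acc = accuracy(train, test, k=3)
print(f"train={len(train)} test={len(test)} accuracy={acc:.2f}")
```

Shuffling before the cut matters: if the data are stored class by class, an unshuffled 60/40 cut could leave one class missing from the training set.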

K Nearest Neighbors

  • The key issues involved in setting up this model include choosing:
      • the variable K
          • Validation techniques (e.g. cross-validation)
      • the type of distance metric
          • Euclidean distance measure: measure the distance between the test point Y and every training point X, across all classes.
  • Find the K shortest distances (e.g. K = 3, 5 or 7).
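For reference, the Euclidean distance between a test point Y and a training point X can be written as below. The component symbols x_i and y_i and the number of attributes n are notation assumed here; the slides name only the points X and Y.

```latex
d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```

For example, for training point 1, X = (2.3, 1.6), and test point 31, Y = (2.4, 3.6), from the tables in these slides, d = sqrt(0.1^2 + 2.0^2) = sqrt(4.01), which is approximately 2.00.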
Training Set

No.  Attributes  Class | No.  Attributes  Class | No.  Attributes  Class
1    (2.3, 1.6)  1     | 11   (0.7, 4.8)  2     | 21   (4.4, 4.2)  3
2    (2.1, 1.2)  1     | 12   (0.8, 4.2)  2     | 22   (4.9, 4.2)  3
3    (2.3, 1.3)  1     | 13   (0.2, 4.7)  2     | 23   (4.5, 4.6)  3
4    (2.2, 1.2)  1     | 14   (0.2, 4.8)  2     | 24   (4.1, 4.0)  3
5    (2.7, 1.0)  1     | 15   (0.3, 4.4)  2     | 25   (4.7, 4.3)  3
6    (2.1, 1.4)  1     | 16   (0.7, 4.5)  2     | 26   (4.5, 4.4)  3
7    (2.0, 1.6)  1     | 17   (1.0, 4.6)  2     | 27   (4.6, 4.5)  3
8    (2.4, 1.1)  1     | 18   (0.6, 4.6)  2     | 28   (4.7, 4.1)  3
9    (2.7, 1.6)  1     | 19   (0.5, 4.3)  2     | 29   (4.1, 4.4)  3
10   (2.6, 1.9)  1     | 20   (0.8, 4.3)  2     | 30   (4.5, 4.8)  3

Test Set

No.  Attributes | No.  Attributes | No.  Attributes
31   (2.4, 3.6) | 36   (3.7, 3.8) | 41   (2.9, 3.2)
32   (2.8, 2.9) | 37   (3.8, 3.5) | 42   (2.6, 3.2)
33   (2.3, 3.0) | 38   (3.3, 3.7) | 43   (2.0, 3.6)
34   (2.2, 3.2) | 39   (3.2, 3.8) | 44   (2.1, 3.1)
35   (2.7, 3.7) | 40   (3.3, 3.4) | 45   (2.4, 3.3)
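As a sketch, the training and test sets from these slides can be run through a plain KNN classifier with K = 5. The point coordinates and class labels are copied from the tables; the function name `knn_predict`, the use of Euclidean distance and the tie-breaking rule are assumptions of this example, so its output need not match the figure in the original slides.

```python
import math
from collections import Counter

# Training points 1-30 from the slides, grouped by class label.
TRAIN = {
    1: [(2.3, 1.6), (2.1, 1.2), (2.3, 1.3), (2.2, 1.2), (2.7, 1.0),
        (2.1, 1.4), (2.0, 1.6), (2.4, 1.1), (2.7, 1.6), (2.6, 1.9)],
    2: [(0.7, 4.8), (0.8, 4.2), (0.2, 4.7), (0.2, 4.8), (0.3, 4.4),
        (0.7, 4.5), (1.0, 4.6), (0.6, 4.6), (0.5, 4.3), (0.8, 4.3)],
    3: [(4.4, 4.2), (4.9, 4.2), (4.5, 4.6), (4.1, 4.0), (4.7, 4.3),
        (4.5, 4.4), (4.6, 4.5), (4.7, 4.1), (4.1, 4.4), (4.5, 4.8)],
}

# Test points from the slides, keyed by their number.
TEST = {
    31: (2.4, 3.6), 32: (2.8, 2.9), 33: (2.3, 3.0), 34: (2.2, 3.2),
    35: (2.7, 3.7), 36: (3.7, 3.8), 37: (3.8, 3.5), 38: (3.3, 3.7),
    39: (3.2, 3.8), 40: (3.3, 3.4), 41: (2.9, 3.2), 42: (2.6, 3.2),
    43: (2.0, 3.6), 44: (2.1, 3.1), 45: (2.4, 3.3),
}

def knn_predict(query, k=5):
    """Majority vote among the k training points closest to `query`."""
    labelled = [(math.dist(p, query), label)
                for label, points in TRAIN.items() for p in points]
    nearest = sorted(labelled)[:k]
    votes = Counter(label for _, label in nearest)
    # If two classes tie on votes, the class seen first among the sorted
    # neighbours (the one owning the closer neighbour) wins.
    return votes.most_common(1)[0][0]

predictions = {no: knn_predict(point) for no, point in TEST.items()}
for no, label in predictions.items():
    print(no, TEST[no], "-> class", label)
```

With three classes and K = 5, a 2-2-1 vote is possible, which is why the tie-breaking rule is stated explicitly in the code.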
KNN classification example (here, K = 5)

KNN when K = 15

  • The decision boundary can be irregular.