Hint: Set a = (0, 0), b = (1, 1), and c = (0, 1), with p = ½. Does it still satisfy the triangle inequality?
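A minimal sketch working out this hint numerically; `minkowski` is our own helper, not from the notes:

```python
# Check the triangle inequality for the Minkowski distance with p = 1/2,
# using the points a, b, c from the hint above.

def minkowski(x, y, p):
    """Minkowski distance: (sum_i |x_i - y_i|^p)^(1/p)."""
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1.0 / p)

a, b, c = (0, 0), (1, 1), (0, 1)
p = 0.5

d_ab = minkowski(a, b, p)   # (1 + 1)^2 = 4.0
d_ac = minkowski(a, c, p)   # (0 + 1)^2 = 1.0
d_cb = minkowski(c, b, p)   # (1 + 0)^2 = 1.0

# The triangle inequality requires d(a, b) <= d(a, c) + d(c, b),
# but here 4.0 > 2.0 -- so p = 1/2 does not give a valid metric.
print(d_ab, d_ac + d_cb)    # 4.0 2.0
```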
A word of caution: for these distance metrics to work nicely, the attributes must be scaled before using them. We did this earlier in neural-net training. In many cases, you might want to weight each attribute differently; this is called weighted distance. Example of weighted Euclidean distance:

$$d\left(\mathbf{x}^{(i)}, \mathbf{x}^{(j)}\right) = \left( \sum_{a=1}^{d} w_a \left( x_a^{(i)} - x_a^{(j)} \right)^2 \right)^{1/2}$$

where $w_a$ is the weight for the $a$-th attribute.
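A minimal sketch of the weighted Euclidean distance above; the points and weights are made up for illustration:

```python
from math import sqrt

def weighted_euclidean(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_a w_a * (x_a - y_a)^2)."""
    return sqrt(sum(wa * (xa - ya) ** 2 for wa, xa, ya in zip(w, x, y)))

x = (1.0, 2.0)
y = (4.0, 6.0)

# Equal weights reduce to the plain Euclidean distance: sqrt(9 + 16) = 5.0.
print(weighted_euclidean(x, y, (1.0, 1.0)))   # 5.0

# Doubling the weight on the first attribute makes differences in it
# count more toward the distance.
print(weighted_euclidean(x, y, (2.0, 1.0)))
```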
Example: Data for UTD students' GPA (attribute) and getting an internship (class) is presented below:

| GPA | 2.6 | 2.8 | 2.85 | 3.1 | 3.2 | 3.3 | 3.4 | 3.55 | 3.6 | 3.7 | 3.75 | 4.0 | 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Internship | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 |

Using Manhattan distance, what will be the prediction of k-NN for a student with GPA of 3.5 in the following cases?

1. k = 1: Nearest neighbor: {(3.55, 1)} ⇒ majority class = 1
2. k = 3: Nearest neighbors: {(3.55, 1), (3.4, 0), (3.6, 0)} ⇒ majority class = 0
3. k = 5: Nearest neighbors: {(3.55, 1), (3.4, 0), (3.6, 0), (3.3, 1), (3.7, 1)} ⇒ majority class = 1
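The worked example above can be reproduced with a short sketch; `knn_predict` is our own helper, and ties in distance are broken by the order the points appear in the data:

```python
from collections import Counter

# (GPA, internship class) pairs from the example above.
data = [(2.6, 0), (2.8, 1), (2.85, 1), (3.1, 0), (3.2, 1), (3.3, 1),
        (3.4, 0), (3.55, 1), (3.6, 0), (3.7, 1), (3.75, 0), (4.0, 0), (4.0, 1)]

def knn_predict(query, k):
    # In one dimension, Manhattan distance is just |gpa - query|.
    neighbors = sorted(data, key=lambda gc: abs(gc[0] - query))[:k]
    # Majority vote over the neighbors' classes.
    votes = Counter(cls for _, cls in neighbors)
    return votes.most_common(1)[0][0]

for k in (1, 3, 5):
    print(k, knn_predict(3.5, k))   # 1 -> 1, 3 -> 0, 5 -> 1
```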
Handout: Let's practice some questions from the handout.

Realization:
- The calculations become more and more tedious as k increases, and even more so as the number of dimensions increases.
- More work is done during the testing phase than during the training phase.
Advantages and Disadvantages of k-NN

Advantages:
- Training is very fast
- Can learn complex target functions easily
- Easy to program

Disadvantages:
- Slow at query time
- Requires lots of storage and in-memory processing
- Doesn't scale well to higher dimensions (curse of dimensionality)
- Easily tricked by noisy data items and irrelevant attributes
- The value of k can change results significantly
k-NN for Continuous Output We presented k-NN for classification, but it can easily be used to approximate functions where the output is continuous.
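One common way to do this (a sketch with made-up data, not an example from the notes) is to predict the average of the k nearest neighbors' target values instead of taking a majority vote:

```python
def knn_regress(train, query, k):
    """k-NN regression: train is a list of (x, y) pairs; the prediction
    is the mean y of the k training points whose x is closest to query."""
    neighbors = sorted(train, key=lambda xy: abs(xy[0] - query))[:k]
    return sum(y for _, y in neighbors) / k

# Hypothetical 1-D training data sampled from y = 2x.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# The two nearest neighbors of x = 2.4 are x = 2.0 and x = 3.0,
# so the prediction is (4.0 + 6.0) / 2 = 5.0.
print(knn_regress(train, 2.4, 2))   # 5.0
```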