# class_10_26 - Statistical Data Mining ORIE 474 Fall 2007...


Statistical Data Mining ORIE 474 Fall 2007 Tatiyana Apanasovich 10/26/07 Nearest Neighbor Models

## 10.6 Nearest Neighbor (NN) Models

Data: $(x(i), c(i))$, where $i = 1, \dots, n$ and $c(i) \in \{c_1, \dots, c_m\}$.

Distance function: $d(x(i), x(j))$.

Model structure: to classify a new object $x_0$:

1. Examine the $k$ closest points (nearest neighbors) to $x_0$ in the training data set. Denote them by $x(i_1), \dots, x(i_k)$.
2. Assign the object to the class that has the majority of the points among these $k$.
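The two-step rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the Euclidean distance, the function names `euclidean` and `knn_classify`, and the toy data are all assumptions made for the example. Any distance function $d$ could be passed in instead.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples (an assumed choice of d)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_classify(train, x0, k, dist=euclidean):
    """Classify x0 by majority vote among its k nearest training points.

    train is a list of (x, c) pairs: x is a tuple of inputs, c is a class label.
    """
    # Step 1: find the k training points closest to x0.
    neighbors = sorted(train, key=lambda pair: dist(pair[0], x0))[:k]
    # Step 2: assign the class holding the majority among those k points.
    votes = Counter(c for _, c in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
]
label_near_a = knn_classify(train, (0.15, 0.15), k=3)
label_near_b = knn_classify(train, (1.0, 1.05), k=3)
```

In this sketch a tied vote is resolved by whichever class `Counter` encountered first, and distance ties at the $k$-th neighbor are resolved by sort order; neither choice comes from the notes.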
## NN (cont'd)

What does "the $k$ closest points to $x_0$" mean? Think of a small volume of the space of input variables $X$, centered at $x_0$, whose radius is the distance to the $k$-th nearest neighbor.

Set of training points within radius $r$ of $x_0$:

$$D_r(x_0) = \{\, x(i) \in \text{training data} : d(x(i), x_0) \le r \,\}$$

Distance to the $k$-th nearest neighbor:

$$r(x_0, k) = \min\{\, r : |D_r(x_0)| \ge k \,\}$$
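To make the two definitions concrete, here is a small sketch that computes the set of training points within radius $r$ of $x_0$ and the distance to the $k$-th nearest neighbor. The helper names `neighborhood` and `kth_radius`, the Euclidean distance, the inclusive comparison (distance at most $r$), and the one-dimensional data are assumptions for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples (an assumed choice of d)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def neighborhood(xs, x0, r, dist=euclidean):
    """Indices of training points whose distance to x0 is at most r."""
    return [i for i, x in enumerate(xs) if dist(x, x0) <= r]


def kth_radius(xs, x0, k, dist=euclidean):
    """Smallest radius r whose neighborhood of x0 holds at least k points.

    This equals the k-th smallest distance from x0 to a training point.
    """
    if not 1 <= k <= len(xs):
        raise ValueError("k must be between 1 and the number of training points")
    return sorted(dist(x, x0) for x in xs)[k - 1]


# Hypothetical one-dimensional training inputs.
xs = [(0.0,), (1.0,), (2.0,), (4.0,)]
x0 = (0.5,)

r1 = kth_radius(xs, x0, k=1)   # two points are equally close to x0
d1 = neighborhood(xs, x0, r1)  # so this neighborhood holds both of them
r3 = kth_radius(xs, x0, k=3)
d3 = neighborhood(xs, x0, r3)
```

With ties, the neighborhood at the $k$-th radius can contain more than $k$ points, as `d1` does here for $k = 1$.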

## NN Model Structure

ML estimator of the probability that a point in this small volume belongs to class $c_j$:

$$\hat{p}_{k,j}(x_0) = \frac{1}{k} \sum_{i \,:\, x(i) \in D_k(x_0)} \mathbf{1}\big(c(i) = c_j\big),$$

where

$$D_k(x_0) = D_{r(x_0, k)}(x_0)$$

(if more than one data point in the training set is at the same distance from $x_0$ as the $k$-th nearest neighbor, choose at random which to include in $D_k(x_0)$).

Hence, we have

$$\hat{c}_0 = c_{j^*}, \quad \text{where } j^* = \arg\max_{j = 1, \dots, m} \hat{p}_{k,j}(x_0).$$
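The estimator and the resulting classifier can be sketched as follows: count the share of each class among the $k$ nearest neighbors, breaking distance ties at the $k$-th neighbor at random, then pick the class with the largest share. The function names `knn_class_probs` and `knn_predict`, the Euclidean distance, the optional `rng` argument, and the data are assumptions for this example, not part of the notes.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples (an assumed choice of d)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_class_probs(train, x0, k, classes, dist=euclidean, rng=None):
    """Estimate class probabilities at x0 as class shares among k neighbors.

    Points tied at the k-th nearest distance are sampled at random so that
    exactly k points are used.
    """
    rng = rng or random.Random()
    dists = [(dist(x, x0), c) for x, c in train]
    r = sorted(d for d, _ in dists)[k - 1]          # distance to the k-th neighbor
    inside = [c for d, c in dists if d < r]          # strictly closer than r
    boundary = [c for d, c in dists if d == r]       # tied at distance r
    chosen = inside + rng.sample(boundary, k - len(inside))
    counts = Counter(chosen)
    return {c: counts[c] / k for c in classes}


def knn_predict(train, x0, k, classes, dist=euclidean, rng=None):
    """Return the class with the largest estimated probability at x0."""
    probs = knn_class_probs(train, x0, k, classes, dist, rng)
    return max(classes, key=lambda c: probs[c])


# Hypothetical one-dimensional training set with two classes.
train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "b"), ((3.0,), "b"), ((10.0,), "b")]
classes = ["a", "b"]

probs = knn_class_probs(train, (0.4,), k=3, classes=classes)
pred = knn_predict(train, (0.4,), k=3, classes=classes)
```

Here the three nearest neighbors of 0.4 are two points of class "a" and one of class "b", so the estimated probabilities are 2/3 and 1/3. If two classes have equal shares, `max` returns the one listed first in `classes`, which is a choice of this sketch.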
## NN (cont'd)

Note: The MLE of the probability that a point in this small volume belongs to a certain class is


## This note was uploaded on 02/06/2011 for the course ORIE 474 taught by Professor Apanasovich during the Spring '07 term at Cornell University (Engineering School).
