K-Means Clustering
Given N observations {x_1, …, x_N}, where each x_j ∈ R^d
How do we partition the N observations into k sets S = {S_1, …, S_k}, where k < N?
In general:
Difficult (NP-hard) problem, whether identifying 2 clusters in R^d or identifying k clusters in 2-D
Nonetheless, used in many applications
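One common way to make the partitioning question precise is to minimize the within-cluster sum of squared distances. The slide does not state an objective, so this formulation is an assumption; m_i denotes the mean of the observations in S_i.

```latex
\[
\min_{S_1, \dots, S_k} \; \sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - m_i \rVert^2,
\qquad
m_i = \frac{1}{|S_i|} \sum_{x_j \in S_i} x_j
\]
```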
An Example
A Simple Algorithm
Given N observations
Let m_i denote the centroid of the set S_i, for i = 1, …, k
Two steps:
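1. Assignment step: assign each observation to the set whose centroid is nearest
2. Update step: recompute each centroid m_i as the mean of the observations in S_i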
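Written out, the two steps usually take the following form. This is a sketch of the standard formulation; the iteration superscript (t) is notation introduced here, and ties in the assignment step are assumed to be broken so that each x_j lands in exactly one set.

```latex
\[
\text{Assignment:}\quad
S_i^{(t)} = \bigl\{\, x_j : \lVert x_j - m_i^{(t)} \rVert^2 \le \lVert x_j - m_l^{(t)} \rVert^2
\ \text{for all } l = 1, \dots, k \,\bigr\}
\]
\[
\text{Update:}\quad
m_i^{(t+1)} = \frac{1}{\lvert S_i^{(t)} \rvert} \sum_{x_j \in S_i^{(t)}} x_j
\]
```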
Some Pseudocode
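% Iterate k-means
% Assumes data is N-by-d (one observation per row) and means is k-by-d
means = initialGuesses;
while true
    % Compute cluster membership
    membership = assign_clusters(means, data);
    old_means = means;
    % Update means based on new membership information
    for i = 1:k
        % mean of all x_j \in S_i
        means(i,:) = mean(data(membership == i, :), 1);
    end
    % Decide if you are done
    % (assumed stopping rule: stop once the means no longer change)
    if isequal(means, old_means)
        break;
    end
end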
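The pseudocode calls assign_clusters without showing it. A minimal sketch of what such a helper might look like is given below, assuming squared Euclidean distance, data stored as N-by-d (one observation per row), and means stored as k-by-d. The body is an illustration and not necessarily the version used in the demo.

```matlab
function membership = assign_clusters(means, data)
% ASSIGN_CLUSTERS  Label each observation with the index of its nearest mean.
%   means:      k-by-d matrix, one centroid per row (assumed layout)
%   data:       N-by-d matrix, one observation per row (assumed layout)
%   membership: N-by-1 vector of cluster indices in 1..k
    k = size(means, 1);
    N = size(data, 1);
    sqdist = zeros(N, k);
    for i = 1:k
        % N-by-d offsets of every observation from centroid i
        offsets = bsxfun(@minus, data, means(i, :));
        % Squared Euclidean distance to centroid i
        sqdist(:, i) = sum(offsets .^ 2, 2);
    end
    % Index of the nearest centroid for each row (first one wins on ties)
    [~, membership] = min(sqdist, [], 2);
end
```

Saved as assign_clusters.m, calling it with data = [0 0; 0 1; 10 10; 10 11] and means = [0 0.5; 10 10.5] places the first two observations in cluster 1 and the last two in cluster 2.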
An Example
Matlab Demo
Our Initial Example
Applying k-means clustering
Localization
Where am I?