The number of hidden units is equal to the number of clusters; we set it equal for all clusters.
The basic details of K-means clustering are as follows:
The K initial centers are randomly chosen from the training data.
Then the following two steps are iterated alternately until convergence.
1. For each existing center, re-identify its cluster: every point in the cluster should be closer to this center than to any other center.
2. Compute the mean of each cluster and make it the new center of that cluster.

K-means clustering algorithm: For a given cluster assignment $C$, the total cluster variance
$$\sum_{k=1}^{K} \sum_{C(i)=k} \lVert x_i - m_k \rVert^2$$
is minimized with respect to the centers $\{m_1, \dots, m_K\}$, yielding the means of the currently assigned clusters. Given a current set of means $\{m_1, \dots, m_K\}$, the total cluster variance is minimized by assigning each observation to the closest (current) cluster mean. Steps 1 and 2 are iterated until the assignments do not change.
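The two alternating steps above can be sketched in a few lines of NumPy (a minimal illustration; the notes themselves use MATLAB's built-in kmeans, and the random initialization here is an assumption):

```python
import numpy as np

def kmeans(X, K, max_iter=100, seed=0):
    """Alternate the two K-means steps until the assignments stop changing."""
    rng = np.random.default_rng(seed)
    # The K initial centers are randomly chosen from the training data.
    centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    assign = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Step 1: re-identify each center's cluster -- every point joins
        # the cluster whose center it is closest to.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Step 2: the mean of each cluster becomes its new center.
        new_centers = centers.copy()
        for k in range(K):
            if np.any(assign == k):
                new_centers[k] = X[assign == k].mean(axis=0)
        if np.allclose(new_centers, centers):
            break  # assignments (and centers) no longer change
        centers = new_centers
    return assign, centers
```

Each iteration can only decrease the total cluster variance, so the loop terminates, though possibly at a local minimum that depends on the random initialization.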
Example: Partition data into 2 clusters (2 hidden units).

> X = rand(80, 30)

We create X randomly as a training set with 80 data points and 30 dimensions, and then apply MATLAB's kmeans function to separate X into 2 clusters:

> [IDX, C, sumD] = kmeans(X, 2)

IDX is a vector containing 1s and 2s, indicating the two clusters; its size is 80*1. C holds the center (mean) of each cluster, with size 2*30. sumD is the sum of the squared distances between the data points and the center of their cluster. The counts of 1s and 2s in IDX indicate the number of data points in clusters 1 and 2.
sumD(1) is the variance of the first cluster and sumD(2) is the variance of the second cluster. Now we can get the cluster centers, the cluster variances, and the hat matrix by the following equations. Finally...
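As a concrete sketch of that last step, the snippet below computes sumD per cluster from a K-means result and then builds a hat matrix for an RBF-style hidden layer. The tiny data set, the choice of sumD(k)/n_k as the Gaussian width, and the basis-function form are all illustrative assumptions, not taken from the notes:

```python
import numpy as np

# Hypothetical small data set; rows are points, as in the MATLAB example.
# idx and centers would normally come from a K-means run.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
idx = np.array([0, 0, 1, 1])  # cluster assignment (0-based here)
centers = np.array([X[idx == k].mean(axis=0) for k in range(2)])

# sumD(k): within-cluster sum of squared distances, as returned by kmeans.
sumD = np.array([((X[idx == k] - centers[k]) ** 2).sum() for k in range(2)])

# One common choice (an assumption): take the cluster "variance" as
# sumD(k) / n_k and use it as the width of a Gaussian basis function
# centered at each cluster mean.
n_k = np.bincount(idx)
sigma2 = sumD / n_k

# Hidden-unit (basis) matrix Phi: one Gaussian basis function per cluster.
d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
Phi = np.exp(-d2 / (2.0 * sigma2[None, :]))

# Hat matrix H = Phi (Phi^T Phi)^{-1} Phi^T for the linear output layer.
H = Phi @ np.linalg.solve(Phi.T @ Phi, Phi.T)
```

The hat matrix maps the training targets directly to the fitted values; as a projection it is symmetric and idempotent, which is a quick sanity check on the computation.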