cs229-notes7a

CS229 Lecture notes

Andrew Ng

The $k$-means clustering algorithm

In the clustering problem, we are given a training set $\{x^{(1)}, \ldots, x^{(m)}\}$, and want to group the data into a few cohesive "clusters." Here, $x^{(i)} \in \mathbb{R}^n$ as usual; but no labels $y^{(i)}$ are given. So, this is an unsupervised learning problem.

The $k$-means clustering algorithm is as follows:

1. Initialize cluster centroids $\mu_1, \mu_2, \ldots, \mu_k \in \mathbb{R}^n$ randomly.

2. Repeat until convergence: {

   For every $i$, set
   $$c^{(i)} := \arg\min_j \|x^{(i)} - \mu_j\|^2.$$

   For each $j$, set
   $$\mu_j := \frac{\sum_{i=1}^{m} 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{c^{(i)} = j\}}.$$

   }

In the algorithm above, $k$ (a parameter of the algorithm) is the number of clusters we want to find; and the cluster centroids $\mu_j$ represent our current guesses for the positions of the centers of the clusters. To initialize the cluster centroids (in step 1 of the algorithm above), we could choose $k$ training examples randomly, and set the cluster centroids to be equal to the values of
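
The two steps of the algorithm above can be sketched in Python with NumPy. This is a minimal illustration, not code from the notes: the function name `kmeans`, the `max_iters` and `seed` parameters, the stopping rule (stop once no assignment $c^{(i)}$ changes), and the handling of an empty cluster (its centroid is left where it was) are all assumptions made for the sketch. Initialization picks $k$ distinct training examples at random, as the text suggests.

```python
import numpy as np


def kmeans(X, k, max_iters=100, seed=0):
    """Sketch of k-means. X has shape (m, n). Returns (mu, c).

    mu: (k, n) array of centroids; c: (m,) array of cluster indices.
    max_iters and seed are illustrative parameters, not from the notes.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    # Step 1: use k distinct, randomly chosen training examples as centroids.
    mu = X[rng.choice(m, size=k, replace=False)].astype(float)
    c = np.full(m, -1)  # no point is assigned yet
    for _ in range(max_iters):
        # Assignment step: c[i] = argmin_j ||x_i - mu_j||^2.
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (m, k)
        new_c = d2.argmin(axis=1)
        # Assumed convergence test: assignments did not change.
        if np.array_equal(new_c, c):
            break
        c = new_c
        # Update step: mu_j = mean of the points currently assigned to j.
        for j in range(k):
            members = X[c == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its centroid
                mu[j] = members.mean(axis=0)
    return mu, c
```

Calling `kmeans(X, k)` on an `(m, n)` array returns the centroids and the assignment vector. The distance computation relies on broadcasting to form all $m \times k$ squared distances at once, which is convenient for small data but uses $O(mkn)$ memory.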

This note was uploaded on 01/24/2010 for the course CS 229 at Stanford.
