cs229-notes7a

# cs229-notes7a - CS229 Lecture notes Andrew Ng The k-means...


CS229 Lecture notes

Andrew Ng

The $k$-means clustering algorithm

In the clustering problem, we are given a training set $\{x^{(1)}, \ldots, x^{(m)}\}$, and want to group the data into a few cohesive “clusters.” Here, $x^{(i)} \in \mathbb{R}^n$ as usual; but no labels $y^{(i)}$ are given. So, this is an unsupervised learning problem.

The $k$-means clustering algorithm is as follows:

1. Initialize cluster centroids $\mu_1, \mu_2, \ldots, \mu_k \in \mathbb{R}^n$ randomly.
2. Repeat until convergence: {

   For every $i$, set

   $$c^{(i)} := \arg\min_j \|x^{(i)} - \mu_j\|^2.$$

   For each $j$, set

   $$\mu_j := \frac{\sum_{i=1}^{m} 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{c^{(i)} = j\}}.$$

   }

In the algorithm above, $k$ (a parameter of the algorithm) is the number of clusters we want to find; and the cluster centroids $\mu_j$ represent our current guesses for the positions of the centers of the clusters.

To initialize the cluster centroids (in step 1 of the algorithm above), we could choose $k$ training examples randomly, and set the cluster centroids to be equal to the values of
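The two steps of the algorithm above can be sketched in NumPy as follows. This is a minimal illustration, not the notes' own code: the function name `kmeans`, the `max_iters` and `seed` parameters, the stopping rule (stop once the assignments no longer change), and the choice to leave the centroid of an empty cluster unchanged are all assumptions made here. Initialization picks $k$ distinct training examples at random, as the notes suggest.

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """Sketch of k-means. X has shape (m, n). Returns (mu, c):
    mu is the (k, n) array of centroids, c the (m,) array of assignments."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    # Step 1: use k distinct, randomly chosen training examples as centroids.
    mu = X[rng.choice(m, size=k, replace=False)].astype(float)
    c = np.full(m, -1)
    for _ in range(max_iters):
        # Assignment step: c[i] = argmin_j ||x_i - mu_j||^2.
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (m, k)
        new_c = dists.argmin(axis=1)
        # Assumed convergence test: assignments did not change.
        if np.array_equal(new_c, c):
            break
        c = new_c
        # Update step: mu_j = mean of the examples currently assigned to j.
        for j in range(k):
            members = X[c == j]
            if len(members) > 0:  # assumption: keep the old centroid if cluster j is empty
                mu[j] = members.mean(axis=0)
    return mu, c

# Hypothetical toy data: two well-separated pairs of points in the plane.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
mu, c = kmeans(X, k=2)
print(mu)
print(c)
```

The assignment step uses broadcasting to form all $m \times k$ squared distances at once; for large $m$ and $k$ a loop or chunked computation may be preferable to limit memory use.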
