# CSCi 4150 Unsupervised Learning


10 Unsupervised learning

- 10.1 K-means clustering (CS229 Lecture notes by Andrew Ng)
- 10.2 Mixture of Gaussians and the EM algorithm (CS229 Lecture notes by Andrew Ng)
- 10.3 The Boltzmann and Helmholtz machines


CS229 Lecture notes, Andrew Ng

## The k-means clustering algorithm

In the clustering problem, we are given a training set $\{x^{(1)}, \dots, x^{(m)}\}$ and want to group the data into a few cohesive "clusters." Here, $x^{(i)} \in \mathbb{R}^n$ as usual, but no labels $y^{(i)}$ are given. So, this is an unsupervised learning problem. The k-means clustering algorithm is as follows:

1. Initialize cluster centroids $\mu_1, \mu_2, \dots, \mu_k \in \mathbb{R}^n$ randomly.
2. Repeat until convergence: {

   For every $i$, set
   $$c^{(i)} := \arg\min_j \|x^{(i)} - \mu_j\|^2.$$

   For each $j$, set
   $$\mu_j := \frac{\sum_{i=1}^m 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^m 1\{c^{(i)} = j\}}.$$

   }

In the algorithm above, $k$ (a parameter of the algorithm) is the number of clusters we want to find, and the cluster centroids $\mu_j$ represent our current guesses for the positions of the centers of the clusters. To initialize the cluster centroids (in step 1 of the algorithm above), we could choose $k$ training examples randomly and set the cluster centroids equal to the values of these $k$ examples. (Other initialization methods are also possible.)

The inner loop of the algorithm repeatedly carries out two steps: (i) "assigning" each training example $x^{(i)}$ to the closest cluster centroid $\mu_j$, and (ii) moving each cluster centroid $\mu_j$ to the mean of the points assigned to it. Figure 1 shows an illustration of running k-means.
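The two-step inner loop above can be sketched directly in NumPy. This is a minimal illustration, not the notes' own code; the function name `kmeans` and its parameters (`max_iters`, `seed`, convergence by unchanged assignments) are assumptions for the sketch. Initialization follows the suggestion in the text: pick $k$ training examples at random as the starting centroids.

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """Plain k-means: random-example initialization, then alternate the
    assignment step and the mean-update step until assignments stop changing."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: set centroids equal to k randomly chosen training examples.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    prev = None
    for _ in range(max_iters):
        # Assignment step: c(i) := argmin_j ||x(i) - mu_j||^2
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assignments = dists.argmin(axis=1)
        if prev is not None and np.array_equal(assignments, prev):
            break  # no assignment changed, so the centroids are fixed too
        prev = assignments
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = X[assignments == j]
            if len(members):  # guard against an empty cluster
                centroids[j] = members.mean(axis=0)
    return centroids, assignments
```

On well-separated data the loop typically converges in a handful of iterations; with unlucky initialization it can settle into a poor local optimum, which is why running it several times from different random starts is common practice.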
Figure 1: K-means algorithm. Training examples are shown as dots, and cluster centroids are shown as crosses. (a) Original dataset. (b) Random initial cluster centroids (in this instance, not chosen to be equal to two training examples). (c)-(f) Illustration of running two iterations of k-means. In each iteration, we assign each training example to the closest cluster centroid (shown by "painting" the training examples the same color as the cluster centroid to which it is assigned); then we move each cluster centroid to the mean of the points assigned to it. (Best viewed in color.) Images courtesy of Michael Jordan.

