CSCi 4150 Unsupervised Learning

10 Unsupervised learning
10.1 K-means clustering (CS229 Lecture notes by Andrew Ng)
10.2 Mixture of Gaussians and the EM algorithm (CS229 Lecture notes by Andrew Ng)
10.3 The Boltzmann and Helmholtz machines

CS229 Lecture notes
Andrew Ng

The k-means clustering algorithm

In the clustering problem, we are given a training set $\{x^{(1)}, \ldots, x^{(m)}\}$, and want to group the data into a few cohesive "clusters." Here, $x^{(i)} \in \mathbb{R}^n$ as usual; but no labels $y^{(i)}$ are given. So, this is an unsupervised learning problem.

The k-means clustering algorithm is as follows:

1. Initialize cluster centroids $\mu_1, \mu_2, \ldots, \mu_k \in \mathbb{R}^n$ randomly.

2. Repeat until convergence: {

   For every $i$, set
   $$c^{(i)} := \arg\min_j \|x^{(i)} - \mu_j\|^2.$$

   For each $j$, set
   $$\mu_j := \frac{\sum_{i=1}^m 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^m 1\{c^{(i)} = j\}}.$$

   }

In the algorithm above, $k$ (a parameter of the algorithm) is the number of clusters we want to find; and the cluster centroids $\mu_j$ represent our current guesses for the positions of the centers of the clusters. To initialize the cluster centroids (in step 1 of the algorithm above), we could choose $k$ training examples randomly, and set the cluster centroids to be equal to the values of these $k$ examples. (Other initialization methods are also possible.)

The inner loop of the algorithm repeatedly carries out two steps: (i) "assigning" each training example $x^{(i)}$ to the closest cluster centroid $\mu_j$, and (ii) moving each cluster centroid $\mu_j$ to the mean of the points assigned to it. Figure 1 shows an illustration of running k-means.
Figure 1: K-means algorithm. Training examples are shown as dots, and cluster centroids are shown as crosses. (a) Original dataset. (b) Random initial cluster centroids (in this instance, not chosen to be equal to two training examples). (c)-(f) Illustration of running two iterations of k-means. In each iteration, we assign each training example to the closest cluster centroid (shown by "painting" the training examples the same color as the cluster centroid to which it is assigned); then we move each cluster centroid to the mean of the points assigned to it. (Best viewed in color.) Images courtesy Michael Jordan.
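As a rough illustration of the two inner-loop steps above, here is a minimal NumPy sketch of k-means. It is not part of the original notes; the function name kmeans and the parameters n_iters and seed are illustrative choices, and the stopping rule (centroids no longer moving) is one simple convergence test among several possible ones.

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal k-means sketch: X is an (m, n) data matrix, k the number of clusters."""
    rng = np.random.default_rng(seed)
    m, n = X.shape

    # Step 1: initialize centroids as k randomly chosen training examples.
    mu = X[rng.choice(m, size=k, replace=False)].copy()

    for _ in range(n_iters):
        # Assignment step: c[i] = argmin_j ||x^(i) - mu_j||^2.
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # shape (m, k)
        c = dists.argmin(axis=1)

        # Update step: move mu_j to the mean of the points assigned to cluster j.
        new_mu = mu.copy()
        for j in range(k):
            members = X[c == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_mu[j] = members.mean(axis=0)

        # Stop when the centroids no longer move.
        if np.allclose(new_mu, mu):
            break
        mu = new_mu

    return mu, c

For example, calling kmeans(X, k=3) on an (m, 2) data matrix X would return a (3, 2) array of centroids and a length-m vector of cluster assignments, mirroring the assignment and update steps shown in Figure 1.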