fa13-cs188-lecture-22-1PP

What can go wrong? Various schemes for preventing this


Clustering
- Basic idea: group together similar instances
- Example: 2D point patterns
- What could “similar” mean?
  - One option: small (squared) Euclidean distance

K-Means
- An iterative clustering algorithm
  - Pick K random points as cluster centers (means)
  - Alternate:
    - Assign data instances to closest mean
    - Assign each mean to the average of its assigned points
  - Stop when no points’ assignments change

K-Means Example

K-Means as Optimization
- Consider the total distance to the means:

  $$\phi(\{x_i\}, \{a_i\}, \{c_k\}) = \sum_i \mathrm{dist}(x_i, c_{a_i})$$

  where $x_i$ are the points, $a_i$ the assignments, $c_k$ the means, and dist is the squared Euclidean distance
- Each iteration reduces $\phi$
- Two stages each iteration:
  - Update assignments: fix means c, change assignments a
  - Update means: fix assignments a, change means c

Phase I: Update Assignments
- For each point, re-assign to closest mean:

  $$a_i = \arg\min_k \mathrm{dist}(x_i, c_k)$$

- Can only decrease total distance $\phi$!

Phase II: Update Means
- Move each mean to the average of its assigned points:

  $$c_k = \frac{1}{|\{i : a_i = k\}|} \sum_{i : a_i = k} x_i$$

- Also can only decrease total distance… (Why?)
- Fun fact: the point y with minimum squared Euclidean distance to a set of points {x} is their mean

Initialization
- K-means is non-deterministic
  - Requires initial means
  - It does matter what you pick!
  - What can go wrong?
  - Various schem...
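The two-phase loop described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the lecture's own code: the function names (`sq_dist`, `mean_point`, `phi`, `k_means`), the `seed` and `max_iters` parameters, the choice to draw the initial means from the data points, and the rule of leaving an empty cluster's mean unchanged are all assumptions made here for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def phi(points, assignments, means):
    """Total squared distance from each point to its assigned mean."""
    return sum(sq_dist(x, means[a]) for x, a in zip(points, assignments))


def k_means(points, k, seed=0, max_iters=100):
    """Run K-means; return (means, assignments, phi after each iteration)."""
    rng = random.Random(seed)
    # Initialization (assumption): use K distinct data points as the means.
    means = rng.sample(points, k)
    assignments = None
    history = []
    for _ in range(max_iters):
        # Phase I: fix the means, re-assign each point to its closest mean.
        new_assignments = [
            min(range(k), key=lambda j: sq_dist(x, means[j])) for x in points
        ]
        # Stop when no point's assignment changes.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Phase II: fix the assignments, move each mean to the average
        # of the points assigned to it.
        for j in range(k):
            members = [x for x, a in zip(points, assignments) if a == j]
            if members:  # assumption: an empty cluster keeps its old mean
                means[j] = mean_point(members)
        history.append(phi(points, assignments, means))
    return means, assignments, history


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    means, assignments, history = k_means(data, k=2)
    print("means:", means)
    print("assignments:", assignments)
    print("phi per iteration:", history)
```

Recording phi after every iteration makes it easy to check that the total distance never goes up from one iteration to the next. Changing `seed` changes the initial means, which is one way to explore how much the starting point matters.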