# ML8-DBSCAN.pdf - Syed Ali Raza Lecturer GC University...

Syed Ali Raza Lecturer GC University Lahore
DENSITY BASED CLUSTERING
Density-based clustering algorithms assume that clusters are dense regions in space separated by regions of lower density. A dense cluster is a region which is “density connected”, i.e. the density of points in that region is greater than a minimum threshold. Since these algorithms expand clusters based on dense connectivity, they can find clusters of arbitrary shape. DBSCAN is an example of a density-based clustering algorithm.
DBSCAN: SENSITIVE TO PARAMETERS
DBSCAN: CORE, BORDER AND NOISE POINTS
[Figure: the original points, and the same points labeled by type (core, border and noise) for Eps = 10, MinPts = 4]
WHEN DBSCAN WORKS WELL
[Figure: the original points and the clusters found]
DBSCAN is resistant to noise and can handle clusters of different shapes and sizes.
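WHEN DBSCAN DOES NOT WORK WELL
[Figure: the original points, and the clusterings obtained with (MinPts=4, Eps=9.75) and with (MinPts=4, Eps=9.92)]
DBSCAN does not work well on data with varying densities or on high-dimensional data.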
DBSCAN

Epsilon neighborhood (Nε(p)): the set of all points within a distance ε of a point p.

Core point: a point that has at least MinPts points (including itself) within its Nε.

Direct Density Reachable (DDR): a point q is directly density reachable from a point p if p is a core point and q ∈ Nε(p).

Density Reachable (DR): a point q is density reachable from a point p if there is a chain of points from p to q in which each point is DDR from the one before it.

Border point: a point that is DDR from some core point but is not itself a core point.

Noise: a point that is neither a core point nor a border point, i.e. it does not belong to any core point’s Nε.
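
The definitions above can be checked on a small example. The following is a minimal sketch in plain Python, assuming Euclidean distance; the function names `region_query` and `classify_points` and the sample points are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border' or 'noise'."""
    neighborhoods = [region_query(points, i, eps) for i in range(len(points))]
    # Core point: at least min_pts points (itself included) in its eps-neighborhood.
    is_core = [len(nb) >= min_pts for nb in neighborhoods]
    labels = []
    for i, nb in enumerate(neighborhoods):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in nb):
            # Not core, but inside the neighborhood of some core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels


# Illustrative data: a small square, one point attached to it, one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (8, 8)]
labels = classify_points(points, eps=1.5, min_pts=4)
print(labels)
```

With these values the four corners of the square are core points, (2, 1) is a border point because it lies within ε of a core point but has only three points in its own neighborhood, and (8, 8) is noise.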
DBSCAN: CORE, BORDER, AND NOISE POINTS
DBSCAN ALGORITHM
1. Label points as core, border and noise
2. Eliminate noise points
3. For every core point p that has not been assigned to a cluster
   Create a new cluster with the point p
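
The steps listed here stop at creating a new cluster. As a rough illustration of how the whole procedure is commonly written, the sketch below assumes the standard DBSCAN expansion step (each new cluster grows by adding every point that is density reachable from p), which is not shown in the text above. The function name `dbscan`, the `NOISE` marker and the sample points are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label used for points that end up in no cluster


def dbscan(points, eps, min_pts):
    """Return one cluster id per point (0, 1, ...) or NOISE."""
    n = len(points)
    # Eps-neighborhood of every point (each point is in its own neighborhood).
    neighborhoods = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighborhoods]
    labels = [NOISE] * n
    cluster_id = 0
    for p in range(n):
        # Only a core point that is still unassigned starts a new cluster.
        if not is_core[p] or labels[p] != NOISE:
            continue
        labels[p] = cluster_id
        frontier = [p]
        while frontier:
            current = frontier.pop()  # always a core point
            for q in neighborhoods[current]:
                if labels[q] == NOISE:
                    labels[q] = cluster_id
                    if is_core[q]:
                        # Assumed expansion step: only core points extend the cluster.
                        frontier.append(q)
        cluster_id += 1
    return labels


# Illustrative data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (8, 8),
          (20, 20), (20, 21), (21, 20), (21, 21)]
labels = dbscan(points, eps=1.5, min_pts=4)
print(labels)
```

In this sketch a border point is assigned to whichever cluster reaches it first, and noise points are left with the `NOISE` label rather than being removed from the data.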