lecture14

lecture14 - Data Mining CS57300 Purdue University October...

Data Mining CS57300 Purdue University October 26, 2010

Descriptive modeling: representation

Data mining components
• Task specification: Description
• Data representation: Homogeneous IID data
• Knowledge representation
• Learning technique

Descriptive models
• Descriptive models summarize the data
• Global summary
• Model main features of the data
• Two main approaches:
  • Cluster analysis
  • Density estimation

Modeling task
• Data representation: training set of $x^{(i)}$ instances
• Task: depends on approach
  • Clustering: partition the instances into groups of similar instances
  • Density estimation: determine a compact representation of the full joint distribution $P(\mathbf{X}) = P(X_1, X_2, \ldots, X_p)$
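To make the density estimation task concrete, here is a minimal sketch that estimates the joint distribution of discrete variables by relative frequency. It is a baseline illustration only, not a method taken from the slides: the function name `empirical_joint` and the toy data are hypothetical, and the resulting table is not compact, since it needs one entry per observed value combination.

```python
from collections import Counter


def empirical_joint(instances):
    """Estimate P(X_1, ..., X_p) as the relative frequency of each observed tuple."""
    counts = Counter(tuple(x) for x in instances)
    n = len(instances)
    return {values: count / n for values, count in counts.items()}


# Hypothetical training set with p = 2 binary variables.
data = [(0, 0), (0, 1), (0, 1), (1, 1)]
joint = empirical_joint(data)
```

The table can grow with every new combination of values, which is one reason to look for a more compact representation of the joint distribution.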

Cluster analysis
• Decompose or partition instances into groups such that:
  • Intra-group similarity is high
  • Inter-group similarity is low
• Measure of distance/similarity is crucial
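The two criteria above can be checked numerically for a given partition. The sketch below assumes Euclidean distance as the (dis)similarity measure, which the slides do not specify; the helper names `mean_intra_distance` and `mean_inter_distance` and the toy clusters are hypothetical.

```python
from itertools import combinations
from math import dist


def mean_intra_distance(clusters):
    """Average distance between pairs of points in the same cluster.

    Assumes at least one cluster has two or more points.
    """
    pairs = [dist(a, b) for c in clusters for a, b in combinations(c, 2)]
    return sum(pairs) / len(pairs)


def mean_inter_distance(clusters):
    """Average distance between pairs of points in different clusters.

    Assumes at least two non-empty clusters.
    """
    pairs = [
        dist(a, b)
        for c1, c2 in combinations(clusters, 2)
        for a in c1
        for b in c2
    ]
    return sum(pairs) / len(pairs)


# Hypothetical partition of four 2-D points into two groups.
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
intra = mean_intra_distance(clusters)
inter = mean_inter_distance(clusters)
```

A good partition under this measure has a small `intra` value relative to `inter`. Changing the distance function can change which partition looks best, which is why the choice of measure matters.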
Application examples
• Marketing: discover distinct groups in a customer base to develop targeted marketing programs
• Land use: identify areas of similar use in an earth observation database to understand geographic similarities
• City planning: group houses according to house type, value, and location to identify "neighborhoods"
• Earthquake studies: group observed earthquakes to see if they cluster along continental faults

Algorithm examples
• K-means clustering (partition-based)
• Spectral clustering (hierarchical-divisive)
• Nearest neighbor clustering (hierarchical-agglomerative)
• Mixture models (probabilistic model-based)

K-means
Groups represented by canonical item description(s)
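As an illustration of groups being represented by a canonical item, the following is a minimal sketch of the standard K-means iteration (Lloyd's algorithm), where each group is summarized by its centroid. It is not the lecture's own code. Assumptions: Euclidean distance, initial centroids taken from the first k points (random or k-means++ initialization is more common in practice), and the hypothetical function name `kmeans`.

```python
from math import dist


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    # Assumption: deterministic initialisation from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Hypothetical toy data: two well-separated pairs of 2-D points.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
centroids, labels = kmeans(points, k=2)
```

Each returned centroid is the "canonical item description" for its group: a single prototype point that stands in for all of the members.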

Nearest neighbor clustering
Clustering represented with a dendrogram
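A dendrogram can be stored as the ordered list of merges made by an agglomerative procedure. The sketch below assumes that "nearest neighbor clustering" means single-linkage agglomerative clustering with Euclidean distance. The slides do not confirm that reading, and the function name `single_linkage` is hypothetical.

```python
from math import dist


def single_linkage(points):
    """Agglomerative clustering with single linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of point indices.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical 1-D toy data.
points = [(0.0,), (1.0,), (5.0,), (7.0,)]
merges = single_linkage(points)
```

Reading `merges` from first to last gives the dendrogram from the leaves up, and cutting the history at a chosen distance yields a flat partition.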
Mixture models
$f(x) = \sum_{k=1}^{K} w_k f_k(x; \theta)$
Groups represented as mixture components (parameters and weights)
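The mixture density can be evaluated directly from the weights and component parameters. This sketch assumes one-dimensional Gaussian components, whereas the slide leaves the component family $f_k$ unspecified. The names `gaussian_pdf` and `mixture_pdf` and the numeric values are hypothetical.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))


def mixture_pdf(x, weights, params):
    """f(x) = sum_k w_k * f_k(x; theta), with Gaussian components.

    params[k] holds the (mean, std) portion of theta for component k.
    """
    return sum(w * gaussian_pdf(x, m, s) for w, (m, s) in zip(weights, params))


# Hypothetical two-component mixture; the weights must sum to 1.
weights = [0.3, 0.7]
params = [(0.0, 1.0), (5.0, 2.0)]
density_at_zero = mixture_pdf(0.0, weights, params)
```

Here each group corresponds to one `(mean, std)` pair together with its weight, matching the "parameters and weights" representation on the slide.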