lecture5 - Data Mining CS57300 Purdue University September...

Data Mining CS57300 Purdue University September 9, 2010
Data exploration and visualization
Visualization
• Human eye/brain have evolved powerful methods to detect structure in nature
• Display data in ways that exploit human pattern recognition abilities
• Limitation: Can be difficult to apply if data size (number of dimensions or instances) is large
Exploratory data analysis
• Data analysis approach that employs a number of (mostly graphical) techniques to:
  • Maximize insight into data
  • Uncover underlying structure
  • Identify important variables
  • Detect outliers and anomalies
  • Test underlying modeling assumptions
  • Develop parsimonious models
  • Generate hypotheses from data
Techniques
• Low-dimensional data
  • Summarizing data with simple statistics
  • Plotting raw data (1D, 2D, 3D)
• Higher-dimensional data
  • Principal component analysis
  • Multidimensional scaling
Data summarization
• Measures of location
  • Mean: $\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x(i)$
  • Median: value with 50% of points above and below
  • Quartile: the first (third) quartile is the value with 25% (75%) of points below it
  • Mode: most common value
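The measures of location listed above can be computed with Python's standard `statistics` module. This is a minimal sketch, not part of the original slides: the function name `location_summary` and the sample values are made up for illustration, and quartile conventions differ between tools (here the module's default "exclusive" method is used), so other software may report slightly different quartiles for the same data.

```python
import statistics


def location_summary(xs):
    """Return common measures of location for a 1-D numeric sample."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _q2, q3 = statistics.quantiles(xs, n=4)
    return {
        "mean": statistics.mean(xs),      # (1/n) * sum of x(i)
        "median": statistics.median(xs),  # 50% of points on each side
        "q1": q1,                         # about 25% of points below
        "q3": q3,                         # about 75% of points below
        "mode": statistics.mode(xs),      # most common value
    }


# Hypothetical illustrative data, not taken from the lecture.
sample = [2, 4, 4, 5, 7, 9, 11]
summary = location_summary(sample)
print(summary)
```

Comparing the mean with the median is a quick check for asymmetry: in this sample the mean (6) lies above the median (5), which hints at a longer right tail.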
Data summarization
• Measures of dispersion or variability
  • Variance: $\hat{\sigma}_k^2 = \frac{1}{n} \sum_{i=1}^{n} (x(i) - \mu)^2$
  • Standard deviation: $\hat{\sigma}_k = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x(i) - \mu)^2}$
  • Range: difference between max and min point
  • Interquartile range: difference between 1st and 3rd quartiles
  • Skew: $\frac{\sum_{i=1}^{n} (x(i) - \hat{\mu})^3}{\left(\sum_{i=1}^{n} (x(i) - \hat{\mu})^2\right)^{3/2}}$
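The dispersion measures above translate directly into a few lines of Python. This is a sketch under stated assumptions, not the lecture's own code: the name `dispersion_summary` and the sample data are hypothetical, the sample mean is used in place of $\mu$, and the variance divides by $n$ as in the slide rather than by $n - 1$. The skew follows the slide's formula as written; the more common moment-based skewness coefficient divides both sums by $n$ first and is therefore larger by a factor of $\sqrt{n}$.

```python
import math
import statistics


def dispersion_summary(xs):
    """Return measures of dispersion for a 1-D numeric sample.

    Assumes the sample has at least two distinct values, otherwise
    the skew denominator is zero.
    """
    n = len(xs)
    mu_hat = sum(xs) / n                           # sample mean stands in for mu
    ss2 = sum((x - mu_hat) ** 2 for x in xs)       # sum of squared deviations
    ss3 = sum((x - mu_hat) ** 3 for x in xs)       # sum of cubed deviations
    q1, _q2, q3 = statistics.quantiles(xs, n=4)    # default "exclusive" method
    return {
        "variance": ss2 / n,                       # divides by n, as on the slide
        "std": math.sqrt(ss2 / n),
        "range": max(xs) - min(xs),
        "iqr": q3 - q1,
        "skew": ss3 / ss2 ** 1.5,                  # slide's formula, no 1/n terms
    }


# Hypothetical illustrative data, not taken from the lecture.
sample = [2, 4, 4, 5, 7, 9, 11]
dispersion = dispersion_summary(sample)
print(dispersion)
```

A positive skew value, as this sample produces, indicates that the larger deviations lie above the mean.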
This note was uploaded on 03/13/2012 for the course CS 573 taught by Professor Staff during the Fall '08 term at Purdue University-West Lafayette.
