MiningFunctionalities

Data Mining Functionalities
- Classification and clustering
- Association rule mining
- Outlier detection
- Trend analysis

Data Mining Functionalities
Classification and Prediction
- Classification: e.g., classify countries based on climate, or classify cars based on gas mileage
- Methods: decision trees, rule-based approaches, neural networks, support vector machines
- Prediction: predict unknown or missing numerical values

Data Mining Functionalities
Cluster analysis
- Class label is unknown: group data to form new classes, e.g., cluster houses to find distribution patterns
- Principle: maximize the intra-class similarity and minimize the inter-class similarity
- Methods: exclusive clustering, overlapping clustering, hierarchical clustering, probabilistic clustering
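
The clustering principle above can be made concrete with a small sketch. It assumes Euclidean distance as the inverse of similarity (the slides do not fix a similarity measure), and the function names and sample data are hypothetical. A good clustering should show a small average distance within clusters and a large average distance between them.

```python
import math
from itertools import combinations


def mean_intra_distance(clusters):
    """Average pairwise distance between points in the same cluster."""
    dists = [math.dist(p, q) for c in clusters for p, q in combinations(c, 2)]
    return sum(dists) / len(dists)


def mean_inter_distance(clusters):
    """Average distance between points that belong to different clusters."""
    dists = [
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    ]
    return sum(dists) / len(dists)


# Hypothetical data: two tight, well-separated groups of 2-D points.
clusters = [[(1.0, 1.0), (1.0, 2.0)], [(8.0, 8.0), (9.0, 8.0)]]
intra = mean_intra_distance(clusters)
inter = mean_inter_distance(clusters)
print(intra, inter)
```

Low distance corresponds to high similarity, so a small `intra` together with a large `inter` is what the principle asks for.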

Clustering
- Unsupervised learning: finds “natural” groupings of instances given unlabeled data

Clusters: exclusive vs. overlapping
[Figure: instances a through k shown two ways: a simple 2-D representation (non-overlapping clusters) and a Venn diagram (overlapping clusters)]

Simple Clustering: K-means
Works with numeric data only
1) Pick a number (K) of cluster centers (at random)
2) Assign every item to its nearest cluster center (e.g., using Euclidean distance)
3) Move each cluster center to the mean of its assigned items
4) Repeat steps 2 and 3 until convergence (change in cluster assignments less than a threshold)
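
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes that the initial centers are drawn from the data points, that convergence means no assignment changed (a threshold of zero), and that an empty cluster keeps its old center. The function name, parameters, and sample points are hypothetical.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster numeric points (tuples of equal length) into k groups."""
    rng = random.Random(seed)
    # Step 1: pick K initial centers at random (assumption: K distinct data points).
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign every item to its nearest center (Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Step 4: stop once no assignment changed since the previous pass.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: move each center to the mean of its assigned items.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # assumption: an empty cluster keeps its old center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, assignment


# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignment = kmeans(points, k=2)
print(centers, assignment)
```

Which cluster receives index 0 depends on the random initial centers, so only the grouping is meaningful, not the label values.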

K-means example, step 1
- Pick 3 initial cluster centers k1, k2, k3 (randomly)
[Figure: scatter plot on X and Y axes showing the centers k1, k2, k3]

K-means example, step 2
- Assign each point to the closest cluster center
[Figure: scatter plot on X and Y axes showing the centers k1, k2, k3]

K-means example, step 3
- Move each cluster center to the mean of its cluster
[Figure: scatter plot on X and Y axes showing each of k1, k2, k3 at its old and new position]

K-means example, step 4
- Reassign points that are now closest to a different cluster center
- Q: Which points are reassigned?
[Figure: scatter plot on X and Y axes]
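
One way to answer the question for a given data set is to compare each point's old cluster with its nearest center after the centers have moved. The sketch below uses hypothetical one-row data (not the points from the slide figure) and hypothetical helper names.

```python
import math


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))


def mean_center(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(x) / len(members) for x in zip(*members))


# Hypothetical data: four points on a line and two initial centers.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 1.0), (10.0, 1.0)]
old_centers = [(1.0, 1.0), (4.0, 1.0)]

# Step 2: initial assignment to the nearest center.
old_assignment = [nearest(p, old_centers) for p in points]

# Step 3: move each center to the mean of its assigned points.
new_centers = [
    mean_center([p for p, a in zip(points, old_assignment) if a == c])
    for c in range(len(old_centers))
]

# Step 4: a point is reassigned if its nearest center has changed.
moved = [p for p, a in zip(points, old_assignment) if nearest(p, new_centers) != a]
print(new_centers, moved)
```

In this toy data the centers move to (1.5, 1.0) and (7.0, 1.0), so the point (4.0, 1.0) is now closer to the first center and is the only one reassigned.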