ISQS 6347, Data & Text Mining

Prediction Methods
  Use some variables to predict unknown or future values of other variables.
  - Classification
  - Regression
  - Deviation Detection

Description Methods
  Find human-interpretable patterns that describe the data.
  - Clustering
  - Association Rule Discovery
  - Sequential Pattern Discovery

YouTube Videos
  - What is data mining (3’32”)
  - SAS data mining (2’28”)
  - SAS presentation: Fraud analytics (2’21”)
  - Market basket analysis (2’36”)

Data Mining Process

What is Text Mining?
  Discover useful and previously unknown “gems” of information in large text collections.
  - Video (4’16”)

Motivation for Text Mining
  - Approximately 90% of the world’s data is held in unstructured formats (source: Oracle Corporation).
  - Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery.
  - 10%: structured numerical or coded information
  - 90%: unstructured or semi-structured information

Text Mining Process
  1. Text Preprocessing: syntactic/semantic text analysis
  2. Features Generation: bag of words
  3. Feature Selection: simple counting, statistics
  4. Text/Data Mining: classification (supervised learning), clustering (unsupervised learning)
  5. Analyzing Results

Open Source Data Mining Software: RapidMiner
  RapidMiner, formerly YALE (Yet Another Learning Environment), is an environment for machine learning, data mining, text mining, predictive analytics, and busin...
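The bag-of-words and simple-counting steps named in the text mining process above can be illustrated with a small sketch. This is not taken from the course materials and is not how SAS or RapidMiner implement these steps. The toy corpus, the function names, and the `min_df` threshold are all assumptions made for illustration, and only the Python standard library is used.

```python
import re
from collections import Counter

# Hypothetical toy corpus, invented for this illustration.
DOCS = [
    "Data mining finds patterns in data.",
    "Text mining finds gems in large text collections.",
    "Mining text data needs text preprocessing.",
]


def tokenize(text):
    """Text preprocessing: lowercase and keep runs of letters only."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    """Features generation: one term-count Counter per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def select_features(bags, min_df=2):
    """Feature selection by simple counting: keep terms that appear
    in at least `min_df` documents (an assumed threshold)."""
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # count each term once per document
    return sorted(term for term, n in doc_freq.items() if n >= min_df)


def vectorize(bags, vocab):
    """Turn each bag into a fixed-length count vector over `vocab`."""
    return [[bag[term] for term in vocab] for bag in bags]


bags = bag_of_words(DOCS)
vocab = select_features(bags, min_df=2)
matrix = vectorize(bags, vocab)

print(vocab)   # ['data', 'finds', 'in', 'mining', 'text']
print(matrix)  # [[2, 1, 1, 1, 0], [0, 1, 1, 1, 2], [1, 0, 0, 1, 2]]
```

Each row of `matrix` is one document expressed as term counts, which is the numeric form a classification or clustering step would then take as input.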
