lecture23 - Data Mining CS57300 Purdue University December...


Data Mining
CS57300
Purdue University
December 7, 2010

Announcements
  Qualifier: Dec 13, 9-10pm (after the final)
  Final report: deadline extended to Dec 15th, 4pm
  Please complete online student evaluations!

Data mining systems

How to choose a data mining system
  Commercial data mining systems have little in common
    Different data mining functionality or methodology
    May even work with completely different kinds of data
  Need to consider multiple dimensions in selection
    Data types: relational, transactional, sequential, spatial?
    Data sources: ASCII text files? Multiple relational data sources? Support for open database connectivity (ODBC) connections?
    System issues: running on only one or on several operating systems? A client/server architecture? Web-based interfaces and XML data as I/O?

Choosing a system
  Dimensions (cont):
    Data mining functions and methodologies
      One vs. multiple data mining functions
      One vs. variety of methods per function
      More functions and methods per function provide the user with greater flexibility and analysis power
    Coupling with DB and/or data warehouse systems
      Four forms of coupling: no coupling, loose coupling, semitight coupling, and tight coupling
      Ideally, a data mining system should be tightly coupled with a database system

Choosing a system
  Dimensions (cont):
    Scalability
      Row-based (or database size)?
      Column-based (or dimension)?
      Curse of dimensionality: it is much more challenging to make a system column scalable than row scalable
    Visualization tools
      A picture is worth a thousand words
      Data visualization, mining result visualization, mining process visualization, and visual data mining
    Data mining query language and graphical user interface
      Easy-to-use and high-quality graphical user interface
      Essential for user-guided, highly interactive data mining

Example data mining systems
  IBM InfoSphere Warehouse
    Wide range of data mining algorithms
    Scalable mining algorithms
    Toolkits: OLAP, data preparation, data visualization tools, unstructured data analysis
    Tight integration with IBM's DB2 relational db system
  SAS Enterprise Miner
    A variety of statistical analysis tools
    Data warehouse tools and multiple data mining algorithms
    Easy-to-use GUI

Example systems
  Microsoft SQL Server 2008
    Integrates DB and OLAP with multiple mining methods
    Supports Object Linking and Embedding Database (OLEDB): access to wider formats of data than just ODBC
  Vero Insight MineSet
    Multiple data mining algorithms and advanced statistics
    Advanced visualization tools (originally developed by Silicon Graphics)
  PASW Modeler (SPSS)
    Integrated data mining development environment for end-users and developers
    Multiple data mining algorithms and visualization tools

Example systems
  DBMiner (developed by Jiawei Han at SFU)
    Multiple data mining modules: discovery-driven OLAP analysis,...
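
The slides list four forms of coupling between a mining system and a database but do not define them. As a rough illustration only, the sketch below contrasts two ends of that idea: fetching raw rows and counting them in the application (closer to loose coupling), versus pushing the counting into the database as a SQL aggregate (a step toward semitight or tight coupling). This reading of the terms is an assumption, not the lecture's definition. The `purchases` table, its columns, and the helper names are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3
from collections import Counter


def make_db():
    """Build a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT)")
    rows = [
        ("a", "milk"), ("a", "bread"),
        ("b", "milk"),
        ("c", "milk"), ("c", "bread"), ("c", "eggs"),
    ]
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    return conn


def item_counts_loose(conn):
    """Fetch every row, then count item frequencies in the application."""
    rows = conn.execute("SELECT item FROM purchases").fetchall()
    return dict(Counter(item for (item,) in rows))


def item_counts_in_db(conn):
    """Let the database compute the same frequencies with GROUP BY."""
    query = "SELECT item, COUNT(*) FROM purchases GROUP BY item"
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = make_db()
    print(item_counts_loose(conn))
    print(item_counts_in_db(conn))
```

Both functions return the same item frequencies. The difference is where the work happens: the first moves every row out of the database, while the second moves only the aggregated result, which is one reason the slides favor tighter coupling for large data.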

This note was uploaded on 03/13/2012 for the course CS 573 taught by Professor Staff during the Fall '08 term at Purdue University-West Lafayette.
