DSCI4520_DMIntro_1


Lecture 1 - 1
DSCI 4520/5240 DATA MINING
DSCI 4520/5240 Data-Based Decision Support Systems (Data Mining)
Some slide material taken from or inspired by: Groth, Han and Kamber, Cerrito, SAS
Lecture 1 - 2
Introduction to DM
“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
(Sir Arthur Conan Doyle: Sherlock Holmes, "A Scandal in Bohemia")
Lecture 1 - 3
Nobel Laureate Calls Data Mining "A Must"
In an interview with ComputerWorld in January 1999, Dr. Penzias, who won the 1978 Nobel Prize in physics and was the vice president and chief scientist at Bell Laboratories, named large-scale data mining of very large databases as the key application for corporations in the next few years.
In response to ComputerWorld's age-old question, "What will be the killer applications in the corporation?", Dr. Penzias replied: "Data mining." He then added: "Data mining will become much more important and companies will throw away nothing about their customers because it will be so valuable. If you're not doing this, you're out of business."
Regarding the systems implications of this trend, Dr. Penzias commented: "There will be huge databases everywhere. They will get bigger than processors, so you have to back them up in some mountain in Tennessee at night."
Lecture 1 - 4
What Is Data Mining?
Data mining (knowledge discovery in databases): A process of identifying hidden patterns and relationships within data (Groth)
Data mining: Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases
Lecture 1 - 5
Motivation: “Necessity is the Mother of Invention”
Data explosion problem
  Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses, and other information repositories
We are drowning in data, but starving for knowledge!
Solution: Data warehousing and data mining
  Data warehousing and on-line analytical processing
  Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases
Lecture 1 - 6
Data Deluge
  electronic point-of-sale data
  hospital patient registries
  catalog orders
  bank transactions
  remote sensing images
  tax returns
  airline reservations
  credit card charges
  stock trades
  OLTP
  telephone calls
Data Mining, circa 1963
IBM 7090
600 cases
“Machine storage limitations