Data Mining

Data mining emerged in the 1980s, when the amount of data being generated and stored became overwhelming. It is strongly influenced by other disciplines such as mathematics, statistics, artificial intelligence, and data visualization. One difficulty with a new area is that the terminology is not fixed; the same concept may have different names in different applications.

We first see how data mining compares with other areas. Remember that we are using the working definition: data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data (W. Frawley).

Data Mining vs Statistics

Statistics can be viewed as the mathematical science and practice of developing knowledge through the use of empirical data expressed in quantitative form. Statistics allows us to discuss randomness and uncertainty via probability theory. For example, statisticians might compute the covariance (or correlation) of two variables to see whether the variables vary together and to measure the strength of the relationship. Data mining, by contrast, strives to characterize this dependency on a conceptual level and to produce a causal explanation and a qualitative description of the data. Although data mining uses ideas from statistics, it is definitely a different area.

Data Mining vs Machine Learning

Machine learning is a subfield of artificial intelligence that emerged in the 1960s with the objective of designing and developing algorithms and techniques that implement various types of learning. It has applications in areas as diverse as robot locomotion, medical diagnosis, computer vision, and handwriting recognition. The basic idea is to develop a learning system for a concept based on a set of examples provided by the teacher and any background knowledge. The main types are supervised and unsupervised learning (and modifications of these).
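The difference between the two main types of learning can be illustrated with a minimal Python sketch. It is not taken from the course material: the toy data, the labels "low" and "high", and the function names `nearest_neighbor_label` and `two_means` are all assumptions chosen for illustration. The supervised function learns from examples that a "teacher" has already labeled, while the unsupervised function receives only raw values and has to find the grouping itself.

```python
# Toy 1-D data; every value and label here is invented for illustration.
labeled = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
           (8.0, "high"), (8.5, "high"), (9.0, "high")]
unlabeled = [value for value, _ in labeled]  # same values, labels discarded


def nearest_neighbor_label(x, examples):
    """Supervised: predict the label of the closest labeled example (1-NN)."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(values, iterations=10):
    """Unsupervised: split values into two groups (1-D k-means with k=2)."""
    c0, c1 = min(values), max(values)  # simple deterministic starting centers
    g0, g1 = [], []
    for _ in range(iterations):
        # Assign each value to the nearer center.
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        # Move each center to the mean of its group (skip empty groups).
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    return sorted(g0), sorted(g1)


print(nearest_neighbor_label(2.4, labeled))  # uses the teacher's labels
print(two_means(unlabeled))                  # finds groups without labels
```

In the supervised case the output is one of the labels supplied with the examples; in the unsupervised case the output is only a grouping, and any interpretation of the groups is left to the analyst.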
Machine learning has influenced data mining, but the areas are quite different. Dr. Barbu in the Statistics Department offers a course in Machine Learning.

Data Mining vs Knowledge Discovery from Databases (KDD)

The concept of KDD emerged in the late 1980s, and it refers to the broad process of finding knowledge in data. Early on, KDD and data mining were used interchangeably, but now data mining is probably viewed in a broader sense than KDD.

Data Mining vs Predictive Analytics

According to Wikipedia, predictive analytics encompasses a variety of techniques from statistics, data mining, and game theory that analyze current and historical facts to make predictions about future events. The core of predictive analytics relies on capturing relationships between explanatory variables and the predicted variables from past occurrences and exploiting them to predict future outcomes. One aspect of data mining is predictive analytics.

Stages of Data Mining

1. Data gathering, e.g., data warehousing, web crawling
2. Data cleansing: eliminate errors and/or bogus data, e.g., patient fever =2....
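The data cleansing stage can be sketched as a simple plausibility check. This is only one possible approach, written under stated assumptions: the field name `temp_c`, the function name `split_plausible`, the sample records, and the bounds of 30 to 45 degrees Celsius are all hypothetical and do not come from the notes.

```python
# Hypothetical plausibility bounds for body temperature in degrees Celsius.
PLAUSIBLE_MIN_C = 30.0
PLAUSIBLE_MAX_C = 45.0


def split_plausible(records, field="temp_c"):
    """Separate records into (kept, rejected) using a simple range check."""
    kept, rejected = [], []
    for record in records:
        value = record.get(field)
        is_number = isinstance(value, (int, float))
        if is_number and PLAUSIBLE_MIN_C <= value <= PLAUSIBLE_MAX_C:
            kept.append(record)
        else:
            # Missing or out-of-range readings are set aside for review
            # rather than silently dropped.
            rejected.append(record)
    return kept, rejected


# Invented sample records for illustration only.
records = [
    {"patient": "A", "temp_c": 37.2},
    {"patient": "B", "temp_c": 372.0},  # possibly a misplaced decimal point
    {"patient": "C", "temp_c": None},   # missing reading
    {"patient": "D", "temp_c": 39.1},
]

kept, rejected = split_plausible(records)
print([r["patient"] for r in kept])
print([r["patient"] for r in rejected])
```

Keeping the rejected records in a separate list, rather than deleting them, makes it possible to inspect whether an implausible value is a recording error or a genuine outlier.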
This note was uploaded on 01/15/2012 for the course ISC 5315 taught by Professor Staff during the Spring '11 term at FSU.