ics is grounded in mathematics and deals with the science and practice of
analyzing empirical data. It is based on statistical theory, which is a
branch of applied mathematics. Within statistical theory, randomness and
uncertainty are modelled by probability theory. Today many statistical
methods are used in the field of KDD. Good overviews are given in Hastie et al.
(2001), Berthold & Hand (1999), and Maitra (2002).

1.3 Definition of Text Mining

Text mining, or knowledge discovery from text (KDT), first mentioned in
Feldman & Dagan (1995), deals with the machine-supported analysis of text.
It uses techniques from information retrieval, information extraction, and
natural language processing (NLP) and connects them with the algorithms and
methods of KDD, data mining, machine learning, and statistics. Thus, one
follows a procedure similar to the KDD process, except that the analysis
focuses on text documents rather than on data in general.
From this, new questions for the used data mining m...
This note was uploaded on 06/19/2011 for the course IT 2258 taught by Professor Aymenali during the Summer '11 term at Abu Dhabi University.