Natural Language Processing (NLP). ...es are used among other things for the processing of text.
Information Extraction (IE). The goal of information extraction methods is the
extraction of specific information from text documents. This information is stored in
database-like patterns (see Wilks (1997)) and is then available for further use. For
further details see Section 3.3.
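As an illustration of storing extracted information in a database-like pattern, the following is a minimal sketch, not a method taken from this survey. It assumes a single hand-written regular expression and made-up example sentences; the pattern, the slot names (`person`, `company`, `year`) and the function `extract_records` are all hypothetical.

```python
import re

# Hypothetical pattern for sentences of the form
# "<Person> joined <Company> in <year>" (an assumption for illustration).
PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)*) joined "
    r"(?P<company>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*) in (?P<year>\d{4})"
)

def extract_records(text):
    """Return one dict per match, with one entry per named slot."""
    return [match.groupdict() for match in PATTERN.finditer(text)]

# Made-up example text, not data from the source.
text = "Alice Smith joined Acme Corp in 2004. Later, Bob Jones joined Initech in 2007."
records = extract_records(text)
for record in records:
    print(record)
```

Each resulting record behaves like a table row whose columns are the slots of the pattern, so it can be filtered or stored like any other structured data.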
In the following, we will frequently refer to the above-mentioned related areas
of research. In particular, we will provide examples of the use of machine learning
methods in information extraction and information retrieval.
2 Text Encoding
For mining large document collections it is necessary to pre-process the text documents and store the information in a data structure that is more appropriate
for further processing than a plain text file. Even though several
methods now exist that also try to exploit the syntactic structure and semantics of
text, most text mining approaches are based on the idea that a text document
can be represented by a set of words, i.e. a text document is described based
on the set of words contained in it (bag-of-words representation). However, in
24 LDV-FORUM A Brief Survey of...
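The bag-of-words idea described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the survey's own procedure: the tokenization rule (lowercasing and keeping runs of the letters a to z), the choice to keep word counts rather than only set membership, the helper names `tokenize` and `bag_of_words`, and the two example sentences are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its runs of the letters a-z as tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(text):
    """Map each distinct word in the text to its number of occurrences."""
    return Counter(tokenize(text))

# Made-up example documents, not data from the source.
docs = [
    "Text mining extracts information from text.",
    "Information retrieval finds documents.",
]

bags = [bag_of_words(doc) for doc in docs]

# Shared vocabulary: every distinct word in the collection, in sorted order.
vocabulary = sorted(set().union(*bags))

# One count vector per document; a Counter returns 0 for absent words.
vectors = [[bag[word] for word in vocabulary] for bag in bags]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Word order and syntax are discarded in this representation: two documents containing the same words in a different order receive identical vectors.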