These are stored in database-like patterns (see Wilks (1997))


Natural Language Processing (NLP). ...es are used, among other things, for the processing of text.

Information Extraction (IE). The goal of information extraction methods is to extract specific information from text documents. The extracted information is stored in database-like patterns (see Wilks (1997)) and is then available for further use. For further details see section 3.3.

In the following, we will frequently refer to the above-mentioned related areas of research. In particular, we will provide examples of the use of machine learning methods in information extraction and information retrieval.

2 Text Encoding

For mining large document collections it is necessary to pre-process the text documents and store the information in a data structure that is more appropriate for further processing than a plain text file. Although several methods now exist that also try to exploit the syntactic structure and semantics of text, most text mining approaches are based on the idea that a text document can be represented by a set of words, i.e. a text document is described based on the set of words contained in it (bag-of-words representation). However, in ...

24 LDV-FORUM A Brief Survey of...
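The bag-of-words representation described in the Text Encoding section can be illustrated with a minimal Python sketch. The tokenizer (lowercasing and keeping only runs of the letters a-z), the function names, and the sample sentences are assumptions made for illustration and do not come from the survey. The text speaks of a set of words; the per-word counts shown alongside it are a common variant rather than something the excerpt specifies.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens.

    Assumption: a word is a run of the letters a-z. Real systems
    typically use a more careful tokenizer.
    """
    return re.findall(r"[a-z]+", text.lower())


def word_set(text):
    """Represent a document by the set of words it contains."""
    return set(tokenize(text))


def bag_of_words(text):
    """Represent a document by its word counts; word order is discarded."""
    return Counter(tokenize(text))


if __name__ == "__main__":
    # Hypothetical sample documents, not taken from the survey.
    doc_a = "Text mining extracts information from text documents."
    doc_b = "Documents text from information extracts mining text."

    print(sorted(word_set(doc_a)))
    print(bag_of_words(doc_a))

    # Both documents use the same words with the same frequencies, so
    # their representations are identical even though the order differs.
    print(bag_of_words(doc_a) == bag_of_words(doc_b))
```

Because word order is discarded, two documents that use the same words equally often become indistinguishable. That loss is what the methods exploiting syntactic structure and semantics try to address.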

This note was uploaded on 06/19/2011 for the course IT 2258 taught by Professor Aymenali during the Summer '11 term at Abu Dhabi University.
