ejsr_22_2_10 - European Journal of Scientific Research ISSN...

European Journal of Scientific Research
ISSN 1450-216X Vol.22 No.2 (2008), pp.241-250
© EuroJournals Publishing, Inc. 2008
http://www.eurojournals.com/ejsr.htm

Knowledge Discovery in Online Repositories: A Text Mining Approach

Fatudimu I.T
Department of Computer and Information Science, Covenant University, Ota, Nigeria
E-mail: ibkfat@yahoo.co.uk
Tel: +234-08052318494

Musa A.G
Department of Computer and Information Science, Covenant University, Ota, Nigeria
E-mail: adebola.musa@covenantuniversity.com

Ayo C.K
Department of Computer and Information Science, Covenant University, Ota, Nigeria
E-mail: ckayome@yahoo.com

Sofoluwe A. B
Department of Computer Science, University of Lagos, Lagos, Nigeria
E-mail: absofoluwe@yahoo.com

Abstract

Before the advent of the Internet, newspapers were the prominent instrument of mobilization for independence and political struggles. Since independence in Nigeria, the political class has adopted newspapers as a medium of political competition and communication. Consequently, most political information exists in unstructured form, hence the need to tap into it using a text mining algorithm. This paper implements a text mining algorithm on unstructured data drawn from newspapers. The algorithm involves the following natural language processing techniques: tokenization, text filtering and refinement. Following these steps, the association rule mining technique of data mining is used to extract knowledge using the Modified Generating Association Rules based on Weighting scheme (GARW). The main contribution of the technique is that it integrates an information retrieval scheme, Term Frequency Inverse Document Frequency (TF-IDF), which automatically selects the most discriminative keywords for use in association rule generation, with a data mining technique for association rule discovery.
The program is applied to pre-election information obtained from the website of the Nigerian Guardian newspaper. The extracted association rules contained important features and described the informative news in the document collection when related to the concluded 2007
presidential election. The system presented useful information that could help sanitize the polity as well as protect the nascent democracy.

Keywords: Text Mining, Data Mining, Association Rule Mining, Inference, Politics

1.0 Introduction

We have entered an era in which very large amounts of politically oriented text are available online. This includes both official documents, such as the full text of laws and the proceedings of legislative bodies, and unofficial documents, such as postings on weblogs (blogs) devoted to politics [1]. Fortunately, there are many tools at our disposal to manage this outbreak of textual information. Many of these tools are derived from earlier work in Information Retrieval (IR), Natural Language Processing, Statistics, Artificial Intelligence (AI), Information Theory and Data Mining [2].
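
The pipeline summarized in the abstract (tokenization, filtering, TF-IDF keyword selection, then association rule generation) can be illustrated with a minimal Python sketch. This is not the authors' Modified GARW algorithm, whose details are not given in this excerpt. It is a simplified stand-in that mines only pairwise rules. The stop-word list, function names, thresholds, and the TF-IDF variant (relative term frequency times the natural log of inverse document frequency) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_keywords(docs_tokens, top_k=3):
    """Return, for each document, the set of its top_k terms ranked by TF-IDF."""
    n_docs = len(docs_tokens)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in docs_tokens for term in set(tokens))
    keyword_sets = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keyword_sets.append(set(ranked[:top_k]))
    return keyword_sets


def pair_rules(keyword_sets, min_support=0.5, min_confidence=0.7):
    """Mine rules x -> y over keyword pairs (a simplification, not GARW).

    support    = fraction of documents containing both x and y
    confidence = documents containing both / documents containing x
    """
    n = len(keyword_sets)
    item_count = Counter(k for ks in keyword_sets for k in ks)
    pair_count = Counter(
        pair for ks in keyword_sets for pair in combinations(sorted(ks), 2)
    )
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    # Hypothetical headlines, invented purely to exercise the sketch.
    docs = [
        "The candidate addressed the election rally",
        "Election officials met the candidate and the party",
        "The party prepared for the election",
    ]
    keywords = tfidf_keywords([tokenize(d) for d in docs], top_k=3)
    for x, y, support, confidence in pair_rules(keywords, 0.3, 0.7):
        print(f"{x} -> {y} (support={support:.2f}, confidence={confidence:.2f})")
```

Selecting keywords by TF-IDF before rule mining keeps the itemsets small, which is the motivation the abstract gives for combining the two techniques. A full implementation would also generate rules over itemsets larger than pairs.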