CIT 590 Assignment 5: KWIC Index
Spring 2013, David Matuszek
2/10/13, www.cis.upenn.edu/~matuszek/cit590-2013/Assignments/05-kwic.html

Purposes of this assignment

- Give you some experience with file I/O.
- Give you more experience with text manipulation.

General idea of the assignment

A KWIC (Key Word In Context) index is an old, pre-digital way of looking things up, somewhat similar to a biblical concordance. The basic idea is that there are two kinds of words in English: stop words, which do not convey any information about the content of an article (for example, "the", "and", "of"), and keywords, which are basically everything else.

Your program will read in text files and write out (to a file) a KWIC index of all the keywords that it finds.

Details

You should provide two (or more) files, kwic.py and kwic_test.py.

Your program should start by reading in a list of stop words from a file named stop_words.txt (provided), in the same directory as your program.
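As a rough illustration of the steps described so far, here is a minimal sketch in Python. It is not the required solution. The function names (read_stop_words, find_keywords, kwic_entries, write_index) are hypothetical, and the sketch assumes that stop_words.txt separates words with whitespace and that each index entry is written as "keyword: line". The assignment text above specifies neither the stop word file layout nor the output format, so treat both as placeholders.

```python
import re


def read_stop_words(path="stop_words.txt"):
    """Return the stop words in the file as a set of lowercase strings.

    Assumption: words are separated by whitespace (one per line also works).
    """
    with open(path) as f:
        return {word.lower() for word in f.read().split()}


def find_keywords(line, stop_words):
    """Return the lowercased words of line that are not stop words, in order."""
    words = re.findall(r"[A-Za-z']+", line)
    return [w.lower() for w in words if w.lower() not in stop_words]


def kwic_entries(lines, stop_words):
    """Return sorted (keyword, line) pairs, one per keyword occurrence."""
    entries = []
    for line in lines:
        for keyword in find_keywords(line, stop_words):
            entries.append((keyword, line.strip()))
    return sorted(entries)


def write_index(entries, path):
    """Write one 'keyword: line' entry per row (an assumed output format)."""
    with open(path, "w") as f:
        for keyword, context in entries:
            f.write(f"{keyword}: {context}\n")
```

A typical call sequence would be to read the stop words once, pass the lines of each input text file to kwic_entries, and hand the combined result to write_index. Keeping file reading separate from the text processing also makes the processing functions easy to exercise from kwic_test.py without touching the file system.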