CIT590 KWIC



Source: www.cis.upenn.edu/~matuszek/cit590-2013/Assignments/05-kwic.html (2/10/13)

...r enters an empty string.

Next, for each file,

1. Create a dictionary; the keys will be keywords, and the associated values will be a list of line numbers on which the keyword occurs.
2. Read in the file as a list of lines.
3. For each line, find every keyword, and add the keyword + line number to the dictionary.
   Don't change the original line, but make a copy of it. In the copy, lowercase all words and remove all digits and all punctuation except apostrophes ('). For simplicity, we will consider every apostrophe as part of a word, even if it is used to quote something.
   If the (lowercased) keyword is not already in the dictionary, add it as key, with a list containing the line number as value. If the keyword is in the dictionary, add the line number to the associated list of line numbers.
4. Write (or append) to a local file named kwic_index.txt:
   1. The name of the input file (just once), and
   2. For each keyword found in the file, for each line on which the keyword occurs, print the line...
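Steps 1 to 3 can be sketched in Python roughly as follows. This is only an illustration under several assumptions, not the assignment's reference solution: the names normalize, build_index, and is_keyword are hypothetical, line numbers are taken to be 0-based list indices, and since the definition of a keyword is not part of this excerpt, every word counts as a keyword unless a different is_keyword test is passed in.

```python
import string

# Characters to delete: all digits and all punctuation except the apostrophe,
# which is kept as part of a word.
_DELETE = str.maketrans("", "", string.digits + string.punctuation.replace("'", ""))


def normalize(line):
    """Return a cleaned-up copy of line; the original string is left unchanged."""
    return line.lower().translate(_DELETE)


def build_index(lines, is_keyword=lambda word: True):
    """Map each keyword to the list of line numbers on which it occurs.

    is_keyword is a hypothetical hook: the excerpt does not say what counts
    as a keyword, so by default every word is accepted (an assumption).
    Line numbers are 0-based indices into lines (also an assumption).
    """
    index = {}
    for line_number, line in enumerate(lines):
        for word in normalize(line).split():
            if not is_keyword(word):
                continue
            if word not in index:
                # First occurrence: add the key with a one-element list.
                index[word] = [line_number]
            else:
                # Seen before: append to the existing list of line numbers.
                index[word].append(line_number)
    return index
```

For example, build_index(["The cat's 2 hats!", "A cat; the HAT."]) maps "the" to [0, 1] and "cat's" to [0]. Following the steps literally, a keyword that appears twice on one line is recorded twice for that line; whether that is intended is not clear from this excerpt.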

This document was uploaded on 02/02/2014.
