tagging_session

Python 2.4.1 (#1, Aug 31 2005, 06:49:06)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "copyright", "credits" or "license()" for more information.

    ****************************************************************
    Personal firewall software may warn about the connection IDLE
    makes to its subprocess using this computer's internal loopback
    interface.  This connection is not visible on any external
    interface and no data is sent to or received from the Internet.
    ****************************************************************

IDLE 1.1.1
>>> ================================ RESTART ================================
>>> fsock_train=open('data/really_tiny_train.tag','r',0)
>>> line = fsock_train.readline()
>>> line
'FACTSHEET_NN1 WHAT_DTQ IS_VBZ AIDS_NN1 ?_? \n'
>>> words = line.split()
>>> words
['FACTSHEET_NN1', 'WHAT_DTQ', 'IS_VBZ', 'AIDS_NN1', '?_?']
>>> word_tag_pairs = []
>>> for word in words:
        pair=word.split('_')
        word_tag_pairs[0:0] = [pair]

>>> word_tag_pairs
[['?', '?'], ['AIDS', 'NN1'], ['IS', 'VBZ'], ['WHAT', 'DTQ'], ['FACTSHEET', 'NN1']]
>>> tag_count={}
>>> word_tag_matrix={}
>>> for elem in words:
...     wt_pair = elem.split('_')
...     if len(wt_pair) ==2:
...         (word,tag) = wt_pair
...         word_tag_matrix[word,tag]=word_tag_matrix.get((word,tag),0)+1
...         tag_count[tag]=tag_count.get(tag,0)+1
...
>>>...
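The counting loop in the session can be wrapped in a function so that it can be reused on each line of a tagged file. The following is a minimal sketch under stated assumptions, not part of the original session: the name count_tags and the choice to return both dictionaries as a tuple are hypothetical. The loop body follows the session as written.

```python
# Sketch only: count_tags and its return shape are hypothetical,
# introduced here for illustration.
def count_tags(line):
    """Count (word, tag) pairs and tags in one line of WORD_TAG tokens."""
    word_tag_matrix = {}  # maps (word, tag) -> number of occurrences
    tag_count = {}        # maps tag -> number of occurrences
    for elem in line.split():
        wt_pair = elem.split('_')
        # Skip tokens that do not split into exactly one word and one tag.
        if len(wt_pair) == 2:
            word, tag = wt_pair
            word_tag_matrix[word, tag] = word_tag_matrix.get((word, tag), 0) + 1
            tag_count[tag] = tag_count.get(tag, 0) + 1
    return word_tag_matrix, tag_count


# The sample line is the one read from the training file in the session.
matrix, tags = count_tags('FACTSHEET_NN1 WHAT_DTQ IS_VBZ AIDS_NN1 ?_? \n')
print(tags['NN1'])            # 2
print(matrix['AIDS', 'NN1'])  # 1
```

Note that, as in the session, a token whose word part itself contains an underscore splits into more than two pieces and is silently skipped by the length check.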