CS-5340/6340, Solutions to Written Assignment #2
DUE: Wednesday, September 21, 2016 by 11:00pm

1. (15 pts) The table below contains frequency values for a set of nouns referring to trees in an imaginary text corpus. Fill in the table below with the unsmoothed probability of each noun, as well as the smoothed frequency and smoothed probability of each noun using add-one smoothing. You should assume that the vocabulary consists only of the nouns listed below.

IMPORTANT: Please show the fraction (numerator/denominator) used to compute each value as well as the final value (e.g., 2/4 = .50).

NOUN     FREQ   UNSMOOTHED PROB     SMOOTHED FREQ              SMOOTHED PROB
maple    600    600/1200 = .50      601 * 1200/1205 = 598.6    598.6/1200 = .499
oak      400    400/1200 = .33      401 * 1200/1205 = 399.4    399.4/1200 = .333
pine     180    180/1200 = .15      181 * 1200/1205 = 180.3    180.3/1200 = .150
spruce    20     20/1200 = .017      21 * 1200/1205 = 20.9      20.9/1200 = .0174
aspen      0      0/1200 = 0          1 * 1200/1205 = .996      .996/1200 = .0008
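The add-one computation in Question 1 can be checked with a short script. This is a minimal sketch, not part of the original solution: the function name `add_one_smoothing` and the dictionary layout are illustrative assumptions. It uses N = 1200 total tokens and V = 5 vocabulary items, so the smoothed frequency is (c + 1) * N / (N + V) and the smoothed probability is (c + 1) / (N + V). Because it works with exact fractions, its values can differ in the last digit from figures obtained by first rounding 1200/1205 to .996.

```python
from fractions import Fraction


def add_one_smoothing(counts):
    """Map each word to (unsmoothed_prob, smoothed_freq, smoothed_prob) as exact fractions.

    Hypothetical helper for illustration; assumes the vocabulary is exactly
    the keys of `counts`.
    """
    n = sum(counts.values())  # total number of tokens, N
    v = len(counts)           # vocabulary size, V
    table = {}
    for word, c in counts.items():
        unsmoothed = Fraction(c, n)
        # Add-one adjusted count: (c + 1) * N / (N + V)
        smoothed_freq = Fraction((c + 1) * n, n + v)
        # Dividing the adjusted count by N gives (c + 1) / (N + V)
        smoothed_prob = smoothed_freq / n
        table[word] = (unsmoothed, smoothed_freq, smoothed_prob)
    return table


counts = {"maple": 600, "oak": 400, "pine": 180, "spruce": 20, "aspen": 0}
table = add_one_smoothing(counts)

for word, (p, f_star, p_star) in table.items():
    print(f"{word:7s} {float(p):.4f} {float(f_star):8.3f} {float(p_star):.4f}")
```

Two quick sanity checks follow from the formulas: the smoothed frequencies still sum to N = 1200, and the smoothed probabilities sum to 1, even though the unseen noun (aspen) now receives a nonzero share.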
2. (16 pts) Consider the three Noun Phrase (NP) grammars and the three recursive transition networks (RTNs) below:

Grammar A Grammar B Grammar C
NP art NP1 NP NP1 NP NP1 NP1 adj NP1 NP1 art NP2 NP1 art NP2 NP1 NP2 NP1 NP2 NP2 adj NP2 NP2 noun NP2 adj NP2 NP2 adj NP3 NP2 noun NP2 NP2 NP3 NP3 noun NP3 NP4 NP3 noun noun NP4 noun NP4 NP3 noun NP3 NP4 noun

[Figure: three recursive transition networks, RTN-1, RTN-2, and RTN-3, with arcs labeled art, adj, noun, and pop]
Each grammar and RTN accepts a noun phrase “language” consisting of sequences of part-of-speech (POS) tags that are considered to be legal noun phrases. For example, “adj art noun” might be a POS tag sequence in a noun phrase.

For each pair below, indicate whether they accept exactly the SAME NP language or DIFFERENT NP languages (i.e., do they accept exactly the same set of POS tag sequences or not). If you answer

