Homework 3 Solution - CS-5340/6340 Solutions to Written Assignment #3

1. (28 pts) This question is about the Basilisk algorithm for semantic lexicon induction. Use the following seed words for the animal and human semantic classes:

    animal: bird, cat, dog, rat, snake
    human:  boy, girl, person, man, student

Consider the following contextual patterns paired with the words that they extract (i.e., co-occur with) in an imaginary text corpus:

    Pattern          Words
    SUBJ climbed     bear, boy, cat, cougar, monkey, squirrel
    SUBJ ran         bear, boy, cat, deer, dog, girl, man, mouse, squirrel, woman
    SUBJ ate         bird, boy, cat, dog, girl, man, owl, snake, woman
    SUBJ flew        bat, bird, canary, finch, hawk, owl, parrot, sparrow
    SUBJ nested      bird, finch, hawk, owl, parrot, sparrow, squirrel
    admired DOBJ     cougar, eagle, mother, father, hero
    caught DOBJ      bird, boy, cold, mouse, rat, snake, sparrow, squirrel
    hunted DOBJ      bear, deer, cougar
    treed DOBJ       bear, cougar
    invited DOBJ     boy, daughter, girl, lady, man, son, student, woman
    praised DOBJ     boy, daughter, dog, girl, son, student
    scared by NP     bear, cougar, dog, person, rat, shark, snake, spider, thunder
    cage for NP      bird, canary, finch, parrot, rat, snake
    wings of NP      bat, bird, finch, hawk, owl, parrot, sparrow

For the RlogF score, assume that the logarithm always returns a value of at least 1. That is,

    RlogF = (F_i / N_i) * log2(F_i)

UNLESS F_i = 0 or F_i = 1, in which case RlogF = F_i / N_i. Here F_i is the number of seed words of the class among the words extracted by pattern i, and N_i is the total number of words extracted by pattern i.

(a) (8 pts) Using the seed words above, compute Basilisk’s RlogF score for the animal class for each pattern below.

    SUBJ ran         RlogF = 2/10 * log2(2) = 0.2
    caught DOBJ      RlogF = 3/8 * log2(3) ≈ 0.594
    invited DOBJ     RlogF = 0/8 = 0   (F = 0, so no log factor is applied)
    cage for NP      RlogF = 3/6 * log2(3) ≈ 0.792

(b) (8 pts) Using the seed words above, compute Basilisk’s RlogF score for the human class for each pattern below.

    SUBJ ran         RlogF = 3/10 * log2(3) ≈ 0.475
    caught DOBJ      RlogF = 1/8 = 0.125   (F = 1, so the log factor is taken as 1)
    invited DOBJ     RlogF = 4/8 * log2(4) = 1.0
    cage for NP      RlogF = 0/6 = 0   (F = 0, so no log factor is applied)

(c) (6 pts) Assume that all patterns shown in the table are in Basilisk’s pattern pool. Compute the AvgLog score for the following words for the animal class. Use the true log2 value for this computation. Please show your work!

    cougar      AvgLog = (log2(1+1) + log2(0+1) + log2(3+1) + log2(0+1) + log2(0+1)) / 5
                       = (1 + 0 + 2 + 0 + 0) / 5 = 3/5 = 0.60

    squirrel    AvgLog = (log2(1+1) + log2(2+1) + log2(1+1) + log2(3+1)) / 4
                       = (1 + 1.585 + 1 + 2) / 4 ≈ 1.396

(d) (6 pts) Assume that all patterns shown in the table are in Basilisk’s pattern pool. Compute the AvgLog score for the following words for the human class. Use the true log2 value for this computation.
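
The RlogF and AvgLog computations in this question can be checked mechanically. The following is a minimal sketch, not part of the assignment or of Basilisk itself: the names SEEDS, PATTERNS, seed_hits, rlogf, and avglog are hypothetical helpers introduced here, and the data is copied from the seed lists and pattern table in the problem statement. It assumes, as the question states, that every pattern in the table is in the pattern pool.

```python
import math

# Seed words and pattern extractions, copied from the problem statement.
SEEDS = {
    "animal": {"bird", "cat", "dog", "rat", "snake"},
    "human": {"boy", "girl", "person", "man", "student"},
}

PATTERNS = {
    "SUBJ climbed": ["bear", "boy", "cat", "cougar", "monkey", "squirrel"],
    "SUBJ ran": ["bear", "boy", "cat", "deer", "dog", "girl", "man",
                 "mouse", "squirrel", "woman"],
    "SUBJ ate": ["bird", "boy", "cat", "dog", "girl", "man", "owl",
                 "snake", "woman"],
    "SUBJ flew": ["bat", "bird", "canary", "finch", "hawk", "owl",
                  "parrot", "sparrow"],
    "SUBJ nested": ["bird", "finch", "hawk", "owl", "parrot", "sparrow",
                    "squirrel"],
    "admired DOBJ": ["cougar", "eagle", "mother", "father", "hero"],
    "caught DOBJ": ["bird", "boy", "cold", "mouse", "rat", "snake",
                    "sparrow", "squirrel"],
    "hunted DOBJ": ["bear", "deer", "cougar"],
    "treed DOBJ": ["bear", "cougar"],
    "invited DOBJ": ["boy", "daughter", "girl", "lady", "man", "son",
                     "student", "woman"],
    "praised DOBJ": ["boy", "daughter", "dog", "girl", "son", "student"],
    "scared by NP": ["bear", "cougar", "dog", "person", "rat", "shark",
                     "snake", "spider", "thunder"],
    "cage for NP": ["bird", "canary", "finch", "parrot", "rat", "snake"],
    "wings of NP": ["bat", "bird", "finch", "hawk", "owl", "parrot",
                    "sparrow"],
}


def seed_hits(pattern, cls):
    """F: how many seed words of class `cls` the pattern extracts."""
    return sum(1 for word in PATTERNS[pattern] if word in SEEDS[cls])


def rlogf(pattern, cls):
    """RlogF under the assignment's convention that the log factor is >= 1."""
    f = seed_hits(pattern, cls)
    n = len(PATTERNS[pattern])
    # For F = 0 or F = 1 the score is plain F / N (log factor treated as 1).
    log_factor = math.log2(f) if f > 1 else 1.0
    return f / n * log_factor


def avglog(word, cls):
    """Average of log2(F_j + 1) over all patterns j that extract `word`."""
    hits = [seed_hits(p, cls) for p, words in PATTERNS.items() if word in words]
    return sum(math.log2(f + 1) for f in hits) / len(hits)
```

For example, rlogf("SUBJ ran", "animal") evaluates to 0.2, avglog("cougar", "animal") to 0.6, and avglog("squirrel", "animal") to roughly 1.396, in line with the worked answers. The same avglog function can be called with "human" as the class for part (d).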

