Lecture15-openvocab2


10/5/2011

CS 479, section 1: Natural Language Processing
Lecture #15: Good-Turing Smoothing (cont.), Open Vocabulary (2)

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.

Announcements
  Reading Report #6: M&S ch. 7
    Due: Wednesday
  Project #1, Part 2
    Help Session: Tuesday at 4pm in 1066 TMCB
    Early: Friday
    Due: next Monday
  Questions?

Objectives
  Fix problems with Good-Turing smoothing
  Prepare to use these techniques in Project #1, Part 2

Plots
  Start with Zipf:

Problem #1 (for Large k)
  [Figure: log-log plot of N_k against k]
  $r^* = (k + 1) \cdot \frac{N_{k+1}}{N_k}$

Solution #1: Power Law Fit
  Simple Good-Turing [Gale and Sampson]:
    Fit a monotone decreasing, non-zero function $N_k \approx a k^{b}$ ($b < -1$)
    Use the value of the fitted function instead of $N_k$ for unreliable k
    Common choice: beyond some k (the first one at which $N_k$ is non-decreasing or zero)
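To see the large-k problem concretely, here is a minimal Python sketch of the unsmoothed Good-Turing adjusted count. The function names (`count_of_counts`, `raw_good_turing`) and the toy token list are illustrative assumptions, not part of the lecture or the project code.

```python
from collections import Counter

def count_of_counts(tokens):
    """Return {k: N_k}: how many distinct word types occur exactly k times."""
    word_counts = Counter(tokens)
    return Counter(word_counts.values())

def raw_good_turing(k, n):
    """Unsmoothed adjusted count r* = (k + 1) * N_{k+1} / N_k."""
    if n.get(k, 0) == 0:
        raise ValueError(f"N_{k} is zero, so r* is undefined")
    return (k + 1) * n.get(k + 1, 0) / n[k]

# Toy corpus (made up for illustration): "a" occurs 5 times, "b" and "c" twice.
tokens = "a a a a a b b c c d e f".split()
n = count_of_counts(tokens)

for k in sorted(n):
    # Columns: k, N_k, raw adjusted count
    print(k, n[k], raw_good_turing(k, n))
```

In this toy corpus N_3 and N_6 are both zero, so the words seen 2 and 5 times receive an adjusted count of zero. Gaps of this kind at large k are what the power law fit is meant to repair.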
Power Law Fit
  $N_k \approx a k^{b}$
  Take the log: $\log N_k \approx \log a + b \log k$
  Therefore, we can do linear regression on the log values
  Note: Requires renormalization of the distribution afterwards
    Re-compute the "new" number of tokens

Problem #2 (for Large k)
  [Figure: log(N_k) against log(k), with fitted line y = -0.8763x + 6.1212]

Account for Zeros
  Solution: Averaging Transform (Gale)
  For k such that $N_k \neq 0$: $Z_k = \frac{2 N_k}{k_{\text{next}} - k_{\text{prev}}}$
    where $k_{\text{prev}}$ and $k_{\text{next}}$ are the nearest smaller and larger k with non-zero $N_k$
  [Figure: plot of log(Z_k), with fitted line y = -1.8064x + 10.456]

Solutions
  Actually applied in this order:
  1. Averaging transform
  2. Power law fit: a, b
  3. Use power law beyond some threshold K
  4. Now use Good-Turing!
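The four steps can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the project's reference implementation: all function names are hypothetical, the counts are made up, and the boundary handling (k_prev = 0 for the smallest k, and k_next = 2k - k_prev for the largest k) is an assumption taken from the usual description of Simple Good-Turing rather than from the slides.

```python
import math

def averaging_transform(n):
    """Z_k = 2 * N_k / (k_next - k_prev), over the k with N_k > 0."""
    ks = sorted(k for k, v in n.items() if v > 0)
    z = {}
    for i, k in enumerate(ks):
        # Assumption: k_prev = 0 for the smallest observed k.
        k_prev = ks[i - 1] if i > 0 else 0
        # Assumption: for the largest k, mirror the gap to the previous k.
        k_next = ks[i + 1] if i + 1 < len(ks) else 2 * k - k_prev
        z[k] = 2 * n[k] / (k_next - k_prev)
    return z

def fit_power_law(z):
    """Least-squares line through (log k, log Z_k); returns (a, b)."""
    xs = [math.log(k) for k in z]
    ys = [math.log(v) for v in z.values()]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b

def smoothed_adjusted_count(k, n, a, b, threshold):
    """Good-Turing r* using raw N_j for j < threshold and a * j**b otherwise.

    Assumes every raw N_j with j < threshold is present and non-zero.
    """
    def nk(j):
        return n[j] if j < threshold else a * j ** b
    return (k + 1) * nk(k + 1) / nk(k)

# Made-up counts of counts with gaps at k = 7, 9, 10, 11.
n = {1: 120, 2: 40, 3: 24, 4: 13, 5: 15, 6: 5, 8: 2, 12: 1}
z = averaging_transform(n)      # step 1
a, b = fit_power_law(z)         # step 2
# Steps 3 and 4: fitted values at and beyond threshold K = 5.
print(smoothed_adjusted_count(7, n, a, b, threshold=5))
```

The sketch stops at the adjusted counts. The renormalization noted on the Power Law Fit slide, which re-computes the "new" number of tokens, is still needed before these counts are turned into probabilities.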

This note was uploaded on 10/18/2011 for the course CS 479, taught by Professor Eric Ringger during the Fall '11 term at BYU.
