# lect7-probability.ppt - Probability Lecture #7

Andrew McCallum, UMass Amherst

Probability Lecture #7
Introduction to Natural Language Processing
CMPSCI 585, Fall 2007
University of Massachusetts Amherst

## Today's Main Points

- Remember (or learn) about probability theory:
  - samples, events, tables, counting
  - Bayes' Rule, and its application
  - a little calculus?
  - random variables
  - Bernoulli and multinomial distributions: the workhorses of computational linguistics
  - multinomial distributions from Shakespeare
## Probability Theory

- Probability theory deals with predicting how likely it is that something will happen.
  - Toss 3 coins: how likely is it that all come up heads?
  - See the phrase "more lies ahead": how likely is it that "lies" is a noun?
  - See "Nigerian minister of defense" in an email: how likely is it that the email is spam?
  - See "Le chien est noir": how likely is it that the correct translation is "The dog is black"?
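The coin question above can be answered by brute force. A minimal sketch, assuming three fair coins and equally likely outcomes, that enumerates the whole sample space and counts the favorable case:

```python
# Enumerate every equally likely outcome of tossing 3 fair coins
# and count the one where all come up heads.
from itertools import product

outcomes = list(product("HT", repeat=3))   # 2^3 = 8 basic outcomes
favorable = [o for o in outcomes if o == ("H", "H", "H")]
p = len(favorable) / len(outcomes)
print(p)  # 0.125, i.e. 1/8
```

The other three questions on the slide need probabilistic models of language rather than simple counting, which is exactly where the lecture is headed.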

## Probability and CompLing

- Probability is the backbone of modern computational linguistics, because:
  - language is ambiguous
  - we need to integrate evidence
- Simple example (which we will revisit later):
  - I see the first word of a news article: "glacier".
  - What is the probability the language is French? English?
  - Now I see the second word: "melange".
  - Now what are the probabilities?
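The glacier/melange example is a sequential Bayesian update. A minimal sketch, where the per-word likelihoods are invented illustrative numbers (not real corpus statistics), showing how each observed word reweights the language posterior:

```python
# Toy sequential Bayes update for language identification.
# The per-word probabilities below are hypothetical, chosen only
# to illustrate the mechanics of the update.

priors = {"English": 0.5, "French": 0.5}

# P(word | language) -- invented values for illustration
likelihood = {
    "glacier": {"English": 0.00001, "French": 0.00003},
    "melange": {"English": 0.000001, "French": 0.00002},
}

def update(posterior, word):
    """One Bayes step: multiply by the likelihood, then renormalize."""
    unnorm = {lang: posterior[lang] * likelihood[word][lang]
              for lang in posterior}
    z = sum(unnorm.values())
    return {lang: p / z for lang, p in unnorm.items()}

post = dict(priors)
for word in ["glacier", "melange"]:
    post = update(post, word)
    print(word, post)
```

Under these made-up numbers, "glacier" already tilts toward French, and "melange" pushes the posterior further in the same direction; each word is one multiplicative piece of evidence.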
## Experiments and Sample Spaces

- Experiment (or trial): a repeatable process by which observations are made, e.g. tossing 3 coins.
- We observe a basic outcome from the sample space Ω (the set of all possible basic outcomes), e.g.:
  - one coin toss: sample space Ω = {H, T}; basic outcome = H or T
  - three coin tosses: Ω = {HHH, HHT, HTH, …, TTT}
  - part of speech of a word: Ω = {CC₁, CD₂, CT₃, …, WRB₃₆}
  - lottery tickets: |Ω| = 10⁷
  - next word in a Shakespeare play: |Ω| = size of the vocabulary
  - number of words in your Ph.D. thesis: Ω = {0, 1, …} (discrete, countably infinite)
  - length of time of the "a" sound when I said "sample" (continuous, uncountably infinite)
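The finite sample spaces above can be built explicitly. A minimal sketch, assuming the coin and lottery examples from the slide:

```python
# Build small sample spaces explicitly; for large ones, just state |Ω|.
from itertools import product

one_toss = {"H", "T"}                       # Ω for one coin toss
three_tosses = set(product("HT", repeat=3))  # Ω for three coin tosses
print(len(one_toss), len(three_tosses))      # 2 8

# The lottery-ticket space is too large to enumerate comfortably,
# but its size is easy to state:
lottery_size = 10 ** 7
```

The thesis-length and vowel-duration examples cannot be enumerated at all: the first is countably infinite, the second uncountably infinite, which is why continuous sample spaces need densities rather than per-outcome probabilities.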

## Events and Event Spaces

- An event A is a set of basic outcomes, i.e. a subset of the sample space Ω.
  - Intuitively: a question you could ask about an outcome.
  - Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
  - e.g. basic outcome = THH
  - e.g. event = "has exactly 2 H's": A = {THH, HHT, HTH}
  - A = Ω is the certain event; A = ∅ is the impossible event.
  - For "not A" (the complement of A), we write Ā.
- A common event space, F, is the power set of the sample space Ω (the power set is written 2^Ω).
  - Intuitively: all possible questions you could ask about a basic outcome.
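Events really are just subsets, so set operations do all the work. A minimal sketch, assuming the slide's three-coin sample space with equally likely outcomes:

```python
# Events as subsets of the three-coin sample space.
from itertools import product

omega = set(product("HT", repeat=3))          # |Ω| = 8
A = {o for o in omega if o.count("H") == 2}   # event: "exactly 2 H's"
print(sorted(A))                               # the 3 outcomes THH, HTH, HHT
print(len(A) / len(omega))                     # 3/8 = 0.375

not_A = omega - A        # the complement "not A", written Ā on the slide
certain = omega          # A = Ω: the certain event
impossible = set()       # A = ∅: the impossible event
```

Because each basic outcome here is equally likely, the probability of an event is just its size divided by |Ω|; the event space 2^Ω itself has 2⁸ = 256 members, one per possible yes/no question about an outcome.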
## Probability

- A probability is a number between 0 and 1.
