Lecture Notes on Probability



...we assume that p(S) = ½.

Bayesian Spam Filters using Multiple Words

Example: We have 2000 spam messages and 1000 non-spam messages. The word “stock” occurs in 400 of the spam messages and 60 of the non-spam messages. The word “undervalued” occurs in 200 spam messages and 25 non-spam messages. Estimate the probability that a message containing both words is spam.

Solution: p(stock) = 400/2000 = 0.2, q(stock) = 60/1000 = 0.06, p(undervalued) = 200/2000 = 0.1, q(undervalued) = 25/1000 = 0.025. Assuming the two words occur independently and p(S) = ½,

    r(stock, undervalued) = p(stock) p(undervalued) / [p(stock) p(undervalued) + q(stock) q(undervalued)]
                          = (0.2)(0.1) / [(0.2)(0.1) + (0.06)(0.025)]
                          = 0.02 / 0.0215 ≈ 0.930

Since 0.930 exceeds a threshold of 0.9, we classify the message as spam and reject it.

In general, the more words we consider, the more accurate the spam filter. With the independence assumption, if we consider k words w_1, w_2, ..., w_k:

    r(w_1, w_2, ..., w_k) = Π_{i=1}^{k} p(w_i) / [Π_{i=1}^{k} p(w_i) + Π_{i=1}^{k} q(w_i)]

We can further improve the filter by treating pairs of words as a single block, or by looking for certain types of strings.

Section 6.4

Section Summary
  Expected Value
  Linearity of Expectations
  Average-Case Computational Complexity
  Geometric Distribution
  Independent Random Variables
  Variance
  Chebyshev’s Inequality

Expected Value

Definition: The expected value (or expectation or mean) of the random variable X(s) on the sample space S is equal to

    E(X) = Σ_{s ∈ S} p(s) X(s)

Example (Expected Value of a Die): Let X be the number that comes up when a fair die is rolled. What is the expected value of X?

Solution: The random variable X takes the values 1, 2, 3, 4, 5, or 6. Each has probability 1/6. It follows that

    E(X) = (1/6)(1 + 2 + 3 + 4 + 5 + 6) = 21/6 = 7/2

Theorem 1: If X is a random variable and p(X = r) is the probability that X = r, so that

    p(X = r) = Σ_{s ∈ S, X(s) = r} p(s)

then

    E(X) = Σ_{r ∈ X(S)} p(X = r) r

Proof: Suppose that X is a random variable with range X(S) and let p(X = r) be the probability that X takes...
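The two worked examples above (the spam estimate for “stock” and “undervalued”, and the expected value of a fair die) can be checked numerically. The following is a minimal Python sketch, not part of the original slides. It assumes the standard k-word estimate under the independence assumption with p(S) = ½, namely the product of the p values divided by the sum of the product of the p values and the product of the q values. The function names spam_estimate and expected_value are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from math import prod


def spam_estimate(p, q):
    """Estimate the probability that a message containing all k words is spam.

    p[i] is the fraction of spam messages containing word i and
    q[i] is the fraction of non-spam messages containing word i.
    Assumes the words occur independently and p(S) = 1/2.
    """
    spam_product = prod(p)
    non_spam_product = prod(q)
    return spam_product / (spam_product + non_spam_product)


def expected_value(distribution):
    """Compute E(X) as the sum of r * p(X = r) over the values r of X."""
    return sum(r * prob for r, prob in distribution.items())


# Counts from the example: 2000 spam and 1000 non-spam messages.
p = [Fraction(400, 2000), Fraction(200, 2000)]  # "stock", "undervalued" in spam
q = [Fraction(60, 1000), Fraction(25, 1000)]    # "stock", "undervalued" in non-spam
r = spam_estimate(p, q)
is_spam = r > Fraction(9, 10)  # threshold 0.9 from the example

# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die)
```

With exact fractions, r comes out to 40/43 (about 0.930), which is above the 0.9 threshold, and die_mean comes out to 7/2. Fractions are used instead of floats so that the results can be compared exactly with the hand calculation.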
