Expectation of the geometric distribution

What is the probability that $X$ is finite?
$$\sum_{k=1}^{\infty} f_X(k) = \sum_{k=1}^{\infty} (1-p)^{k-1} p = p \sum_{j=0}^{\infty} (1-p)^j = p \cdot \frac{1}{1-(1-p)} = 1$$

We can now compute $E(X)$:
$$E(X) = \sum_{k=1}^{\infty} k \cdot (1-p)^{k-1} p = p \left[ \sum_{k=1}^{\infty} (1-p)^{k-1} + \sum_{k=2}^{\infty} (1-p)^{k-1} + \sum_{k=3}^{\infty} (1-p)^{k-1} + \cdots \right]$$
$$= p \left[ \frac{1}{p} + \frac{1-p}{p} + \frac{(1-p)^2}{p} + \cdots \right] = 1 + (1-p) + (1-p)^2 + \cdots = \frac{1}{p}$$

So, for example, if the success probability $p$ is $1/3$, it will take on average 3 trials to get a success.

• All this computation for a result that was intuitively clear all along . . .

Variance and Standard Deviation

Expectation summarizes a lot of information about a random variable as a single number. But no single number can tell it all. Compare these two distributions:

• Distribution 1: $\Pr(49) = \Pr(51) = 1/4$; $\Pr(50) = 1/2$.
• Distribution 2: $\Pr(0) = \Pr(50) = \Pr(100) = 1/3$.

Both have the same expectation: 50. But the first is much less “dispersed” than the second. We want a measure of dispersion.

• One measure of dispersion is how far values are from the mean, on average. Given a random variable $X$, the quantity $(X(s) - E(X))^2$ measures how far the value of $X$ at $s$ is from the mean value (the expectation) of $X$.

Define the variance of $X$ to be
$$\mathrm{Var}(X) = E((X - E(X))^2) = \sum_{s \in S} \Pr(s)(X(s) - E(X))^2$$

The standard deviation of $X$ is
$$\sigma_X = \sqrt{\mathrm{Var}(X)} = \sqrt{\sum_{s \in S} \Pr(s)(X(s) - E(X))^2}$$

Why not use $|X(s) - E(X)|$ as the measure of distance instead of the squared difference?

• $(X(s) - E(X))^2$ turns out to have nicer mathematical properties.
• In $\mathbb{R}^n$, the distance between $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ is $\sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}$.

Example:

• The variance of distribution 1 is $\frac{1}{4}(51-50)^2 + \frac{1}{2}(50-50)^2 + \frac{1}{4}(49-50)^2 = \frac{1}{2}$.
• The variance of distribution 2 is $\frac{1}{3}(100-50)^2 + \frac{1}{3}(50-50)^2 + \frac{1}{3}(0-50)^2 = \frac{5000}{3}$.

Expectation and variance are two ways of compactly describing a distribution.
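The numbers above are easy to check by computer. A minimal Python sketch (the helper names `expectation`, `variance`, and `geometric_trial` are mine, not from the notes); exact rational arithmetic via `fractions.Fraction` reproduces $1/2$ and $5000/3$, and a simulation approximates the geometric mean $1/p$:

```python
import random
from fractions import Fraction as F

def expectation(dist):
    """E(X) for a finite distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())

def variance(dist):
    """Var(X) = sum over s of Pr(s) * (X(s) - E(X))^2."""
    mu = expectation(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

dist1 = {49: F(1, 4), 50: F(1, 2), 51: F(1, 4)}
dist2 = {0: F(1, 3), 50: F(1, 3), 100: F(1, 3)}

print(expectation(dist1), variance(dist1))  # 50 1/2
print(expectation(dist2), variance(dist2))  # 50 5000/3

def geometric_trial(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

# Estimate E(X) for the geometric distribution with p = 1/3 by simulation.
random.seed(0)
samples = [geometric_trial(1 / 3) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 1/p = 3
```

Both distributions have the same mean, 50, but their variances differ by a factor of more than 3000, which is exactly the point of the example.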
• They don’t completely describe the distribution.
• But they’re still useful!

Variance: Examples

Let $X$ be Bernoulli, with probability $p$ of success. Recall that $E(X) = p$.
$$\mathrm{Var}(X) = (0 - p)^2 \cdot (1-p) + (1 - p)^2 \cdot p = p(1-p)[p + (1-p)] = p(1-p)$$

Theorem: $\mathrm{Var}(X) = E(X^2) - E(X)^2$.

Proof:
$$E((X - E(X))^2) = E(X^2 - 2E(X)X + E(X)^2) = E(X^2) - 2E(X)E(X) + E(E(X)^2)$$
$$= E(X^2) - 2E(X)^2 + E(X)^2 = E(X^2) - E(X)^2$$

Think of this as $E((X - c)^2)$, then substitute $E(X)$ for $c$.
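The theorem is easy to confirm numerically for the Bernoulli case. A small sketch (the helper `moments` and the choice $p = 1/4$ are mine, for illustration): it computes the variance both from the definition $E((X - E(X))^2)$ and from the identity $E(X^2) - E(X)^2$, and compares both against the closed form $p(1-p)$:

```python
from fractions import Fraction as F

def moments(dist):
    """Return (E(X), E(X^2)) for a finite distribution {value: probability}."""
    ex = sum(pr * x for x, pr in dist.items())
    ex2 = sum(pr * x * x for x, pr in dist.items())
    return ex, ex2

p = F(1, 4)                       # illustrative success probability
bernoulli = {0: 1 - p, 1: p}      # Bernoulli(p) as {value: probability}

ex, ex2 = moments(bernoulli)
var_def = sum(pr * (x - ex) ** 2 for x, pr in bernoulli.items())  # E((X - E(X))^2)
var_id = ex2 - ex ** 2                                            # E(X^2) - E(X)^2

print(var_def, var_id, p * (1 - p))  # all three equal: 3/16 3/16 3/16
```

For $p = 1/4$: $E(X) = 1/4$, $E(X^2) = 1/4$ (since $X^2 = X$ for a 0/1 variable), so both routes give $1/4 - 1/16 = 3/16 = p(1-p)$.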
This note was uploaded on 05/21/2009 for the course CS 2800 taught by Professor Selman during the Spring '07 term at Cornell.