Review of Probability and Statistics

1. Random Variables

Suppose we are to perform a random experiment whose outcome cannot be predicted in advance. The set of all possible outcomes of the random experiment is called the sample space.

1. If the experiment is tossing a die, $S = \{1, 2, 3, 4, 5, 6\}$.
2. If the experiment is flipping two coins simultaneously, $S = \{HH, TH, HT, TT\}$.
3. If the experiment is observing whether you are awake at the end of the lecture, $S = \{\text{awake}, \text{asleep}\}$.

A random variable is a function that maps the sample space to real numbers. The value of a random variable is determined by the outcome of the random experiment. We will typically denote the random variables with capital letters (e.g., $X$, $Y$, $Z$).

1. $X = 1$ if the outcome is [...], $0$ if the outcome is [...].
2. $X$ = number of heads: $X = 2$ if the outcome is $HH$, $1$ if the outcome is $HT$ or $TH$, $0$ if the outcome is $TT$.
3. $X = 0$ if the outcome is asleep, $1$ if the outcome is awake.

Discrete random variables: The random variables that we looked at in the examples
above can take only discrete values. Hence, they are called discrete random variables. Take the random variable

$X = \begin{cases} 1 & \text{if you are awake at the end of the lecture} \\ 0 & \text{otherwise.} \end{cases}$

Assume that we observe your state after $n$ lectures. Let $X_i$ be defined as

$X_i = \begin{cases} 1 & \text{if you are awake at the end of the } i\text{-th lecture} \\ 0 & \text{otherwise.} \end{cases}$

Construct a histogram for $\{X_i : i = 1, \ldots, n\}$ by computing

$p_0(n) = [\text{no. of times we observe outcome } 0]/n$
$p_1(n) = [\text{no. of times we observe outcome } 1]/n.$

As $n$ tends to infinity, $p_0(n) \to p(0)$ and $p_1(n) \to p(1)$.

The function $p(x) = \mathbb{P}\{X = x\}$ is called the probability mass function of the random variable $X$. For this example, the function might look like

[Figure: plot of $p(x)$ against $x$, with bars at $x = 0$ and $x = 1$.]

The probability law governing a random variable $X$ can also be described by
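the cumulative distribution function

$F(x) = \mathbb{P}\{X \le x\} = \sum_{i=-\infty}^{x} p(i).$

We have the following properties for the probability mass function and the cumulative distribution function:

$p(x) \ge 0$ for all $x$, and $\sum_{x=-\infty}^{\infty} p(x) = 1$.

$F(x) \ge 0$ for all $x$, $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.

$\mathbb{P}\{a \le X \le b\} = \sum_{i=a}^{b} p(i) = F(b) - F(a-1)$.

Note that knowing $p$ is the same as knowing $F$ because

$p(x) = \sum_{i=-\infty}^{x} p(i) - \sum_{i=-\infty}^{x-1} p(i) = F(x) - F(x-1).$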
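The relative-frequency construction and the relation between $p$ and $F$ can be tried out numerically. The sketch below is only an illustration: the probability of being awake (`p_awake = 0.7`) is a made-up value, and the helper names are hypothetical.

```python
import random
from itertools import accumulate

def empirical_pmf(samples, support):
    """Relative frequency of each value in `support` among the samples."""
    n = len(samples)
    return [sum(1 for s in samples if s == v) / n for v in support]

def cdf_from_pmf(pmf):
    """F(x) as the running sum of p(i) over i <= x."""
    return list(accumulate(pmf))

def pmf_from_cdf(cdf):
    """Recover p(x) = F(x) - F(x - 1) from the cdf values."""
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]

rng = random.Random(0)
p_awake = 0.7  # assumed value, not taken from the notes
samples = [1 if rng.random() < p_awake else 0 for _ in range(100_000)]

pmf = empirical_pmf(samples, [0, 1])  # [p_0(n), p_1(n)]
cdf = cdf_from_pmf(pmf)               # [F(0), F(1)]
recovered = pmf_from_cdf(cdf)         # matches pmf up to rounding
```

With a large number of samples, `pmf[1]` should be close to the assumed `p_awake`, and `recovered` reproduces `pmf`, which is the sense in which knowing $p$ and knowing $F$ are equivalent.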
Continuous random variables: Consider the random variable

$X$ = the distance between your upper and lower eyelids at the end of the lecture.

The values that $X$ can take lie in a continuum. Hence, it is called a continuous random variable. Assume that we observe the state of your eyelids after $n$ lectures. Let $X_i$ be the measurement at the end of the $i$-th lecture. Construct a histogram for $\{X_i : i = 1, \ldots, n\}$, which might look like

[Figure: histogram of the measurements, with frequency on the vertical axis.]

As $n$ tends to infinity, with the width of the intervals converging to 0 at an appropriate rate, the histogram converges to a function $f$ which might look like

[Figure: plot of the limiting function $f$.]

The function $f$ is called the probability density function of the random variable $X$. It is not true that $f(x) = \mathbb{P}\{X = x\}$. In fact, for any value $x$, $\mathbb{P}\{X = x\} = 0$. But, for $a \le b$,

$\mathbb{P}\{a \le X \le b\} = \int_a^b f(x)\,dx.$

This is essentially the area underneath the probability density function over the interval $[a, b]$. The cumulative distribution function is defined as

$F(x) = \mathbb{P}\{X \le x\} = \int_{-\infty}^{x} f(u)\,du.$

If $f$ is continuous in an interval containing $x$, then $f(x) = F'(x)$.
The probability density and cumulative distribution function of a continuous random variable satisfy

$f(x) \ge 0$ for all $x$, and $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

$F(x) \ge 0$ for all $x$, $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$.

$\mathbb{P}\{a \le X \le b\} = \int_a^b f(x)\,dx = \int_{-\infty}^{b} f(x)\,dx - \int_{-\infty}^{a} f(x)\,dx = F(b) - F(a)$.
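These properties can be checked numerically for a concrete density. The sketch below uses $f(x) = x^3/4$ on $[0, 2]$, the density this section works with as its example, and a plain midpoint rule; the function names and the step count are arbitrary choices made for illustration.

```python
def f(x):
    """Density f(x) = x^3 / 4 on [0, 2], zero elsewhere."""
    return 0.25 * x**3 if 0.0 <= x <= 2.0 else 0.0

def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def F(x):
    """Cumulative distribution function obtained by integrating the density."""
    if x <= 0.0:
        return 0.0
    return integrate(f, 0.0, min(x, 2.0))

total = integrate(f, 0.0, 2.0)     # should be close to 1
prob = integrate(f, 1.0, 1.5)      # P{1.0 <= X <= 1.5} as an area
diff = F(1.5) - F(1.0)             # the same probability via the cdf
```

Both `prob` and `diff` approximate the same area under the density, which is the content of the last property.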
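Example: The probability density function of the random variable $X$ is given by

$f(x) = \begin{cases} \frac{1}{4}x^3 & \text{if } 0 \le x \le 2 \\ 0 & \text{otherwise.} \end{cases}$

Then,

$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ \int_0^x \frac{1}{4}u^3\,du = \frac{x^4}{16} & \text{if } 0 \le x \le 2 \\ 1 & \text{if } x \ge 2 \end{cases}$

$\mathbb{P}\{1.0 \le X \le 1.5\} = F(1.5) - F(1.0) = \frac{(1.5)^4}{16} - \frac{(1.0)^4}{16} \approx 0.254.$

2. Summary Measures of a Random Variable

Expected value (mean, average): If $X$ is a discrete random variable, then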
$\mathbb{E}X = \sum_{x=-\infty}^{\infty} x\,p(x).$

If $g$ is a real-valued function, then

$\mathbb{E}\{g(X)\} = \sum_{x=-\infty}^{\infty} g(x)\,p(x).$
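As a small illustration of the two sums above, the sketch below computes $\mathbb{E}X$ and $\mathbb{E}\{X^2\}$ for the die experiment. It assumes the die is fair, which the notes do not state, and the helper name `expectation` is hypothetical.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E{g(X)} = sum over x of g(x) * p(x), for a pmf given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

# Assumption: a fair six-sided die, so p(x) = 1/6 for x = 1, ..., 6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(die)                         # E X = 7/2
mean_square = expectation(die, lambda x: x * x) # E{X^2} = 91/6
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with a hand calculation.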
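If $X$ is a continuous random variable, then

$\mathbb{E}X = \int_{-\infty}^{\infty} x f(x)\,dx.$

If $g$ is a real-valued function, then

$\mathbb{E}\{g(X)\} = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$

For any two random variables $X$ and $Y$, we have

$\mathbb{E}\{aX\} = a\,\mathbb{E}X$
$\mathbb{E}\{aX + bY\} = a\,\mathbb{E}X + b\,\mathbb{E}Y.$

Example: The probability density function of the random variable $X$ is given by

$f(x) = \begin{cases} \frac{1}{4}x^3 & \text{if } 0 \le x \le 2 \\ 0 & \text{otherwise.} \end{cases}$

Then,

$\mathbb{E}\{X^2\} = \int_0^2 x^2 \cdot \frac{1}{4}x^3\,dx = \frac{x^6}{24}\Big|_0^2 = \frac{64}{24} = \frac{8}{3}.$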
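Note that $\mathbb{E}X$ is a deterministic quantity.

Variance: In general, we have

$\mathrm{Var}(X) = \mathbb{E}\{(X - \mathbb{E}X)^2\} = \mathbb{E}\{X^2 - 2X\,\mathbb{E}X + (\mathbb{E}X)^2\} = \mathbb{E}\{X^2\} - 2(\mathbb{E}X)^2 + (\mathbb{E}X)^2 = \mathbb{E}\{X^2\} - (\mathbb{E}X)^2.$

Example: The probability density function of the random variable $X$ is given by

$f(x) = \begin{cases} \frac{1}{4}x^3 & \text{if } 0 \le x \le 2 \\ 0 & \text{otherwise.} \end{cases}$

Then,

$\mathrm{Var}(X) = \frac{8}{3} - \left(\frac{8}{5}\right)^2 = \frac{8}{75},$

where

$\mathbb{E}X = \int_0^2 x \cdot \frac{1}{4}x^3\,dx = \frac{x^5}{20}\Big|_0^2 = \frac{32}{20} = \frac{8}{5}.$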
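The numbers in this example can be double-checked with exact arithmetic. The sketch below relies on the antiderivative of $x^k \cdot x^3/4$, so it only applies to this particular density; the function name `moment` is made up for illustration.

```python
from fractions import Fraction

def moment(k):
    """E{X^k} for f(x) = x^3 / 4 on [0, 2].

    The integrand x^(k+3) / 4 has antiderivative x^(k+4) / (4 (k+4)),
    evaluated between 0 and 2.
    """
    return Fraction(2 ** (k + 4), 4 * (k + 4))

total_mass = moment(0)            # integral of the density, should be 1
mean = moment(1)                  # E X
second_moment = moment(2)         # E{X^2}
variance = second_moment - mean ** 2  # Var(X) = E{X^2} - (E X)^2
```

The values agree with the hand computation: `mean` is 8/5, `second_moment` is 8/3, and `variance` is 8/75.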
3. Independence

Roughly speaking, the random variables $X$ and $Y$ are independent if the knowledge of one of them tells us nothing about the value of the other. One measure of the dependence between two random variables is the covariance, which is computed as

$\mathrm{Cov}(X, Y) = \mathbb{E}\{(X - \mathbb{E}X)(Y - \mathbb{E}Y)\}$
$= \mathbb{E}\{XY - X\,\mathbb{E}Y - Y\,\mathbb{E}X + \mathbb{E}X\,\mathbb{E}Y\}$
$= \mathbb{E}\{XY\} - \mathbb{E}X\,\mathbb{E}Y - \mathbb{E}X\,\mathbb{E}Y + \mathbb{E}X\,\mathbb{E}Y$
$= \mathbb{E}\{XY\} - \mathbb{E}X\,\mathbb{E}Y.$

If $X$ and $Y$ are independent, then their covariance is 0. However, two random variables can have covariance 0 and still be dependent. We have

$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y).$
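A standard illustration of covariance 0 without independence, not taken from these notes, is $X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$. The sketch below computes the covariance exactly, shows that the joint probability does not factor, and checks the variance formula for the arbitrary choice $a = 2$, $b = 3$.

```python
from fractions import Fraction

# Assumed example: X uniform on {-1, 0, 1} and Y = X^2, so Y is a function of X.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def E(h):
    """Expectation of h(X, Y) under the joint pmf."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

def var(h):
    """Variance of h(X, Y), computed as E{h^2} - (E h)^2."""
    return E(lambda x, y: h(x, y) ** 2) - E(h) ** 2

cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Dependence check: P{X = 0, Y = 0} versus P{X = 0} * P{Y = 0}.
p_joint = joint[(0, 0)]
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)

a, b = 2, 3  # arbitrary coefficients
lhs = var(lambda x, y: a * x + b * y)
rhs = a * a * var(lambda x, y: x) + b * b * var(lambda x, y: y) + 2 * a * b * cov
```

Here `cov` is exactly 0 while `p_joint` (1/3) differs from `p_x0 * p_y0` (1/9), so the two variables are dependent; `lhs` and `rhs` agree, as the variance formula requires.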
When $X$ and $Y$ are independent,

$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y).$

4. Normal Distribution

A continuous random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$ if its probability density function is

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \quad -\infty < x < \infty.$

(We will use $N(\mu, \sigma^2)$ to denote a normally distributed random variable with mean $\mu$ and variance $\sigma^2$.) This probability density function looks like

[Figure: bell-shaped density centered at $\mu$.]

This probability density function is symmetric around the mean $\mu$. Thus, if $X \sim N(\mu, \sigma^2)$, then

$\mathbb{P}\{X \le \mu - a\} = \mathbb{P}\{X \ge \mu + a\}.$

If $X \sim N(\mu, \sigma^2)$, then

$aX + b \sim N(a\mu + b, a^2\sigma^2)$

and, in particular,

$\frac{X - \mu}{\sigma} \sim N(0, 1).$
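Because $(X - \mu)/\sigma \sim N(0, 1)$, probabilities for any normal random variable reduce to the standard normal cumulative distribution function. The sketch below evaluates that function through the identity $\Phi(z) = \frac{1}{2}\left[1 + \mathrm{erf}(z/\sqrt{2})\right]$ instead of a printed table; the values of `mu`, `sigma`, and `a` are arbitrary illustrations.

```python
import math

def std_normal_cdf(z):
    """P{Z <= z} for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P{X <= x} for X ~ N(mu, sigma^2), by standardizing to (x - mu) / sigma."""
    return std_normal_cdf((x - mu) / sigma)

mu, sigma, a = 10.0, 2.0, 3.0  # hypothetical parameters

left_tail = normal_cdf(mu - a, mu, sigma)          # P{X <= mu - a}
right_tail = 1.0 - normal_cdf(mu + a, mu, sigma)   # P{X >= mu + a}
```

The two tails agree up to rounding, which is the symmetry property stated above.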
If $X \sim N(\mu_1, \sigma_1^2)$, $Y \sim N(\mu_2, \sigma_2^2)$, and $X$ and $Y$ are independent, then

$X + Y \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2).$

If $X \sim N(\mu, \sigma^2)$, then

$\mathbb{P}\{X \le x\} = \mathbb{P}\left\{\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right\} = \mathbb{P}\left\{Z \le \frac{x - \mu}{\sigma}\right\}, \quad Z \sim N(0, 1).$

Therefore, knowing the cumulative distribution function for $N(0, 1)$ is enough to deduce the cumulative distribution function for any normally distributed random variable. Letting $Z \sim N(0, 1)$, $\mathbb{P}\{Z \le x\}$ is tabulated for different values of $x$ in the appendix of your textbook.

5. Sums and Averages of Independent Random Variables

Suppose that $X_1, X_2, \ldots$ are independent random variables that are uniformly distributed over the interval $[0, 1]$. Then, $\mathbb{E}X_1 = 1/2$, $\mathrm{Var}(X_1) = 1/12$. The probability density function of $X_1$ looks like

[Figure: density $f_{X_1}(x)$.]

The probability density functions of $X_1 + X_2$ and $(X_1 + X_2)/2$ look like

[Figure: densities $f_{X_1 + X_2}(x)$ and $f_{(X_1 + X_2)/2}(x)$.]

$F_{(X_1 + X_2)/2}(x) = \mathbb{P}\left\{\frac{X_1 + X_2}{2} \le x\right\} = \mathbb{P}\{X_1 + X_2 \le 2x\} = F_{X_1 + X_2}(2x)$, so that $f_{(X_1 + X_2)/2}(x) = 2 f_{X_1 + X_2}(2x)$.

The probability density functions of $X_1 + X_2 + X_3$ and $(X_1 + X_2 + X_3)/3$ look like

[Figure: densities $f_{X_1 + X_2 + X_3}(x)$ and $f_{(X_1 + X_2 + X_3)/3}(x) = 3 f_{X_1 + X_2 + X_3}(3x)$.]

From these plots, we can make some qualitative observations about the sum of $n$ independent random variables. These observations hold in far greater generality than this example.
1. The distribution of the average is just a re-scaling of the distribution of the sum.
2. As $n$ increases, the density of the average concentrates in the middle.
3. As $n$ increases, the variability of the sum increases.
4. As $n$ increases, the variability of the average decreases.

We can make the last two statements more concrete. Let $X_1, X_2, \ldots$ be independent and identically distributed (i.i.d.) random variables with finite mean and variance. Define

$S_n = X_1 + \ldots + X_n$
$\bar{X}_n = \frac{1}{n}[X_1 + \ldots + X_n].$

We call $\bar{X}_n$ the sample mean. Then,

$\mathrm{Var}(S_n) = \mathrm{Var}(X_1 + \ldots + X_n) = \mathrm{Var}(X_1) + \ldots + \mathrm{Var}(X_n) = n\,\mathrm{Var}(X_1)$
$\mathrm{Var}(\bar{X}_n) = \mathrm{Var}\left(\frac{1}{n}[X_1 + \ldots + X_n]\right) = \frac{1}{n^2}\,n\,\mathrm{Var}(X_1) = \frac{\mathrm{Var}(X_1)}{n}.$

Roughly speaking, the sum of $n$ i.i.d. random variables is $n$ times as variable as any one of the random variables, whereas the average of $n$ i.i.d. random variables is $1/n$ times as variable as any one of the random variables.

This discussion is related to the law of large numbers and central limit theorem. We have seen that the density of the sample mean $\bar{X}_n$ starts to cluster
around the true mean $\mathbb{E}X_1$ as $n$ grows. The law of large numbers formalizes this notion.

Law of large numbers: Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables with $\mathbb{E}X_1 < \infty$. Then, "almost always"

$\frac{1}{n}\sum_{i=1}^{n} X_i \to \mathbb{E}X_1$ as $n \to \infty$.

Note that for finite $n$, $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is still a random variable, whereas $\mathbb{E}X_1$ is always a deterministic quantity. It is a natural question to ask how much the sample mean $\bar{X}_n$ differs from $\mathbb{E}X_1$ when $n$ is finite. An answer to this question is provided by the central limit theorem. In particular, consider the deviation between $\bar{X}_n$ and $\mathbb{E}X_1$ given by $\bar{X}_n - \mathbb{E}X_1$. The expectation of the deviation is

$\mathbb{E}\{\bar{X}_n - \mathbb{E}X_1\} = \mathbb{E}\{\bar{X}_n\} - \mathbb{E}X_1 = 0.$

On the other hand, the variance of the deviation is

$\mathrm{Var}(\bar{X}_n - \mathbb{E}X_1) = \mathrm{Var}(\bar{X}_n) = \frac{\mathrm{Var}(X_1)}{n}.$

The central limit theorem tells us how the deviation between $\bar{X}_n$ and $\mathbb{E}X_1$ behaves as $n$ gets large.

Central limit theorem: Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables with
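variance $\mathrm{Var}(X_1) = \sigma^2$ and $\sigma^2 < \infty$. Then,

$\bar{X}_n - \mathbb{E}X_1 = \frac{1}{n}\sum_{i=1}^{n} X_i - \mathbb{E}X_1 \xrightarrow{D} N\left(0, \frac{\sigma^2}{n}\right)$

as $n \to \infty$, where $\xrightarrow{D}$ stands for convergence in distribution. (Stated precisely, $\sqrt{n}\,(\bar{X}_n - \mathbb{E}X_1)$ converges in distribution to $N(0, \sigma^2)$.) This implies that, for large $n$,

$\bar{X}_n \approx N\left(\mathbb{E}X_1, \frac{\sigma^2}{n}\right)$
$S_n = \sum_{i=1}^{n} X_i = n\bar{X}_n \approx N(n\,\mathbb{E}X_1, n\sigma^2).$

The implications of this result are:

The sample mean is approximately normally distributed.

The approximation gets better as the sample size $n$ increases.

The central limit theorem is often used to provide confidence intervals for $\mathbb{E}X_1$. Suppose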
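that we want a 95% confidence interval for $\mathbb{E}X_1$. We know from the central limit theorem that

$\bar{X}_n - \mathbb{E}X_1 \approx N\left(0, \frac{\sigma^2}{n}\right).$

From the tables that give the cumulative probability distribution of $N(0, 1)$, we know that

$\mathbb{P}\{-1.96 \le N(0, 1) \le 1.96\} = 0.95.$

Therefore,

$\mathbb{P}\left\{-1.96 \le \frac{\bar{X}_n - \mathbb{E}X_1}{\sigma/\sqrt{n}} \le 1.96\right\} \approx 0.95$
$\mathbb{P}\left\{\bar{X}_n - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mathbb{E}X_1 \le \bar{X}_n + 1.96\,\frac{\sigma}{\sqrt{n}}\right\} \approx 0.95.$

This gives an approximate 95% confidence interval for $\mathbb{E}X_1$. This confidence interval is only approximate because $n$ is finite.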
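One way to see what "approximate 95%" means is to repeat the experiment many times and count how often the interval covers the true mean. The sketch below does this for samples that are uniform on $[0, 1]$, where $\mathbb{E}X_1 = 1/2$ and $\sigma^2 = 1/12$ are known; the sample size, number of trials, and seed are arbitrary choices.

```python
import math
import random

def ci_known_sigma(samples, sigma, z=1.96):
    """Interval [mean - z*sigma/sqrt(n), mean + z*sigma/sqrt(n)] with sigma known."""
    n = len(samples)
    sample_mean = sum(samples) / n
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def coverage(n, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean of Uniform[0, 1]."""
    rng = random.Random(seed)
    true_mean = 0.5
    sigma = math.sqrt(1.0 / 12.0)
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        low, high = ci_known_sigma(xs, sigma)
        hits += low <= true_mean <= high
    return hits / trials

rate = coverage(n=30, trials=5000)
```

The resulting `rate` should land near 0.95, with some simulation noise.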
It becomes exact when $X_1, X_2, \ldots$ are normally distributed random variables themselves.

In practice, one is usually interested in confidence intervals because $\mathbb{E}X_1$ is not known. But, the confidence interval above requires knowledge of $\sigma = \sqrt{\mathrm{Var}(X_1)}$. In practice, $\sigma^2$ is estimated by its sample estimator

$s_n^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2.$

(Why do we divide by $n - 1$, not $n$?)

In general, to produce an approximate $100(1 - \alpha)\%$ confidence interval for $\mathbb{E}X$,

1. Select a sample size $n$.
2. Generate $n$ i.i.d. samples $X_1, X_2, \ldots, X_n$ of $X$.
3. Compute the estimators $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $s_n^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2$.
4. Look up the value of $z_{\alpha/2}$ such that $\mathbb{P}\{-z_{\alpha/2} \le N(0, 1) \le z_{\alpha/2}\} = 1 - \alpha$.
5. The approximate $100(1 - \alpha)\%$ confidence interval for $\mathbb{E}X$ is given by $\bar{X}_n \pm z_{\alpha/2}\,\frac{s_n}{\sqrt{n}}$.

This random interval will include $\mathbb{E}X$ approximately $100(1 - \alpha)\%$ of the time.

It is important to understand the difference between the confidence interval for the mean
and the quantiles of a random variable. Suppose that $X$ is a random variable with probability density function

[Figure: plot of the density of $X$.]

The $p$-th quantile of $X$ is the value $q$ such that $F(q) = \mathbb{P}\{X \le q\} = p$. For example, we can select $q_1$ and $q_2$ so that $\mathbb{P}\{q_1 \le X \le q_2\} = 0.95$. But $[q_1, q_2]$ is not a 95% confidence interval for $\mathbb{E}X$.
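The five-step procedure for the confidence interval, and its contrast with quantiles, can be sketched as follows. This is an illustration under assumptions: the samples are drawn uniformly on $[0, 1]$, the function name is hypothetical, and `NormalDist().inv_cdf` from the Python standard library stands in for the table lookup of $z_{\alpha/2}$.

```python
import math
import random
from statistics import NormalDist, mean, quantiles, stdev

def mean_confidence_interval(samples, alpha=0.05):
    """Approximate 100(1 - alpha)% confidence interval for the mean."""
    n = len(samples)
    x_bar = mean(samples)
    s_n = stdev(samples)                      # sample estimator, divides by n - 1
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    half_width = z * s_n / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

rng = random.Random(1)
samples = [rng.random() for _ in range(10_000)]  # assumed Uniform[0, 1] data

ci_low, ci_high = mean_confidence_interval(samples)

# Empirical 2.5% and 97.5% quantiles: 39 cut points splitting the data into 40 parts.
cuts = quantiles(samples, n=40)
q1, q2 = cuts[0], cuts[-1]
```

For this many samples the confidence interval for the mean is very narrow (roughly 0.01 wide), while $[q_1, q_2]$ spans about 95% of the range of the data. The first shrinks as $n$ grows; the second does not.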
 '08
 TOPALOGLU
