Lecture 8: Expectations
The expected value of a random variable, also called its expectation or
mean, is its average value weighted according to its probability
distribution.
The expected value is a number that summarizes a typical, or central,
value of the random variable.
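As a quick illustration (my own sketch, not part of the lecture notes), the weighted-average definition can be computed directly for a discrete distribution; the fair-die pmf below is an assumed example:

```python
# E(X) = sum of x * P(X = x) for a discrete random variable.
# The fair six-sided die is my own illustrative example.
from fractions import Fraction

def expected_value(pmf):
    """pmf: dict mapping each value x to its probability P(X = x)."""
    return sum(x * p for x, p in pmf.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}
e_die = expected_value(die)   # (1 + 2 + ... + 6) / 6 = 7/2
```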
Lecture 5: Random variables and their distributions
A random variable is a function from the sample space S into a set of
real numbers (a subset of R).
It is often more convenient to deal with a summary variable than with
the original probability structure.
The range of a random variable X is the set of its possible values,
{x : x = X(s), s in S}.
Lecture 4: Independence
Two events A and B are (statistically) independent iff
P(A ∩ B) = P(A)P(B).
This definition is still valid if P(A) = 0 or P(B) = 0.
When P(B) > 0, A and B are independent iff P(A|B) = P(A) (or
P(B|A) = P(B) when P(A) > 0).
That is, the probability of A is not affected by the knowledge that B
has occurred.
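The product condition can be verified by brute force on a finite, equally likely sample space; the two-dice events here are my own example, not from the lecture:

```python
# Check independence P(A and B) = P(A) P(B) on the 36-outcome space
# of two fair dice. Events A and B are my own illustrative choices.
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # all 36 equally likely outcomes

def prob(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

A = lambda s: s[0] % 2 == 0    # first die is even
B = lambda s: s[1] >= 5        # second die shows 5 or 6
AB = lambda s: A(s) and B(s)

independent = prob(AB) == prob(A) * prob(B)
```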
Lecture 3: Conditional probability
Sometimes we want to know the probability of event A given that
another event B has occurred.
If A and B are events with P(B) > 0, then the conditional probability of
A given B is
P(A|B) = P(A ∩ B) / P(B).
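The ratio definition can be computed directly on a finite sample space; the two-dice events below are my own illustration:

```python
# Conditional probability P(A | B) = P(A and B) / P(B), requiring P(B) > 0.
# The two-dice example is my own, not from the lecture.
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # two fair dice

def prob(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

B = lambda s: s[0] + s[1] == 8             # the sum is 8 (5 outcomes)
A = lambda s: s[0] == 6                    # first die shows 6

p_given = prob(lambda s: A(s) and B(s)) / prob(B)   # P(A | B) = 1/5
```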
Lecture 2: Measurable space, measure and
A collection F of subsets of a sample space S is called a σ-field (or
σ-algebra) iff it has the following properties:
(i) The empty set ∅ ∈ F ;
(ii) If A ∈ F , then the complement Ac ∈ F ;
(iii) If A1, A2, … ∈ F , then the union ∪i Ai ∈ F .
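For a finite sample space the σ-field properties can be checked exhaustively (countable unions reduce to finite ones). The collection below, the σ-field generated by a single set, is my own example:

```python
# Brute-force check of the sigma-field properties for a finite S.
# F here is the sigma-field generated by A = {1, 2}; my own example.
S = frozenset({1, 2, 3, 4})
A = frozenset({1, 2})
F = {frozenset(), A, S - A, S}

def is_sigma_field(F, S):
    if frozenset() not in F:                    # property (i)
        return False
    if any(S - B not in F for B in F):          # property (ii): complements
        return False
    for B1 in F:                                # property (iii): unions
        for B2 in F:                            # (finite case suffices here)
            if B1 | B2 not in F:
                return False
    return True
```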
Chapter 1: Probability Theory
Lecture 1: Sample space and subsets
A random experiment (in a general sense) is one that results in one of
more than one possible outcomes, with uncertainty about which outcome
will be the realization.
Examples of random experiments
Lecture 19: Conditional distribution and expectation
Definition 4.2.1 (conditional pmf)
Let (X1, …, Xn) be a discrete random vector with joint pmf f(x) and let k
be an integer satisfying 1 ≤ k ≤ n − 1. The conditional pmf of
(Xk+1, …, Xn) given that (X1, …, Xk) = (x1, …, xk) is
f(xk+1, …, xn | x1, …, xk) = f(x) / g(x1, …, xk),
where g is the (marginal) joint pmf of (X1, …, Xk) and the ratio is
defined whenever g(x1, …, xk) > 0.
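The division of joint by marginal can be carried out numerically; the toy joint pmf of a bivariate vector (X1, X2) below is my own assumption:

```python
# Conditional pmf: f(x2 | x1) = f(x1, x2) / f_X1(x1).
# The joint pmf is my own toy example.
from fractions import Fraction

F = Fraction
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8),    # joint pmf of (X1, X2)
         (1, 0): F(2, 8), (1, 1): F(2, 8)}

def marginal_x1(x1):
    """Marginal pmf of X1: sum the joint pmf over x2."""
    return sum(p for (a, b), p in joint.items() if a == x1)

def conditional(x2, x1):
    """f(x2 | x1), defined when marginal_x1(x1) > 0."""
    return joint[(x1, x2)] / marginal_x1(x1)
```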
Lecture 18: Multivariate expectation and covariance
The expectation of a random vector X = (X1, …, Xn) is defined as
E(X) = (E(X1), …, E(Xn)), provided that E(Xi) exists for every i.
When M is a matrix whose (i, j)th element is a random variable Mij,
E(M) is defined as the matrix whose (i, j)th element is E(Mij).
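The componentwise definition is easy to compute directly; the two-point distribution on vectors below is my own example:

```python
# Componentwise expectation of a random vector: E(X) = (E(X1), ..., E(Xn)).
# The two-point distribution is my own example.
from fractions import Fraction

F = Fraction
# (X1, X2) takes value (1, 0) with prob 1/4 and (3, 2) with prob 3/4.
dist = {(1, 0): F(1, 4), (3, 2): F(3, 4)}

def vector_expectation(dist):
    n = len(next(iter(dist)))     # dimension of the random vector
    return tuple(sum(x[i] * p for x, p in dist.items()) for i in range(n))

EX = vector_expectation(dist)     # (E(X1), E(X2))
```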
Lecture 6: Density and mass functions
The cdf can be used to calculate various probabilities related to a
random variable.
It may be more convenient to use another function to calculate such
probabilities.
The probability mass function (pmf) of a discrete random variable X is
fX(x) = P(X = x) for all x.
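For a discrete random variable the pmf is recovered from the cdf as the size of the jump at each point; the Binomial(2, 1/2) pmf below is my own example:

```python
# For integer-valued X, the pmf is the jump of the cdf: f(x) = F(x) - F(x-1).
# Binomial(2, 1/2) is my own illustrative choice.
from fractions import Fraction

pmf_bin = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    return sum(p for v, p in pmf_bin.items() if v <= x)

def pmf_from_cdf(x):
    return cdf(x) - cdf(x - 1)    # jump of the cdf at the integer x
```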
Chapter 2: Transformations and Expectations
Lecture 7: Transformations
For a random variable X , we are often interested in a transformation,
Y = g(X ), which is also a random variable.
Here g is a function from the space of X to a new space.
For an event A in the new space, {Y ∈ A} = {X ∈ g⁻¹(A)}, so
P(Y ∈ A) = P(X ∈ g⁻¹(A)).
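For a discrete X the distribution of Y = g(X) is found by summing P(X = x) over all x mapping to each y; the pmf and the choice g(x) = x² below are my own:

```python
# pmf of a transformation Y = g(X): P(Y = y) = sum of P(X = x) over
# {x : g(x) = y}. The pmf of X and g(x) = x^2 are my own example.
from fractions import Fraction

pmf_X = {-1: Fraction(1, 4), 0: Fraction(1, 4), 1: Fraction(1, 2)}
g = lambda x: x * x                   # Y = X^2

pmf_Y = {}
for x, p in pmf_X.items():
    pmf_Y[g(x)] = pmf_Y.get(g(x), 0) + p   # collect mass mapping to g(x)
```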
Stat 609 (Fall 2014)
Homework Assignment 1
Due in lecture on Friday, Sept 19, 2014
All problems are from the textbook:
1.3, 1.11, 1.13, 1.15,1.18, 1.24, 1.33, 1.36,
1.37, 1.38, 1.46, 1.47, 1.52, 1.53, 1.54
Homework Assignment 2
Due in lecture on Wed, Oct
TA: Cuicui Qi
Office Hours: 9 AM - 10:30 AM, TR
Detailed Solution for Dis 1: Ex.4 and Ex.2.(b)
4. A man possesses five coins, two of which are double-headed, one is double-tailed, and two are
normal. He shuts his eyes, picks a coin at random, and tosses it.
Lecture 11: Differentiating under an integral sign
When can we switch differentiation and integration?
If the range of the integral is finite, this switch is usually valid.
Theorem 2.4.1 (Leibnitz's rule)
If f(x, θ), a(θ), and b(θ) are differentiable with respect to θ, then
d/dθ ∫ from a(θ) to b(θ) of f(x, θ) dx
= f(b(θ), θ) b′(θ) − f(a(θ), θ) a′(θ) + ∫ from a(θ) to b(θ) of (∂/∂θ) f(x, θ) dx.
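A numerical sanity check of the rule (my own sketch; the choices f(x, t) = e^{tx}, a(t) = t, b(t) = 2t are assumptions, not from the notes):

```python
# Compare a numerical d/dt of I(t) = integral_{a(t)}^{b(t)} f(x,t) dx with
# the Leibnitz-rule right-hand side. f, a, b are my own choices.
import math

f = lambda x, t: math.exp(t * x)
a = lambda t: t          # a(t) = t,  a'(t) = 1
b = lambda t: 2 * t      # b(t) = 2t, b'(t) = 2

def integral(g, lo, hi, n=20000):          # simple midpoint rule
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def I(t):
    return integral(lambda x: f(x, t), a(t), b(t))

t = 0.7
lhs = (I(t + 1e-5) - I(t - 1e-5)) / 2e-5   # central-difference d/dt
rhs = (f(b(t), t) * 2 - f(a(t), t) * 1     # boundary terms
       + integral(lambda x: x * f(x, t), a(t), b(t)))  # df/dt = x e^{tx}
```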
Lecture 9: Moments and moment generating functions
The various moments of a random variable are an important class of
summary quantities.
For each integer n, the nth moment of a random variable X (or of FX) is
μ′n = E(X^n),
and the mean is μ = μ′1 = E(X).
The nth central moment of X is μn = E((X − μ)^n).
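Both kinds of moments can be computed directly for a discrete distribution; the Bernoulli(1/3) choice below is my own:

```python
# nth moment E(X^n) and nth central moment E((X - mu)^n) for a discrete
# random variable. Bernoulli(p) with p = 1/3 is my own example.
from fractions import Fraction

p = Fraction(1, 3)
pmf = {0: 1 - p, 1: p}        # Bernoulli(p)

def moment(n):
    return sum(x ** n * q for x, q in pmf.items())

def central_moment(n):
    mu = moment(1)
    return sum((x - mu) ** n * q for x, q in pmf.items())

# For Bernoulli(p): E(X^n) = p for every n >= 1, and Var(X) = p(1 - p).
```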
Lecture 10: Characterizing distributions
Can the moments determine a distribution?
Can two random variables with different distributions have the same
moments of any order?
X1 has pdf
f1(x) = (1 / (√(2π) x)) e^{−(log x)² / 2}, x > 0 (a lognormal pdf),
and X2 has pdf
f2(x) = f1(x) [1 + sin(2π log x)], x > 0.
These two distributions are different, yet X1 and X2 have the same
moments of every order.
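A numerical sketch of this fact (my own check, assuming the standard lognormal f1 and the perturbed f2 above): the densities differ pointwise, yet the first moments agree; the same holds for every moment.

```python
# Compare E(X1) and E(X2) for f1 (lognormal) and f2 = f1 * (1 + sin(2 pi log x)).
# The midpoint-rule quadrature and cutoffs are my own numerical choices.
import math

def f1(x):
    return math.exp(-(math.log(x)) ** 2 / 2) / (math.sqrt(2 * math.pi) * x)

def f2(x):
    return f1(x) * (1 + math.sin(2 * math.pi * math.log(x)))

def first_moment(f, lo=1e-6, hi=150.0, steps=200000):
    h = (hi - lo) / steps                     # midpoint rule on (0, hi]
    return h * sum((lo + (i + 0.5) * h) * f(lo + (i + 0.5) * h)
                   for i in range(steps))

m1, m2 = first_moment(f1), first_moment(f2)   # both approx e^{1/2}
```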
Lecture 17: Joint distributions and mgfs
Bivariate normal distribution
The bivariate normal distribution has the following pdf on R²:
f(x, y) = (2π σ1 σ2 √(1 − ρ²))⁻¹
exp{ −[(x − μ1)²/σ1² − 2ρ(x − μ1)(y − μ2)/(σ1 σ2) + (y − μ2)²/σ2²] / (2(1 − ρ²)) },
(x, y) ∈ R²,
where μ1 ∈ R, μ2 ∈ R, σ1 > 0, σ2 > 0, and −1 < ρ < 1.
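The pdf can be transcribed directly into code (a sketch assuming the standard (μ1, μ2, σ1, σ2, ρ) parameterization; the evaluation points are my own):

```python
# Direct transcription of the bivariate normal pdf.
# Default parameters (standard normal margins, rho = 0) are my own choice.
import math

def bvn_pdf(x, y, mu1=0.0, mu2=0.0, s1=1.0, s2=1.0, rho=0.0):
    z = ((x - mu1) ** 2 / s1 ** 2
         - 2 * rho * (x - mu1) * (y - mu2) / (s1 * s2)
         + (y - mu2) ** 2 / s2 ** 2)
    norm = 2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2)
    return math.exp(-z / (2 * (1 - rho ** 2))) / norm
```

When rho = 0 the pdf factors into the product of two univariate normal pdfs, which gives a simple correctness check.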
Lecture 14: Exponential and location-scale families
Families of Distributions
In statistics we are interested in some families of distributions, i.e.,
some collections of distributions.
For example, the family of binomial distributions with p ∈ (0, 1) and a
fixed number of trials n.
Lecture 35: Minimal sufficiency
Maximal reduction without loss of information
There are many sufficient statistics for a given problem.
In fact, X (the whole data set) is sufficient.
If T is a sufficient statistic and T = ψ(S), where ψ is a function
and S is another statistic, then T provides at least as much data
reduction as S.
Lecture 37: Three principles
For data reduction, we consider three principles: sufficiency,
likelihood, and invariance.
Let X be a sample from a population indexed by θ.
The sufficiency principle: if T (X ) is sufficient for θ, then any
inference about θ should depend on X only through the value of T (X ).
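A small numerical sketch of sufficiency (my own example, not from the notes): for an iid Bernoulli(p) sample, the conditional probability of the data given T = ΣXi is 1/C(n, t), free of p.

```python
# For X1..Xn iid Bernoulli(p) and T = sum(Xi), P(X = sample | T = t)
# equals 1 / C(n, t) for every p -- so T is sufficient for p.
# The sample and grid of p values are my own choices.
from fractions import Fraction
from math import comb

def cond_prob(sample, p):
    n, t = len(sample), sum(sample)
    joint = p ** t * (1 - p) ** (n - t)              # P(X = sample)
    p_T = comb(n, t) * p ** t * (1 - p) ** (n - t)   # P(T = t)
    return joint / p_T                               # = 1 / C(n, t)

x = (1, 0, 1, 1, 0)
vals = {cond_prob(x, Fraction(a, 10)) for a in (1, 3, 7, 9)}  # one value only
```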
Lecture 21: Sum of random variables
In applications, we frequently consider a sum of random variables or,
in many cases, a sum of independent random variables.
The pdf of the sum of two random variables (convolution):
Let X and Y be independent random variables having pdfs fX(x) and
fY(y). Then the pdf of Z = X + Y is fZ(z) = ∫ fX(x) fY(z − x) dx.
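The discrete analogue replaces the integral with a sum; the two-dice example below is my own:

```python
# Convolution of pmfs of independent X and Y:
# P(X + Y = s) = sum over x of P(X = x) P(Y = s - x).
# The two fair dice are my own illustrative example.
from fractions import Fraction

die = {k: Fraction(1, 6) for k in range(1, 7)}

def convolve(pmf_x, pmf_y):
    out = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] = out.get(x + y, 0) + px * py
    return out

pmf_sum = convolve(die, die)    # distribution of the sum of two dice
```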
Lecture 22: Multivariate transformation
In Chapter 2, we considered a single transformation of a single random
variable.
We now consider a vector of transformations of a random vector.
The pdf of a multivariate transformation:
Let X be a k-dimensional random vector.
Chapter 6. Principles of Data Reduction
Lecture 34: Sufficiency
We consider a sample X = (X1 , ., Xn ), n > 1, from a population of
interest (each Xi may be a vector and X may not be a random sample,
although most of the time we consider a random sample).
Lecture 29: Convergence concepts
In statistical analysis or inference, a key to the success of finding a
good procedure is being able to find some moments and/or
distributions of various statistics.
In many complicated problems we are not able to do this exactly, and
approximations based on limits are used instead.
Lecture 31: Central Limit Theorem
The Central Limit Theorem (CLT) is one of the most important
theorems in probability and statistics.
It derives the limiting distribution of a sequence of normalized random
variables.
Theorem 5.5.15 (Central Limit Theorem)
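A Monte Carlo sketch of the CLT (my own simulation; the Uniform(0,1) population, sample size, and replication count are assumptions): normalized sums should look approximately N(0, 1).

```python
# Simulate Z_n = (sum of n iid Uniform(0,1) - n*mu) / (sigma * sqrt(n))
# and check that the empirical mean and sd are near 0 and 1.
import math
import random
import statistics

random.seed(0)
n, reps = 200, 4000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and sd of Uniform(0, 1)

z = [(sum(random.random() for _ in range(n)) - n * mu)
     / (sigma * math.sqrt(n))
     for _ in range(reps)]

m, s = statistics.mean(z), statistics.stdev(z)   # should be near 0 and 1
```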
Lecture 33: Generating a random sample
In this lecture, we study how to generate random variables from a
given distribution.
This may be useful in applications, or in statistical research when we
carry out simulation studies, or to approximate integrals.
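One standard approach is the inverse-cdf (probability integral transform) method: if U ~ Uniform(0, 1), then X = F⁻¹(U) has cdf F. The Exponential(λ) example below is my own sketch:

```python
# Inverse-cdf method for Exponential(lam): F(x) = 1 - e^{-lam x}, so
# F^{-1}(u) = -log(1 - u) / lam. Parameter and sample size are my own.
import math
import random
import statistics

def rexp(lam, rng):
    u = rng.random()                 # U ~ Uniform(0, 1)
    return -math.log(1 - u) / lam    # X = F^{-1}(U) ~ Exponential(lam)

rng = random.Random(0)
lam = 2.0
sample = [rexp(lam, rng) for _ in range(20000)]
mean_hat = statistics.fmean(sample)  # should be near 1/lam = 0.5
```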