Probability and Statistics II
M-312
Lecture 1
1. Random Samples
1.1. Statistics and Their Distributions
The observations in a single sample are denoted by $x_1, x_2, \ldots, x_n$. Before we obtain data
there is uncertainty about the values of each $x_i$. Because of this uncertainty, before the
data becomes available we view each observation as a random variable and denote the
sample by $X_1, X_2, \ldots, X_n$. This variation in observed values in turn implies that the value
of any function of the sample observations – such as sample mean, sample standard
deviation, etc.
– also varies from sample to sample. That is, prior to obtaining $x_1, x_2, \ldots, x_n$
there is uncertainty as to the value of $\bar{x}$, the value of $s$, and so on.
Definition 1.1. A statistic is any quantity whose value can be calculated from sample
data. Prior to obtaining data, there is uncertainty as to what value of any particular
statistic will result. Therefore, a statistic is a random variable and will be denoted by an
uppercase letter; a lowercase letter is used to represent the calculated or observed value of
the statistic.
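To make Definition 1.1 concrete, the following minimal sketch draws three samples and computes the observed values $\bar{x}$ and $s$ for each. The normal population (mean 50, standard deviation 10), the sample size 10, and the helper name `draw_sample` are assumptions chosen only for illustration; they do not come from the lecture.

```python
import random
import statistics

def draw_sample(n, rng):
    """Draw n iid observations from a hypothetical normal population
    (mean 50, standard deviation 10; both values are illustrative assumptions)."""
    return [rng.gauss(50.0, 10.0) for _ in range(n)]

rng = random.Random(312)  # fixed seed so that reruns produce the same samples
samples = [draw_sample(10, rng) for _ in range(3)]

for k, sample in enumerate(samples, start=1):
    x_bar = statistics.mean(sample)  # observed value of the statistic X-bar
    s = statistics.stdev(sample)     # observed value of the statistic S (divisor n - 1)
    print(f"sample {k}: x_bar = {x_bar:.2f}, s = {s:.2f}")
```

Each pass through the loop prints a different pair of values, which is the sense in which the statistic itself is a random variable while each printed number is one of its observed values.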
Definition 1.2. The rv’s $X_1, X_2, \ldots, X_n$ are said to form a (simple) random sample of size $n$ if
1. The $X_i$’s are independent rv’s.
2. Every $X_i$ has the same probability distribution.
Conditions 1 and 2 can be paraphrased by saying that the $X_i$’s are independent and
identically distributed (iid). If sampling is either with replacement or from an infinite
(conceptually) population, Conditions 1 and 2 are satisfied exactly. These conditions will
be approximately satisfied if sampling is without replacement, yet the sample size $n$ is
much smaller than the population size $N$. In practice, if $n/N \le 0.05$ (at most 5% of the
population is sampled), we can proceed as if the $X_i$’s form a random sample.
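The 5% guideline can be checked mechanically, and a small calculation suggests why sampling without replacement is nearly iid when $n/N$ is small. The sketch below is illustrative only: the function names `sampling_fraction_ok` and `second_draw_success_prob`, and the population of 5000 items with 1000 "successes", are hypothetical choices, not part of the lecture.

```python
from fractions import Fraction

def sampling_fraction_ok(n, N, threshold=Fraction(1, 20)):
    """Return True when n/N <= 0.05, i.e. at most 5% of the population is sampled."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return Fraction(n, N) <= threshold

def second_draw_success_prob(M, N, first_was_success):
    """P(2nd draw is a success | outcome of 1st draw) when sampling WITHOUT
    replacement from N items of which M are successes."""
    remaining_successes = M - 1 if first_was_success else M
    return Fraction(remaining_successes, N - 1)

# Hypothetical population: N = 5000 items, M = 1000 of them successes.
N, M = 5000, 1000
print(sampling_fraction_ok(100, N))                          # sampling fraction 100/5000
print(float(Fraction(M, N)))                                 # success probability on any single draw
print(float(second_draw_success_prob(M, N, True)))           # after a success on draw 1
print(float(second_draw_success_prob(M, N, False)))          # after a failure on draw 1
```

The two conditional probabilities differ from the unconditional one only slightly, so the draws are dependent in principle but close to independent and identically distributed in practice.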
There are two general methods for obtaining information about a statistic’s sampling
distribution. One method involves calculations based on probability rules, and the other
involves carrying out a simulation experiment.
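As a rough sketch of the second method, the simulation below approximates the sampling distribution of the sample mean by drawing many samples and recording $\bar{x}$ for each. The exponential population (mean 1, standard deviation 1), the sample size $n = 25$, the 2000 replications, and the function name `simulate_sample_means` are all assumptions made for illustration.

```python
import random
import statistics

def simulate_sample_means(n, replications, rng):
    """Return `replications` simulated values of X-bar, each computed from
    an iid sample of size n."""
    means = []
    for _ in range(replications):
        # Assumed population: exponential with rate 1 (mean 1, standard deviation 1).
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means

rng = random.Random(1)  # fixed seed for reproducibility
means = simulate_sample_means(n=25, replications=2000, rng=rng)

print(f"average of simulated x_bar values: {statistics.mean(means):.3f}")
print(f"sd of simulated x_bar values:      {statistics.stdev(means):.3f}")
```

For comparison with the first method, probability rules give $E(\bar{X}) = \mu = 1$ and a standard deviation of $\sigma/\sqrt{n} = 1/\sqrt{25} = 0.2$ for this assumed population, so the simulated summaries should land close to those values.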