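$N$ means population size (the number of units in the population)
Sample $S$ is a subset of the population's y-values
$n$ means sample size
There are $\binom{N}{n}$ possible subsets
Probability of a given sample (under SRS) is $1/\binom{N}{n}$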
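If you have one unit fixed and they ask what the probability is that it is
in the sample, you calculate the share of possible samples that contain it.
For SRS, the inclusion probability $\pi_i$ is $\binom{N-1}{n-1} / \binom{N}{n}$,
which is $n/N$ for all $i$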
SRS (simple random sampling) is when every possible sample of size $n$ is equally likely, so everyone has an equal chance of being in the sample.
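A small check of the counting argument above, as a sketch: it lists every possible sample from a made-up population of 6 units (the values of `N` and `n` are assumptions chosen for illustration) and confirms that a fixed unit lands in a fraction n/N of them.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def inclusion_probability(N, n, unit=0):
    """Fraction of the equally likely size-n samples from units 0..N-1 that contain `unit`."""
    samples = list(combinations(range(N), n))
    containing = sum(1 for s in samples if unit in s)
    return Fraction(containing, len(samples))


N, n = 6, 2  # hypothetical population and sample sizes
print(comb(N, n))                   # number of possible samples
print(inclusion_probability(N, n))  # equals n/N under SRS
```

Every unit gives the same answer here, which is what makes the design SRS-compatible at the first order.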
Second order inclusion probabilities: (Doing second order is what gives you
standard errors)
Total: Sum of all values in the population; Total: $t = \sum_{i=1}^{N} y_i$;
Mean: $\bar{y}_U = t/N$
Second order inclusion probability $\pi_{ij}$ is the probability that unit $i$
appears in the sample together with unit $j$
Classical infinite population sampling: the sample is fixed and the values
are random, vs. finite population sampling: the values are fixed and the
sample is random
Every type of probability sampling has to give every unit a nonzero chance to be in the sample ($\pi_i > 0$ for all $i$)
When estimating a population total, you can use the Horvitz-Thompson
Estimator (HTE): $\hat{t}_{HT} = \sum_{i \in S} y_i / \pi_i$
SRS: $\hat{t}_{HT} = \frac{N}{n} \sum_{i \in S} y_i$
HTE is always unbiased
$\pi_i$ is the probability that $i$ is in the sample
$\pi_{ij}$ is the probability that $i$ and $j$ are both in the sample
To get the standard error, you need the $\pi_{ij}$, so you can get the variance of the estimator
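The unbiasedness claim can be checked by brute force on a tiny example. The sketch below assumes an SRS design (so every $\pi_i = n/N$) and uses made-up y-values. It computes the Horvitz-Thompson estimate for every possible sample and averages them.

```python
from fractions import Fraction
from itertools import combinations

y = [3, 7, 8, 12, 20]  # hypothetical population values
N, n = len(y), 2
pi = Fraction(n, N)    # first-order inclusion probability under SRS


def ht_total(sample_values, pi_i):
    """Horvitz-Thompson estimate of the total: each sampled y is divided by pi_i."""
    return sum(Fraction(v) / pi_i for v in sample_values)


# One estimate per possible sample; under SRS all samples are equally likely.
estimates = [ht_total([y[i] for i in s], pi) for s in combinations(range(N), n)]
expected_value = sum(estimates) / len(estimates)

print(expected_value, sum(y))  # the average estimate matches the true total
```

Individual estimates still vary from sample to sample. Unbiasedness only says that their average over all samples equals the true total.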
Stratified mean: $\bar{y}_{str} = \sum_{h} \frac{N_h}{N} \bar{y}_h$
with $\bar{y}_h$ as the sample mean in stratum $h$
$(1 - n/N)$ is the finite population correction (fpc), meaning the closer $n$ is to
$N$, the less sampling error there is
HTE is good because it lets us think about statistical abnormality criteria and
we need information about normality
Frame: a list of identifiers that allow us to draw the sample
PPSWR= probability proportional to size with replacement
PPSWOR= probability proportional to size without replacement
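Find the probability of each possible sample
There is a different probability of getting AD and DA, because the order of the draws matters
The sample $\{A, B\}$ can occur as AB or BA, so $P(\{A, B\}) = P(AB) + P(BA)$
Therefore, $\pi_i$ is the sum of the probabilities of all samples that contain unit $i$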
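To make the AD vs. DA point concrete, here is a sketch of PPSWOR with two draws. The unit sizes are invented for illustration, and the draw rule (each draw proportional to size among the units still remaining) is an assumption about how the draws are made.

```python
from fractions import Fraction
from itertools import permutations

sizes = {"A": 1, "B": 2, "C": 3, "D": 4}  # hypothetical unit sizes
total = sum(sizes.values())


def ordered_draw_prob(first, second):
    """P(draw `first`, then `second`) when each draw is proportional to remaining size."""
    p_first = Fraction(sizes[first], total)
    p_second = Fraction(sizes[second], total - sizes[first])
    return p_first * p_second


ordered = {pair: ordered_draw_prob(*pair) for pair in permutations(sizes, 2)}


def sample_prob(a, b):
    """Probability of the unordered sample {a, b}: it can occur as ab or ba."""
    return ordered[(a, b)] + ordered[(b, a)]


def inclusion_prob(unit):
    """First-order inclusion probability: sum over ordered samples containing `unit`."""
    return sum(p for pair, p in ordered.items() if unit in pair)


print(ordered[("A", "D")], ordered[("D", "A")])  # the two orders differ
print(sample_prob("A", "D"))
print({u: inclusion_prob(u) for u in sizes})
```

The inclusion probabilities sum to the sample size (2 here), which is a quick sanity check on any such table.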
For a sample of
then the HTE of , meaning
$y_i$ can be either quantitative or qualitative, but a qualitative variable
needs to be coded as binary (0/1)
$\hat{t}$ is an estimate of $t$
and
$\hat{\bar{y}}$ is an estimate of $\bar{y}_U$
The Horvitz-Thompson estimate of $t$ is $\hat{t}_{HT} = \sum_{i \in S} w_i y_i$
$w_i = 1/\pi_i$ is the weight of unit $i$
SRS: the weight $w_i = 1/\pi_i = N/n$ is the inflation factor
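Standard error for an estimator $\hat{\theta}$:
1. Theoretically calculate the variance of the estimator, $V(\hat{\theta})$
2. Find an empirical estimate of that variance, $\hat{V}(\hat{\theta})$
3. Take the square root: $SE(\hat{\theta}) = \sqrt{\hat{V}(\hat{\theta})}$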
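The three steps can be sketched for one specific case, the sample mean under SRS. The variance formula used, $(1 - n/N) s^2 / n$, is the standard SRS result and is an assumption here since the notes do not spell it out. The data and the population size are made up.

```python
from math import sqrt
from statistics import variance


def srs_mean_se(sample, N):
    """Standard error of the sample mean under SRS without replacement.

    Step 1 (theory): V(ybar) = (1 - n/N) * S^2 / n.
    Step 2 (empirical): replace S^2 with the sample variance s^2.
    Step 3: take the square root.
    """
    n = len(sample)
    s2 = variance(sample)  # sample variance with divisor n - 1
    fpc = 1 - n / N        # finite population correction
    return sqrt(fpc * s2 / n)


sample = [2, 4, 6, 8]  # hypothetical sample
print(srs_mean_se(sample, N=16))
print(srs_mean_se(sample, N=4))  # a census: the fpc drives the error to zero
```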
Reasons for stratification: practicality, guaranteed representation, and increased
precision (reduced variance)
Principle of stratifying: you could get the perfect answer if you could stratify
into perfectly homogeneous groups
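A sketch of the "perfect answer" principle, using an invented population that splits into two perfectly homogeneous strata. The stratified mean weights each stratum's sample mean by its share of the population. The stratum names, values, and the per-stratum sample size of 2 are all assumptions for illustration.

```python
from itertools import combinations, product
from statistics import mean

# Hypothetical population: within each stratum every value is identical.
strata = {"low": [10, 10, 10], "high": [50, 50, 50, 50]}
N = sum(len(values) for values in strata.values())
true_mean = mean([v for values in strata.values() for v in values])


def stratified_mean(samples_by_stratum):
    """Weight each stratum's sample mean by N_h / N and add them up."""
    return sum(len(strata[h]) / N * mean(s) for h, s in samples_by_stratum.items())


# Every possible stratified sample with 2 units drawn from each stratum.
all_estimates = [
    stratified_mean({"low": list(a), "high": list(b)})
    for a, b in product(combinations(strata["low"], 2),
                        combinations(strata["high"], 2))
]

print(true_mean, min(all_estimates), max(all_estimates))
```

Every sample returns the true mean, so the estimator has zero variance. Real strata are never this clean, but the closer they get, the smaller the variance.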
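Two Stage Cluster Sampling (SRS at both stages):
$N$ = total number of PSUs (primary sampling units)
$n$ = number of PSUs sampled
$m_i$ is how many SSUs (secondary sampling units) are sampled from the $i$th PSU
$M_i$ = number of SSUs in the $i$th PSU
i.e. $K = \sum_{i=1}^{N} M_i$ = total number of SSUs
To get $\hat{t}$ you calculate $\hat{t}_i = M_i \bar{y}_i$ for each $M_i$ and $\bar{y}_i$ and then you add them up and
multiply by $N/n$, seen below
HTE of $t$: $\hat{t} = \frac{N}{n} \sum_{i \in S} M_i \bar{y}_i$
Sum the squared deviations $(\hat{t}_i - \hat{t}/N)^2$ over the sampled PSUs and multiply by $1/(n-1)$ to get $s_t^2$
$\bar{y}_i$ is the sample mean within the $i$th PSU
Variance: $\hat{V}(\hat{t}) = N^2 (1 - \frac{n}{N}) \frac{s_t^2}{n} + \frac{N}{n} \sum_{i \in S} (1 - \frac{m_i}{M_i}) M_i^2 \frac{s_i^2}{m_i}$
Standard Error: $SE(\hat{t}) = \sqrt{\hat{V}(\hat{t})}$
Where $\hat{t}_i = M_i \bar{y}_i$
and $s_t^2 = \frac{1}{n-1} \sum_{i \in S} (\hat{t}_i - \hat{t}/N)^2$
and $s_i^2$ is the sample variance within the $i$th
sample PSU.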
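A worked sketch of the two-stage calculation, assuming SRS at both stages and the standard unbiased estimator with its usual between-PSU plus within-PSU variance estimate. The function name, argument layout, and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_stage_total(N, M, samples):
    """Estimate a population total and its standard error from a two-stage sample.

    N       : number of PSUs in the population
    M       : M_i, the number of SSUs in each sampled PSU
    samples : the observed y-values for the m_i sampled SSUs in each sampled PSU
    """
    n = len(samples)
    t_i = [M_i * mean(y) for M_i, y in zip(M, samples)]  # estimated PSU totals
    t_hat = N / n * sum(t_i)
    # Between-PSU variability of the estimated PSU totals (divisor n - 1).
    s_t2 = sum((t - t_hat / N) ** 2 for t in t_i) / (n - 1)
    between = N ** 2 * (1 - n / N) * s_t2 / n
    # Within-PSU variability, one fpc-adjusted term per sampled PSU.
    within = N / n * sum(
        (1 - len(y) / M_i) * M_i ** 2 * variance(y) / len(y)
        for M_i, y in zip(M, samples)
    )
    return t_hat, sqrt(between + within)


# Hypothetical data: 10 PSUs in the population, 2 of them sampled.
t_hat, se = two_stage_total(N=10, M=[4, 6], samples=[[1, 3], [2, 4, 6]])
print(t_hat, se)
```

In this example the between-PSU term (5120) dwarfs the within-PSU term (160), which is typical when PSU totals differ a lot.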
Standard Error is the square root of variance. The optimal allocation of one is
mean per PSU:
and mean per SSU:
$z$ is the standard normal critical value; using it relies on the assumption of asymptotic normality of the estimator
95% Confidence Intervals
SRS Estimate for mean: $\bar{y} \pm 1.96 \sqrt{(1 - \frac{n}{N}) \frac{s^2}{n}}$
A 95% confidence interval does not mean that we are 95% sure that the estimate
is the mean, but that intervals built this way contain the true mean for about 95% of all possible samples.
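A minimal sketch of the interval calculation for an SRS mean, assuming the normal approximation holds and using the standard fpc-adjusted standard error. The sample values and `N` are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def srs_mean_ci(sample, N, level=0.95):
    """Normal-approximation confidence interval for the population mean under SRS."""
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    se = sqrt((1 - n / N) * variance(sample) / n)
    ybar = mean(sample)
    return ybar - z * se, ybar + z * se


low, high = srs_mean_ci([2, 4, 6, 8], N=16)  # hypothetical sample and population size
print(low, high)
```

With a sample this small the normal approximation is rough, so the example only illustrates the arithmetic.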
