Chapter Seven
In the following multiple choice questions, circle the correct answer.
1. The set of all elements of interest in a study is
a. set notation
b. a set of interest
c. a sample
d. a population
e. None of the above answers is correct.
ANSWER: d
2

Convergence Concepts
Definition 5.8 Suppose that $X_1, X_2, \ldots$ is a sequence of random variables. We say that this sequence converges in distribution to a random variable $X$ if
$$\lim_{n \to \infty} P(X_n \le x) = P(X \le x)$$
at all points $x$ at which $F_X(x) = P(X \le x)$ is continuous.
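As a rough illustration of convergence in distribution, the sketch below uses a standard example that is not taken from the excerpt above: for iid Uniform(0, 1) variables, the scaled maximum $n(1 - \max_i U_i)$ has exact cdf $1 - (1 - x/n)^n$ on $0 < x < n$, which approaches the exponential(1) cdf $1 - e^{-x}$. The function names and the grid of evaluation points are assumptions made for this example.

```python
import math

def cdf_scaled_max(x: float, n: int) -> float:
    """Exact P(n * (1 - max(U_1, ..., U_n)) <= x) for iid Uniform(0, 1) U_i."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    return 1.0 - (1.0 - x / n) ** n

def cdf_exponential(x: float) -> float:
    """cdf of the exponential(1) limit distribution."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def max_gap(n: int, grid) -> float:
    """Largest cdf difference over the grid for a given n."""
    return max(abs(cdf_scaled_max(x, n) - cdf_exponential(x)) for x in grid)

# Hypothetical evaluation grid on (0, 5]; the limit cdf is continuous everywhere.
grid = [0.1 * k for k in range(1, 51)]
gaps = {n: max_gap(n, grid) for n in (5, 50, 500, 5000)}
for n, gap in gaps.items():
    print(n, gap)
```

The printed gaps shrink as n grows, which is the pointwise cdf convergence that the definition asks for.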

However, if $F_X$ is constant on some interval, then $F_X^{-1}$ is not well defined by (2). The problem is avoided by defining $F_X^{-1}(y)$ for $0 < y < 1$ by
$$F_X^{-1}(y) = \inf\{x : F_X(x) \ge y\}. \qquad (3)$$
At the end point of the range of $y$, $F_X^{-1}(1) = \infty$ if $F_X(x) < 1$ for all $x$ a
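The definition of the inverse as an infimum, inf{x : F_X(x) >= y}, can be tried out on a discrete distribution, whose cdf is flat between support points. The following is a minimal sketch; the function name `generalized_inverse` and the three-point distribution are assumptions made for this example, not part of the notes.

```python
from itertools import accumulate

def generalized_inverse(support, pmf, y):
    """Return inf{x : F(x) >= y} for a discrete cdf F, with 0 < y < 1.

    `support` must be sorted in increasing order and `pmf` must hold the
    matching probabilities.
    """
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    for x, cumulative in zip(support, accumulate(pmf)):
        if cumulative >= y:
            return x
    # Only reached if rounding leaves the cumulative sum slightly below y.
    return support[-1]

# Hypothetical distribution: F jumps to 0.25, 0.75 and 1.0 at x = 0, 1, 2.
support = [0, 1, 2]
pmf = [0.25, 0.5, 0.25]
for y in (0.1, 0.25, 0.26, 0.75, 0.9):
    print(y, generalized_inverse(support, pmf, y))
```

Because the infimum picks the smallest x whose cdf value reaches y, the value at y = 0.25 is 0 while any slightly larger y maps to 1.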

Transformations and Expectations
1
Distributions of Functions of a Random Variable
If $X$ is a random variable with cdf $F_X(x)$, then any function of $X$, say $g(X)$, is also a random
variable. Since $Y = g(X)$ is a function of $X$, we can describe the probabil

1.6. Density and Mass Functions
Definition 1.6.1 (Probability Mass Function) The probability mass function (pmf) of a discrete random variable $X$ is given by $f_X(x) = P(X = x)$ for all $x$.
Example 1.6.2 (Geometric probabilities) For the geometric distribution

1.4 Random Variable
Motivation example In an opinion poll, we might decide to ask 50 people whether they agree
or disagree with a certain issue. If we record a 1 for agree and 0 for disagree, the sample
space for this experiment has $2^{50}$ elements. If we de

1.3 Conditional Probability and Independence
All of the probabilities that we have dealt with thus far have been unconditional probabilities.
A sample space was defined and all probabilities were calculated with respect to that sample
space. In many instanc

1.2.3 Counting and Equally Likely Outcomes
Methods of counting are often used in order to construct probability assignments on finite
sample spaces, although they can be used to answer other questions also. The following
theorem is sometimes known as the Fu

Lecture 2 : Basics of Probability Theory
When an experiment is performed, the realization of the experiment is an outcome in the sample
space. If the experiment is performed a number of times, different outcomes may occur each time
or some outcomes may repe

Theorem 2.1 Let $X$ be a random variable and let $a$, $b$, and $c$ be constants. Then for any functions
$g_1(x)$ and $g_2(x)$ whose expectations exist,
a. $E(a g_1(X) + b g_2(X) + c) = a E g_1(X) + b E g_2(X) + c$.
b. If $g_1(x) \ge 0$ for all $x$, then $E g_1(X) \ge 0$.
c. If $g_1(x)$

2.3 Moment Generating Function
Theorem 2.3.11 Let $F_X(x)$ and $F_Y(y)$ be two cdfs all of whose moments exist.
a. If $X$ and $Y$ have bounded supports, then $F_X(u) = F_Y(u)$ for all $u$ if and only if $E X^r = E Y^r$ for all integers $r = 0, 1, 2, \ldots$
b. If the moment

3.2.3 Binomial Distribution
The binomial distribution is based on the idea of a Bernoulli trial. A Bernoulli trial is
an experiment with two, and only two, possible outcomes. A random variable X has a
Bernoulli(p) distribution if
X=
1 with probability p
0

3.2.4 Poisson Distribution
Definition Let $X$ be the number of events per basic unit. For example:
Number of rain drops in one minute.
Number of cars passing by you in an hour.
Number of chocolate particles in one ChoCoChip cookie.
Number of typos in one pag

3.2.5 Negative Binomial Distribution
In a sequence of independent Bernoulli(p) trials, let the random variable X denote the trial
at which the $r$th success occurs, where $r$ is a fixed integer. Then
$$P(X = x \mid r, p) = \binom{x-1}{r-1} p^r (1-p)^{x-r}, \qquad x = r, r+1, \ldots,$$
(1
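A quick numerical check of the negative binomial pmf, $\binom{x-1}{r-1} p^r (1-p)^{x-r}$ for $x = r, r+1, \ldots$, is sketched below. The values r = 3 and p = 0.4 and the truncation point of the infinite sum are assumptions chosen for illustration. The mean of the trial count is compared against r/p, a standard result not stated in the excerpt.

```python
import math

def neg_binomial_pmf(x: int, r: int, p: float) -> float:
    """P(X = x | r, p): the r-th success occurs on trial x."""
    if x < r:
        return 0.0
    return math.comb(x - 1, r - 1) * p**r * (1.0 - p) ** (x - r)

r, p = 3, 0.4  # illustrative values, not from the text

# The support is infinite; truncating at 400 leaves a negligible tail here.
xs = range(r, 400)
total = math.fsum(neg_binomial_pmf(x, r, p) for x in xs)
mean = math.fsum(x * neg_binomial_pmf(x, r, p) for x in xs)

print("sum of pmf:", total)
print("mean:", mean, "vs r / p =", r / p)
```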

3.3 Continuous Distribution
3.3.1 Uniform Distribution
The continuous uniform distribution is defined by spreading mass uniformly over an interval $[a, b]$. Its pdf is given by
$$f(x \mid a, b) = \begin{cases} \dfrac{1}{b-a} & \text{if } x \in [a, b] \\ 0 & \text{otherwise.} \end{cases}$$
It is easy to check that
$$\int_a^b f(x)\,dx = 1$$
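The claim that the uniform pdf integrates to 1 can also be checked numerically. The sketch below applies a midpoint rule over an interval wider than [a, b], so that the zero part of the density is exercised as well. The endpoints a = 2 and b = 5, the integration range and the helper names are assumptions made for this example.

```python
import math

def uniform_pdf(x: float, a: float, b: float) -> float:
    """pdf of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def midpoint_integral(f, lo: float, hi: float, steps: int) -> float:
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return math.fsum(f(lo + (k + 0.5) * h) for k in range(steps)) * h

a, b = 2.0, 5.0  # illustrative endpoints, not from the text

# Integrate over [0, 8], which contains [a, b]; the density is 0 outside [a, b].
total = midpoint_integral(lambda x: uniform_pdf(x, a, b), 0.0, 8.0, 8000)
print("integral of pdf:", total)
```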

3.3.3 Normal Distribution
The normal distribution has several advantages over the other distributions. a. The normal distribution and distributions associated with it are very tractable analytically. b. The normal distribution has the familiar bell sh

3.3.4 Beta Distribution
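The beta($\alpha$, $\beta$) pdf is
$$f(x \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha-1} (1-x)^{\beta-1}, \qquad 0 < x < 1, \quad \alpha > 0, \quad \beta > 0,$$
where $B(\alpha, \beta)$ denotes the beta function,
$$B(\alpha, \beta) = \int_0^1 x^{\alpha-1} (1-x)^{\beta-1}\,dx = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$
For $n > -\alpha$, we have
$$E X^n = \frac{1}{B(\alpha, \beta)} \int_0^1 x^n x^{\alpha-1} (1-x)^{\beta-1}\,dx = \frac{B(\alpha+n, \beta)}{B(\alpha, \beta)} = \frac{\Gamma(\alpha+n)\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+n)\Gamma(\alpha)}$$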
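The moment computation for the beta distribution can be sanity-checked numerically. The sketch below assumes the identity $E X^n = B(\alpha+n, \beta) / B(\alpha, \beta)$, with the beta function written through gamma functions, and compares it against a midpoint-rule integral of $x^n$ times the pdf. The parameter values alpha = 2, beta = 3 and all function names are assumptions chosen for illustration.

```python
import math

def beta_fn(a: float, b: float) -> float:
    """Beta function expressed through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_moment(n: float, a: float, b: float) -> float:
    """E X^n for X ~ beta(a, b), using the ratio of beta functions."""
    return beta_fn(a + n, b) / beta_fn(a, b)

def beta_moment_numeric(n: float, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of x^n * pdf over (0, 1)."""
    h = 1.0 / steps
    total = math.fsum(
        ((k + 0.5) * h) ** (n + a - 1) * (1.0 - (k + 0.5) * h) ** (b - 1)
        for k in range(steps)
    )
    return total * h / beta_fn(a, b)

alpha, beta = 2.0, 3.0  # illustrative parameters, not from the text
m1 = beta_moment(1, alpha, beta)
m2 = beta_moment(2, alpha, beta)
print("E X   :", m1, "numeric:", beta_moment_numeric(1, alpha, beta))
print("E X^2 :", m2, "numeric:", beta_moment_numeric(2, alpha, beta))
```

For these parameters the first moment agrees with the familiar mean alpha / (alpha + beta).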

STAT 743
Foundations of Statistics (Term 1) Angelo J. Canty
Office: Hamilton Hall 209
Phone: (905) 525-9140 extn 27079
E-mail: cantya@mcmaster.ca
Web Page : www.math.mcmaster.ca/canty/teaching/stat743
1. Probability
Definition 1.1 A random experiment is a p

Lecture 1: Set Theory
1
Set Theory
One of the main objectives of a statistician is to draw conclusions about a population of objects by
conducting an experiment. The first step in this endeavor is to identify the possible outcomes or, in
statistical terminol

Sufficient Statistics
Lecture XIX
Data Reduction
References:
Casella, G. and R.L. Berger. Statistical Inference, 2nd Edition. New York: Duxbury Press. Chapter 6, Principles of Data Reduction, pp. 271-309.
Hogg, R.V., A. Craig, and J.W. McKean Introduction t

DISJOINT VS INDEPENDENT
It is important not to confuse the concept of disjoint with independent. Here are several
thoughts to keep in mind. Some are referenced from other publications and websites.
When you think about disjoint events, you should be think

STAT743 FOUNDATIONS OF STATISTICS
Assignment 1
Due October 16, 2007
Q. 1
a) Casella and Berger exercise 1.11 b) If A and B are two events in a sigma-algebra, prove, using only the Axioms of probability as a starting point, that P (A B) P (A) P (A B) P (A)

STAT743 FOUNDATIONS OF STATISTICS
Assignment 2
Due November 13, 2007
Q. 1
a) Derive the F probability density function by considering the ratio of two independent chi-squared random variables each divided by their respective degrees of freedom. b) Derive

STAT743 FOUNDATIONS OF STATISTICS
Assignment 3
Due at 4pm on Monday December 3, 2007
Q. 1 A person's blood type can be one of four types (O, A, B or AB). Each parent passes on a gene for O, A or B which combine to give the child's blood type. The A and B ge

Hint for 6.2 Firstly, derive the joint density of $(X_1, \ldots, X_n)$. Here they are independent but not identically distributed. The density of $X_i$ is $f_{X_i}(x_i) = e^{i\theta - x_i} I(x_i > i\theta) = e^{i\theta - x_i} I(x_i / i > \theta)$, where $I$ is the indicator function. The joint pdf of $(X_1, \ldots, X_n)$ is
n n
fX1

STAT883
Spring, 2006
Homework 2
Due Thursday, 9 February 2006
From Casella & Berger: 6.30, 7.6, 7.9, 7.10, 7.11
Find the method of moments estimator (MME) and the maximum likelihood estimator (MLE) of
based on a random sample $X_1, \ldots, X_n$ from each of the

Sufficient, Complete and Ancillary Statistics
http:/www.math.uah.edu/stat/point/Sufficient.xhtml
6. Sufficient, Complete, and Ancillary Statistics
The Basic Statistical Model
Consider again t