Chapter 4
Generalized Least Squares Theory
In Section 3.6 we have seen that the classical conditions need not hold in practice. Although
these conditions have no effect on the OLS method per se, they do affect the properties of
the OLS estimators and result in
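A minimal numerical sketch of the contrast between OLS and GLS under heteroskedasticity (the data-generating process, sample size, and variance pattern below are all hypothetical choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = X @ beta + e, where Var(e_i) = sigma2_i is known.
n = 500
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
beta_true = np.array([1.0, 2.0])
sigma2 = 0.5 + X[:, 1] ** 2          # error variance grows with the regressor
e = rng.normal(0, np.sqrt(sigma2))
y = X @ beta_true + e

# OLS: (X'X)^{-1} X'y
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# GLS with diagonal Omega = diag(sigma2): (X'Omega^{-1}X)^{-1} X'Omega^{-1}y,
# equivalent to OLS on data weighted by 1/sigma_i.
W = 1.0 / sigma2
beta_gls = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * y))

print(beta_ols, beta_gls)   # both roughly near (1, 2)
```

Both estimators remain unbiased here; the point of GLS is the smaller sampling variance when Omega is known.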
Chapter 8
Nonlinear Least Squares Theory
What we have analyzed so far is the OLS estimator for linear specifications. Yet, it
is hard to believe that linear specifications are universal in characterizing all economic
relationships. As an alternative, it is n
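As a preview, nonlinear specifications can be estimated by nonlinear least squares; below is a minimal Gauss-Newton sketch for the hypothetical specification y = exp(b·x) + e (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonlinear specification y = exp(b * x) + e, for illustration.
n = 200
x = rng.uniform(0.0, 2.0, n)
b_true = 0.5
y = np.exp(b_true * x) + rng.normal(0, 0.05, n)

# Gauss-Newton for the NLS problem min_b sum (y_i - exp(b x_i))^2:
# linearize the residual around the current b and take the resulting LS step.
b = 0.0                                  # starting value
for _ in range(50):
    f = np.exp(b * x)                    # fitted values
    r = y - f                            # residuals
    J = x * f                            # derivative of f with respect to b
    step = (J @ r) / (J @ J)             # one-parameter least-squares step
    b += step
    if abs(step) < 1e-12:
        break

print(b)   # roughly 0.5
```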
Chapter 7
Asymptotic Least Squares
Theory: Part II
In the preceding chapter the asymptotic properties of the OLS estimator were derived
under standard regularity conditions that require the data to obey a suitable LLN and
CLT. Some important consequences of the
Chapter 9
Quasi-Maximum Likelihood
Theory
As discussed in preceding chapters, postulating a (non-)linear specification and estimating
its unknown parameters by the least squares method amounts to approximating the conditional mean function of the dependent
Chapter 10
Quasi-Maximum Likelihood:
Applications
10.1
Binary Choice Models
In many economic applications the dependent variables of interest may assume only finitely
many integer values, each labeling a category of data. The simplest case of discrete depen
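A standard example of such a model is the logit binary choice model; the sketch below simulates data and fits it by Newton-Raphson (the coefficients and sample size are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(11)

# Logit binary choice: P(y = 1 | x) = 1 / (1 + exp(-(b0 + b1 x))),
# estimated by Newton-Raphson on the log-likelihood (simulated data).
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
b_true = np.array([0.5, 1.5])
p = 1 / (1 + np.exp(-X @ b_true))
y = rng.binomial(1, p)

b = np.zeros(2)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ b))               # fitted probabilities
    grad = X.T @ (y - mu)                       # score vector
    H = X.T @ (X * (mu * (1 - mu))[:, None])    # negative Hessian
    b = b + np.linalg.solve(H, grad)            # Newton step

print(b)   # roughly (0.5, 1.5)
```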
Lecture Notes 1
Brief Review of Basic Probability
(Casella and Berger Chapters 1-4)
1
Probability Review
Chapters 1-4 are a review. I will assume you have read and understood Chapters
1-4. Let us recall some of the key ideas.
1.1
Random Variables
A random
Lecture Notes 2
1
Probability Inequalities
Inequalities are useful for bounding quantities that might otherwise be hard to compute.
They will also be used in the theory of convergence.
Theorem 1 (The Gaussian Tail Inequality) Let X ~ N(0, 1). Then
P(|X| >
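The bound in the theorem statement is cut off above. One classical version of the Gaussian tail bound (which may differ from the truncated statement) is P(|X| > ε) ≤ √(2/π) · e^(−ε²/2)/ε for ε > 0, which the following Monte Carlo sketch checks numerically:

```python
import math
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo check of the Mills-ratio tail bound for the standard normal:
# P(|X| > eps) <= sqrt(2/pi) * exp(-eps^2 / 2) / eps, for eps > 0.
X = rng.standard_normal(1_000_000)
for eps in (0.5, 1.0, 2.0, 3.0):
    tail = np.mean(np.abs(X) > eps)                          # empirical tail
    bound = math.sqrt(2 / math.pi) * math.exp(-eps**2 / 2) / eps
    print(eps, tail, bound)
```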
Lecture Notes 4
1
Random Samples
Let X1, . . . , Xn ~ F. A statistic is any function T = g(X1, . . . , Xn). Recall that the sample
mean is

X̄n = (1/n) ∑_{i=1}^n Xi

and the sample variance is

S²n = (1/(n−1)) ∑_{i=1}^n (Xi − X̄n)².

Let μ = E(Xi) and σ² = Var(Xi). Recall tha
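These definitions can be checked numerically; a small sketch (the distribution and its parameters are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Sample mean and sample variance (with the n - 1 divisor) of an i.i.d. sample.
x = rng.normal(loc=2.0, scale=3.0, size=10_000)

xbar = x.sum() / x.size                       # sample mean
s2 = ((x - xbar) ** 2).sum() / (x.size - 1)   # sample variance S_n^2

print(xbar, s2)          # near mu = 2 and sigma^2 = 9
assert np.isclose(xbar, x.mean())             # agrees with numpy's mean
assert np.isclose(s2, x.var(ddof=1))          # ddof=1 gives the n - 1 divisor
```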
Lecture Notes 5
1
Statistical Models
A statistical model P is a collection of probability distributions (or a collection of densities). An example of a nonparametric model is

P = { p : ∫ (p″(x))² dx < ∞ }.

A parametric model has the form

P = { p(x; θ) : θ ∈ Θ }

where Θ ⊂ R
Lecture Notes 7
1
Parametric Point Estimation
X1, . . . , Xn ~ p(x; θ). We want to estimate θ = (θ1, . . . , θk). An estimator

θ̂ = θ̂n = w(X1, . . . , Xn)

is a function of the data.
Methods:
1. Method of Moments (MOM)
2. Maximum likelihood (MLE)
3. Bayesian estim
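A quick sketch contrasting the first two methods on a model where they differ, Xi ~ Uniform(0, θ) (the value of θ and the sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# For X_i ~ Uniform(0, theta): E(X_i) = theta / 2, so the method-of-moments
# estimator is 2 * Xbar, while the MLE is the sample maximum.
theta = 5.0
x = rng.uniform(0, theta, size=1000)

theta_mom = 2 * x.mean()     # method of moments
theta_mle = x.max()          # maximum likelihood

print(theta_mom, theta_mle)  # both near 5; the MLE is always below theta
```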
Lecture Notes 6
1
The Likelihood Function
Definition. Let X^n = (X1, . . . , Xn) have joint density p(x^n; θ) = p(x1, . . . , xn; θ) where
θ ∈ Θ. The likelihood function L : Θ → [0, ∞) is defined by

L(θ) ≡ L(θ; x^n) = p(x^n; θ)

where x^n is fixed and θ varies in Θ.
1. The likelihood
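To make the definition concrete, here is a small sketch of the Bernoulli likelihood for a fixed (hypothetical) sample, maximized over a grid of θ values:

```python
import numpy as np

# Likelihood of theta for a fixed Bernoulli sample x^n:
# L(theta) = prod_i theta^{x_i} (1 - theta)^{1 - x_i}.
x = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])   # 7 successes out of 10

def L(theta):
    return np.prod(theta ** x * (1 - theta) ** (1 - x))

grid = np.linspace(0.01, 0.99, 99)             # theta varies; the data is fixed
vals = np.array([L(t) for t in grid])
theta_hat = grid[vals.argmax()]
print(theta_hat)   # the grid maximizer, near the MLE 7/10
```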
Lecture Notes 8
1
Minimax Theory
Suppose we want to estimate a parameter θ using data X^n = (X1, . . . , Xn). What is the
best possible estimator θ̂ = θ̂(X1, . . . , Xn) of θ? Minimax theory provides a framework for
answering this question.
1.1
Introduction
L
Lecture Notes 3
1
Uniform Bounds
Recall that, if X1, . . . , Xn ~ Bernoulli(p) and p̂n = (1/n) ∑_{i=1}^n Xi then, from Hoeffding's
inequality,

P(|p̂n − p| > ε) ≤ 2e^{−2nε²}.
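A Monte Carlo sketch of this bound (the values of p, n, and ε are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)

# Check Hoeffding's bound P(|phat_n - p| > eps) <= 2 exp(-2 n eps^2)
# for Bernoulli(p) samples by simulation.
p, n, eps, reps = 0.3, 100, 0.1, 20_000

phat = rng.binomial(n, p, size=reps) / n      # 20,000 simulated values of phat_n
freq = np.mean(np.abs(phat - p) > eps)        # empirical tail probability
bound = 2 * np.exp(-2 * n * eps**2)

print(freq, bound)   # the empirical frequency sits below the bound
```

The bound is loose here (as Hoeffding's inequality typically is), but it holds for every n and ε without any distributional assumption beyond boundedness.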
Sometimes we want to say more than this.
Example 1 Suppose that X1, . . . , Xn have cdf F. Let
F̂n
Lecture Notes 9
Asymptotic (Large Sample) Theory
1
Review of o, O, etc.
1. a_n = o(1) means a_n → 0 as n → ∞.
2. A random sequence A_n is o_p(1) if A_n → 0 in probability as n → ∞.
3. A random sequence A_n is o_p(b_n) if A_n / b_n → 0 in probability as n → ∞.
4. n^p · o_p(1) = o_p(n^p), so √n · o_p(1/√n) = o
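A small simulation illustrating the distinction (a sketch; the sample sizes are arbitrary): for i.i.d. mean-zero data, X̄n is o_p(1) while √n · X̄n is bounded in probability but does not vanish.

```python
import numpy as np

rng = np.random.default_rng(6)

# Xbar_n -> 0 in probability (o_p(1)), while sqrt(n) * Xbar_n stays O_p(1):
# its distribution is N(0, 1) at every n, so it neither grows nor shrinks.
for n in (100, 10_000, 1_000_000):
    xbar = rng.standard_normal(n).mean()
    print(n, xbar, np.sqrt(n) * xbar)
```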
Lecture Notes 17
Three Bonus Topics
1
Multiple Testing and Condence Intervals
Suppose we need to test many null hypotheses
H0,1 , . . . , H0,N
where N could be very large. We cannot simply test each hypothesis at level α because, if
N is large, we are sure
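A simulation sketch of the problem and of the Bonferroni fix of testing each null at level α/N (all N nulls are true here; the numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# With N true nulls, p-values are Uniform(0, 1); testing each at level alpha
# almost surely produces false rejections. Bonferroni tests each at alpha / N.
alpha, N, reps = 0.05, 1000, 2000

pvals = rng.uniform(size=(reps, N))

naive_fwer = np.mean((pvals < alpha).any(axis=1))      # chance of any rejection
bonf_fwer = np.mean((pvals < alpha / N).any(axis=1))   # roughly alpha
print(naive_fwer, bonf_fwer)
```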
Lecture Notes 13
Plug-In Estimators and The Bootstrap
This is mostly not in the text.
1
Introduction
Can we estimate the mean of a distribution without using a parametric model? Yes. The
key idea is to first estimate the distribution function nonparametrica
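A sketch of the plug-in idea via the bootstrap: estimate the standard error of the sample median by resampling from the empirical distribution, with no parametric model (the data below are simulated and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)

# Bootstrap standard error of the sample median: resample from the empirical
# distribution F_n and recompute the statistic on each resample.
x = rng.exponential(scale=2.0, size=200)     # hypothetical data

B = 2000
boot_medians = np.empty(B)
for b in range(B):
    resample = rng.choice(x, size=x.size, replace=True)   # draw from F_n
    boot_medians[b] = np.median(resample)

se_boot = boot_medians.std(ddof=1)
print(np.median(x), se_boot)
```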
Lecture Notes 14
Bayesian Inference
Relevant material is scattered throughout the book: see sections 7.2.3, 8.2.2, 9.2.4 and 9.3.3.
We will also cover some material that is not in the book.
1
Introduction
So far we have been using frequentist (or classica
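As a preview of the mechanics, here is a minimal conjugate-update sketch for Bernoulli data with a Beta prior (the prior parameters and the true p are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(9)

# Bernoulli(p) data with a Beta(a, b) prior gives a Beta(a + s, b + n - s)
# posterior, where s is the number of successes.
a, b = 1.0, 1.0                 # uniform prior on p
x = rng.binomial(1, 0.7, size=50)
s, n = x.sum(), x.size

a_post, b_post = a + s, b + n - s
post_mean = a_post / (a_post + b_post)
print(post_mean)    # posterior mean, between the prior mean 0.5 and s/n
```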
Lecture Notes 16
Model Selection
Not in the text.
1
Introduction
Sometimes we have a set of possible models and we want to choose the best model. Model
selection methods help us choose a good model. Here are some examples.
Example 1 Suppose you use a poly
Lecture Notes 15
Prediction
This is mostly not in the text. Some relevant material is in Chapters 11 and
12.
1
Introduction
We observe training data (X1, Y1), . . . , (Xn, Yn). Given a new pair (X, Y) we want to predict
Y from X. There are two commo
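A minimal sketch of this setup with a linear predictor fit by least squares (the data-generating process is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(10)

# Fit a linear predictor on training pairs (X_i, Y_i), then predict Y at a
# new X. The true relationship here is Y = 1 + 2X + noise, for illustration.
n = 300
X = rng.uniform(-1, 1, n)
Y = 1.0 + 2.0 * X + rng.normal(0, 0.3, n)

A = np.column_stack([np.ones(n), X])          # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)  # least-squares fit

x_new = 0.5
y_hat = coef[0] + coef[1] * x_new
print(y_hat)    # near 1 + 2 * 0.5 = 2
```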