Econ 513, Fall 2005, USC Department of Economics
Lecture 7: Maximum Likelihood Estimation: Basics and Likelihood Functions
1 Setup
When we looked at the linear regression model,
$$y_i = x_i'\beta + \varepsilon_i,$$
with $\varepsilon_i \mid x_i \sim N(0, \sigma^2)$, we focused on least squares estimation:
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - x_i'\beta)^2,$$
leading to the estimator
$$\hat{\beta} = \left( \sum_{i=1}^{n} x_i x_i' \right)^{-1} \left( \sum_{i=1}^{n} x_i y_i \right).$$
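The step from the minimization problem to the closed form is the first-order condition. A brief sketch, assuming that $\sum_{i=1}^{n} x_i x_i'$ is invertible (an assumption the notes leave implicit):

```latex
\begin{align*}
\frac{\partial}{\partial \beta} \sum_{i=1}^{n} (y_i - x_i'\beta)^2
  &= -2 \sum_{i=1}^{n} x_i \, (y_i - x_i'\beta) = 0 \\
\Longrightarrow \quad
\Big( \sum_{i=1}^{n} x_i x_i' \Big) \hat{\beta}
  &= \sum_{i=1}^{n} x_i y_i .
\end{align*}
```

Premultiplying both sides by the inverse of $\sum_{i=1}^{n} x_i x_i'$ gives the estimator stated above.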
We can motivate this estimator in a different way, namely as a maximum likelihood estimator,
or mle:
$$\hat{\beta} = \arg\max_{\beta, \sigma^2} L(\beta, \sigma^2),$$
where
$$L(\beta, \sigma^2) = \sum_{i=1}^{n} \left[ -\frac{1}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} (y_i - x_i'\beta)^2 \right].$$
Note that the pdf of $y_i$ given $x_i$ is
$$f(y_i \mid x_i; \beta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_i - x_i'\beta)^2}{2\sigma^2} \right).$$
This leads to the same estimator for $\beta$ (why?), and to
$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i'\hat{\beta})^2.$$
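One way to see the link: for any fixed $\sigma^2$, the first term of $L(\beta, \sigma^2)$ does not depend on $\beta$, and the second term is a negative multiple of the sum of squared residuals, so maximizing over $\beta$ is the same as minimizing that sum. The sketch below checks this numerically for the special case of a single scalar regressor with no intercept. The simulated data, the seed, and the perturbation sizes are all hypothetical choices made for illustration; they are not part of the notes.

```python
import math
import random


def log_likelihood(beta, sigma2, xs, ys):
    """Normal log likelihood L(beta, sigma^2) for a scalar regressor x_i."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (y - x * beta) ** 2 / (2 * sigma2)
        for x, y in zip(xs, ys)
    )


# Hypothetical simulated data: y_i = 2 x_i + eps_i with eps_i ~ N(0, 1.5^2).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0.0, 1.5) for x in xs]

# Closed forms from the text, specialised to scalar x_i.
beta_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
sigma2_hat = sum((y - x * beta_hat) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Log likelihood at the closed-form estimates ...
best = log_likelihood(beta_hat, sigma2_hat, xs, ys)

# ... and at nearby parameter values, which should all be strictly lower.
perturbed = [
    log_likelihood(beta_hat + db, sigma2_hat * scale, xs, ys)
    for db in (-0.1, 0.0, 0.1)
    for scale in (0.9, 1.0, 1.1)
    if (db, scale) != (0.0, 1.0)
]

print("beta_hat:", beta_hat)
print("sigma2_hat:", sigma2_hat)
print("log likelihood at estimates:", best)
print("largest log likelihood among perturbed points:", max(perturbed))
```

A comparison against a handful of nearby points is only a sanity check, not a proof; the first-order conditions of $L(\beta, \sigma^2)$ give the general argument.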
This approach is more general, allowing us to deal with more complex nonlinear models. We
will first look at the construction of the likelihood function itself in the setting of a particular
model under various sampling schemes.
In general the likelihood function is the joint density of the data viewed as a function
of the parameters. Suppose we have independent and identically distributed random variables
$z_1, \ldots, z_n$, with common density $f(z, \theta)$. Then the likelihood function given a sample
$z_1, \ldots, z_n$ is
$$\mathcal{L}(\theta) = \prod_{i=1}^{n} f(z_i, \theta).$$
Its logarithm is referred to as the log likelihood function:
$$L(\theta) = \ln \mathcal{L}(\theta) = \sum_{i=1}^{n} \ln f(z_i, \theta).$$
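A small numerical sketch of these two definitions follows. It assumes, purely for illustration, an exponential density $f(z, \theta) = \theta e^{-\theta z}$ and a made-up sample. Neither comes from the notes, and the function names are hypothetical. The sketch evaluates the likelihood as a product and the log likelihood as a sum, then compares a grid maximizer of the log likelihood with the known closed-form mle for this density, $n / \sum_i z_i$.

```python
import math


def exp_density(z, theta):
    """Exponential density f(z, theta) = theta * exp(-theta * z), for z >= 0."""
    return theta * math.exp(-theta * z)


def likelihood(theta, sample, density):
    """Likelihood: product of the densities over the i.i.d. sample."""
    return math.prod(density(z, theta) for z in sample)


def log_likelihood(theta, sample, density):
    """Log likelihood: sum of the log densities over the i.i.d. sample."""
    return sum(math.log(density(z, theta)) for z in sample)


# Hypothetical sample, chosen only for illustration.
sample = [0.8, 2.1, 0.3, 1.7, 4.0, 0.9]

# The log of the product agrees with the sum of the logs.
theta = 0.5
log_of_product = math.log(likelihood(theta, sample, exp_density))
sum_of_logs = log_likelihood(theta, sample, exp_density)
print("log of product:", log_of_product)
print("sum of logs:   ", sum_of_logs)

# Maximize the log likelihood over a coarse grid of theta values ...
grid = [0.05 * k for k in range(1, 61)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, sample, exp_density))

# ... and compare with the closed-form mle for the exponential density.
theta_closed = len(sample) / sum(sample)
print("grid maximizer:", theta_grid)
print("closed form:   ", theta_closed)
```

Working with the sum of logs rather than the product is also the numerically safer choice: with many observations the product of densities can underflow to zero, while the sum of logs stays well scaled.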
2 Example: duration model
Lancaster (1979) is interested in determining “the causes of variation between unemployed
persons in the length of time they are out of work .... bearing as it does upon the design and
effect of welfare policy.” He has data on unemployment durations of 479 unskilled workers,
as well as some of their individual characteristics such as age, the local unemployment rate
and the replacement ratio, measured as “how much they had coming in from all these sources
(unemployment benefit, supplementary benefit, and family income supplement) during the
main period of their unemployment”, divided by the answer to the question “how much did
you earn, after deductions, in your last job.” The coefficient on this last variable in
particular is viewed as relevant for social policy.
The economic theory underlying Lancaster’s analysis is job search theory. An unemployed
individual is assumed to receive job offers, arriving according to some rate $\lambda(t)$, such that
the expected number of job offers arriving in a short interval of length $dt$ is $\lambda(t)\,dt$. Each offer
consists of some wage rate $w$, drawn independently of previous wages, from some distribution
with distribution function $F(w)$. The offer is compared to some reservation wage $\bar{w}(t)$, and
if the offer is better than the reservation wage, that is with probability $1 - F(\bar{w}(t))$, the offer
is accepted. The reservation wage is set to maximize utility. Suppose that the arrival rate
is constant over time. In that case the optimal reservation wage is also constant over time,
and the probability of receiving an acceptable offer in a short period of time $dt$ is