§3 Likelihood and Sufficiency

§3.1 Introduction
3.1.1 Sample data: $X = (X_1, \ldots, X_n)$.

Postulated parametric family of probability functions (i.e. either mass functions or pdf's): $\{p(\cdot \mid \theta) : \theta \in \Theta\}$.
e.g. —

• $X_1, \ldots, X_n$ are iid Poisson($\theta$):
$$p(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^{n} P(X_i = x_i \mid \theta) = \prod_{i=1}^{n} \frac{e^{-\theta}\,\theta^{x_i}}{x_i!}, \qquad \theta \in (0, \infty).$$
• $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$:
$$p(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x_i - \mu)^2}{2\sigma^2}\right\}, \qquad \theta = (\mu, \sigma) \in (-\infty, \infty) \times (0, \infty).$$
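As a concrete sketch (the function names and sample values below are illustrative, not part of the notes), the two joint probability functions above can be computed directly as products over the sample:

```python
import math

def poisson_joint_pmf(xs, theta):
    # prod_{i=1}^n e^{-theta} * theta^{x_i} / x_i!  for iid Poisson(theta) data
    return math.prod(math.exp(-theta) * theta ** x / math.factorial(x) for x in xs)

def normal_joint_pdf(xs, mu, sigma):
    # prod_{i=1}^n (2*pi*sigma^2)^{-1/2} * exp(-(x_i - mu)^2 / (2*sigma^2))
    return math.prod(
        math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
        for x in xs
    )
```

Because the observations are iid, both joint probability functions factorise into a product of identical one-observation terms, which is exactly what the code multiplies out.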
If we observe $X = x = (x_1, \ldots, x_n)$, what can we learn from $x$ about the true value of $\theta$? We wish to answer this question by means of statistical inference.
3.1.2 Raw sample data, i.e. $x$, contain information relevant to our inference about $\theta$. We want to extract such information completely yet "economically". This amounts to finding an efficient way to "summarize" the data.
Answer: Likelihood function — a mathematical device summarizing all information available in $x$ which is relevant to $\theta$. It measures the plausibility of each $\theta \in \Theta$ being the true $\theta$ that gives rise to $x$.
3.1.3 Suppose the random vector $X$ has a probability function belonging to the parametric family $\{p(\cdot \mid \theta) : \theta \in \Theta\}$.
Definition. Given that $X$ is observed (realised) to be $x$, the likelihood function of $\theta$ is defined to be
$$\ell_x(\theta) = p(x \mid \theta),$$
i.e. the probability function of $X$, evaluated at $X = x$, but considered as a function of $\theta$.