8.3. ESTIMATION OF GARCH(P,Q) MODELS
of $s$ summands such that the number of summands $s$ increases with the sample size (see Hall and Yao (2003)).
We then merge all parameters of the model as follows: $\phi = (c, \phi_1, \ldots, \phi_r)'$, $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_p)'$ and $\beta = (\beta_1, \ldots, \beta_q)'$.
For a given realization $x = (x_1, x_2, \ldots, x_T)$ the likelihood function conditional on $x$, $L(\phi, \alpha, \beta \mid x)$, is defined as
$$L(\phi, \alpha, \beta \mid x) = f(x_1, x_2, \ldots, x_{s-1}) \prod_{t=s}^{T} f(x_t \mid X_{t-1})$$
where in $X_{t-1}$ the random variables are replaced by their realizations. The likelihood function can be seen as the probability of observing the data at hand given the values for the parameters.
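For concreteness, the conditional density product can be sketched numerically. The example below is a hypothetical illustration, not the book's own code: it assumes a GARCH(1,1) recursion $\sigma_t^2 = \alpha_0 + \alpha_1 z_{t-1}^2 + \beta_1 \sigma_{t-1}^2$ with normal innovations and placeholder parameter values, and accumulates the product of conditional normal densities in logs for numerical stability:

```python
import numpy as np

def garch11_cond_variances(z, alpha0, alpha1, beta1):
    """Conditional variances sigma_t^2 for a GARCH(1,1), given residuals z."""
    sigma2 = np.empty_like(z)
    # start the recursion at the unconditional variance (one ad hoc choice)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)
    for t in range(1, len(z)):
        sigma2[t] = alpha0 + alpha1 * z[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(0)
z = rng.standard_normal(500)  # placeholder residuals x_t - c
sigma2 = garch11_cond_variances(z, alpha0=0.1, alpha1=0.1, beta1=0.8)

# log of the product of conditional normal densities f(x_t | X_{t-1})
loglik = np.sum(-0.5 * np.log(2 * np.pi)
                - 0.5 * np.log(sigma2)
                - 0.5 * z**2 / sigma2)
```

Working in logs turns the product of $T - s + 1$ densities into a sum, which avoids numerical underflow for long samples.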
The method of maximum likelihood then consists in choosing the parameters $(\phi, \alpha, \beta)$ such that the likelihood function is maximized. Thus we choose the parameters so that the probability of observing the data is maximized. In this way we obtain the maximum likelihood estimator. Taking the first $s$ realizations as given deterministic starting values, we then get the conditional likelihood function.
In practice we do not maximize the likelihood function itself but its logarithm, where we take $f(x_1, \ldots, x_{s-1})$ as a fixed constant which can be neglected in the optimization:
$$\log L(\phi, \alpha, \beta \mid x) = \sum_{t=s}^{T} \log f(x_t \mid X_{t-1}) = -\frac{T}{2}\log 2\pi - \frac{1}{2}\sum_{t=s}^{T} \log \sigma_t^2 - \frac{1}{2}\sum_{t=s}^{T} \frac{z_t^2}{\sigma_t^2}$$
where $z_t = x_t - c - \phi_1 x_{t-1} - \ldots - \phi_r x_{t-r}$ denotes the realization of $Z_t$. The maximum likelihood estimator is obtained by maximizing the likelihood function over the admissible parameter space. Usually, the implementation of the stationarity condition and the condition for the existence of the fourth moment turns out to be difficult and cumbersome, so that these conditions are often neglected and only checked in retrospect, or some ad hoc solutions are envisaged. It can be shown that the (conditional) maximum likelihood estimator leads to asymptotically normally distributed estimates.^{13}
The maximum likelihood estimator remains meaningful even when $\{\nu_t\}$ is not normally distributed. In this case one obtains the quasi maximum likelihood estimator (see Hall and Yao (2003) and Fan and Yao (2003)).
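As a sketch of such a maximization, not the book's own implementation: the conditional log-likelihood above can be maximized numerically, e.g. with `scipy.optimize.minimize`. The example assumes a GARCH(1,1) with a constant mean $c$ (so $z_t = x_t - c$), simulates hypothetical data, and, in the spirit of the text, does not impose the stationarity condition $\alpha_1 + \beta_1 < 1$ during the optimization:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, x):
    """Negative conditional Gaussian log-likelihood of a GARCH(1,1)."""
    c, alpha0, alpha1, beta1 = params
    z = x - c
    sigma2 = np.empty_like(z)
    sigma2[0] = np.var(z)  # ad hoc starting value for the recursion
    for t in range(1, len(z)):
        sigma2[t] = alpha0 + alpha1 * z[t - 1] ** 2 + beta1 * sigma2[t - 1]
    if np.any(sigma2 <= 0):
        return np.inf  # keep the optimizer away from invalid variances
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(sigma2) + z**2 / sigma2)

# simulate data from a GARCH(1,1) with hypothetical true parameters
rng = np.random.default_rng(1)
T, c, a0, a1, b1 = 2000, 0.0, 0.1, 0.1, 0.8
x = np.empty(T)
s2 = a0 / (1 - a1 - b1)
for t in range(T):
    x[t] = c + np.sqrt(s2) * rng.standard_normal()
    s2 = a0 + a1 * (x[t] - c) ** 2 + b1 * s2

res = minimize(neg_loglik, x0=[0.0, 0.05, 0.05, 0.9], args=(x,),
               method="Nelder-Mead")
c_hat, a0_hat, a1_hat, b1_hat = res.x
```

Whether the fitted $\hat{\alpha}_1 + \hat{\beta}_1 < 1$ would, as the text notes, only be checked in retrospect here.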
For numerical reasons it is often convenient to treat the mean equation
and the variance equation separately.
As the mean equation is a simple
^{13}Jensen and Rahbek (2004) showed that, at least for the GARCH(1,1) case, the stationarity condition is not necessary.