Copyright © 2010 by Karl Sigman
1 Coupling from the past for Markov chains
Given a discrete-time Markov chain (MC) $\{X_n : n \ge 0\}$, with state space $S$ (assumed here to be discrete), and transition matrix $P = (P_{ij})$ that is known to have a unique stationary (limiting) distribution $\pi = (\pi_j)_{j \in S}$, it is of interest to be able to simulate copies of rvs $X$ distributed as $\pi$. (We have in mind that a priori $\pi$ is unknown, and that solving for it is not possible or would entail an enormous amount of time/computation.)
Assuming one can simulate the MC step-by-step, we could simulate it up to some large time $n$ and estimate the value of $\pi_j$ (for a given fixed $j \in S$) by using the estimate
$$\hat{\pi}_j(n) = \frac{1}{n}\sum_{i=1}^{n} I\{X_i = j\},$$
the proportion of visits to state $j$ out of these $n$ time units. Doing this for all $j \in S$ would then yield an estimate $\hat{\pi}(n)$ for $\pi$. Then we could use the inverse-transform method on $\hat{\pi}(n)$ to generate a copy $\hat{X}$ distributed exactly as $\hat{\pi}(n)$ and hence approximately as $\pi$.
This yields only an approximation to what we want, and moreover, it is far from clear how "close" our approximation is; not only does $\hat{\pi}(n)$ depend on $n$, but it also depends on the particular sample path of the MC that was used to construct it (including the initial state, $X_0$, that was chosen).
The purpose of these notes is to introduce the reader to a simulation method that yields a copy of $X$ distributed exactly as $\pi$. This is referred to as \emph{exact} or \emph{perfect} simulation. The method here is called \emph{coupling from the past}, because as we shall see, it requires (conceptually) constructing the MC from the infinite past up to time 0, as is done when utilizing Loynes' Lemma (Lemma 1 in [3]) in the context of monotone stochastic recursions. The method was introduced in the paper [4], with earlier examples and fundamental results concerning exact simulation presented in [2] (where the limitations of such methods (in general) are laid out too). A more recent exposition with simple examples is in [1], Chapter 4, Section 8.
1.1 The framework
Assume that $S = \{0, 1, \ldots, b\}$ is finite (some $b \ge 1$), and for each $i \in S$ let $Y(i)$ denote a rv distributed as the $i^{th}$ row of $P$:
$$P(Y(i) = j) = P(X_{n+1} = j \mid X_n = i) = P_{ij}, \quad j \in S.$$
$Y(i)$ is distributed as "the next state visited if currently the chain is in state $i$".
Let $Y = (Y(0), Y(1), \ldots, Y(b))$, where for now we do not specify the joint distribution of this random vector, only the marginal distributions. Let $\{Y_n : n \ge 1\}$ denote an iid sequence of such random vectors; $Y_n = (Y_n(0), Y_n(1), \ldots, Y_n(b))$. A key point here for future reference is that if at any time $n$ it holds that $X_n = i$, then a copy of $X_{n+1}$ can be constructed by taking an independent copy of $Y$ (denoted by $Y_n$) and defining $X_{n+1} = Y_n(i)$; that is, $X_{n+1} = Y_n(X_n)$. Thus, given any initial value for $X_0$, chosen independently of $\{Y_n : n \ge 1\}$, we can recursively construct the MC (forwards in time)