Copyright © 2009 by Karl Sigman
1 Discrete-time Markov chains
1.1 Stochastic processes in discrete time
A stochastic process in discrete time $n \in \mathbb{N} = \{0, 1, 2, \ldots\}$ is a sequence of random variables (rvs) $X_0, X_1, X_2, \ldots$ denoted by $X = \{X_n : n \geq 0\}$ (or just $X = \{X_n\}$). We refer to the value $X_n$ as the state of the process at time $n$, with $X_0$ denoting the initial state.
If the random variables take values in a discrete space such as the integers $\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$ (or some subset of them), then the stochastic process is said to be discrete-valued; we then denote the states by $i$, $j$ and so on. In general, however, the collection of possible values that the $X_n$ can take on is called the state space, is denoted by $S$ and could be, for example, $d$-dimensional Euclidean space $\mathbb{R}^d$, $d \geq 1$, or a subset of it.
Stochastic processes are meant to model the evolution over time of real phenomena for which randomness is inherent. For example, $X_n$ could denote the price of a stock $n$ days from now, the population size of a given species after $n$ years, the amount of bandwidth in use in a telecommunications network after $n$ hours of operation, or the amount of money that an insurance risk company has right after it pays out its $n^{th}$ claim. The insurance risk example illustrates how “time” $n$ need not really be time, but instead can be a sequential indexing of some kind of events. Other such examples: $X_n$ denotes the amount of water in a reservoir after the $n^{th}$ rain storm of the year, $X_n$ denotes the amount of time that the $n^{th}$ arrival to a hospital must wait before being admitted, or $X_n$ denotes the outcome (heads or tails) of the $n^{th}$ flip of a coin.
The main challenge in stochastic modeling is choosing a model that has, on the one hand, enough complexity to capture the complexity of the phenomena in question, but has, on the other hand, enough structure and simplicity to allow one to compute things of interest. In the context of our examples given above, we may be interested in computing $P(X_{30} > 50)$ for a stock that we bought for $X_0 = \$35$ per share, or computing the probability that the insurance risk company eventually gets ruined (runs out of money), $P(X_n < 0 \text{ for some } n > 0)$, or computing the long-run average waiting time of arrivals to the hospital
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} X_n.$$
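As a rough illustration of how quantities of this kind might be estimated when no formula is at hand, the following Python sketch uses Monte Carlo simulation. The model is purely an assumption made for this example and is not specified in these notes: the price moves up or down by one dollar each day with equal probability, independently across days. The names simulate_path, estimate_probability, time_average, and coin_step are hypothetical helpers introduced only for illustration.

```python
import random


def simulate_path(x0, n_steps, step, rng):
    """Return [X_0, X_1, ..., X_n_steps], where X_{k+1} = X_k + step(rng)."""
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] + step(rng))
    return path


def estimate_probability(event, x0, n_steps, step, n_paths=20_000, seed=0):
    """Estimate P(event) as the fraction of simulated paths on which event holds."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        if event(simulate_path(x0, n_steps, step, rng)):
            hits += 1
    return hits / n_paths


def time_average(path):
    """Return (1/N) * (X_1 + ... + X_N) for a path [X_0, X_1, ..., X_N]."""
    n = len(path) - 1
    return sum(path[1:]) / n


def coin_step(rng):
    """ASSUMED dynamics (not from the notes): +1 or -1 with probability 1/2 each."""
    return 1 if rng.random() < 0.5 else -1


# Estimate P(X_30 > 50) for X_0 = 35 under the assumed +/-1 model.
p_above_50 = estimate_probability(lambda path: path[30] > 50, 35, 30, coin_step)
print(p_above_50)
```

Under this toy model the event is rare, so the estimate is noisy unless many paths are simulated. The ruin probability concerns an infinite time horizon, so a simulation of this kind could only approximate it by truncating at some finite number of steps.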
As a very simple example, consider the sequential tossing of a “fair” coin. We let $X_n$ denote the outcome of the $n^{th}$ toss. We can take the $X_n$ as $p = 0.5$ Bernoulli rvs, $P(X_n = 0) = P(X_n = 1) = 0.5$, with $X_n = 1$ denoting that the $n^{th}$ flip landed heads, and $X_n = 0$ denoting that it landed tails. We also assume that the rvs in the sequence are independent. This then yields an example of an independent and identically distributed (iid) sequence of rvs. Such sequences are easy to deal with, for they are defined by a single distribution (in this case Bernoulli) and are independent, hence lend themselves directly to powerful theorems in probability such as the strong law of large numbers and the central limit theorem.
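The strong law of large numbers can be observed numerically for the fair coin. The sketch below simulates iid Bernoulli(0.5) flips with Python's standard random module and tracks the running sample mean, which should settle near 0.5 as the number of flips grows. The names flip_coins and running_average are made up for this illustration, and the seed and sample size are arbitrary choices.

```python
import random


def flip_coins(n, p=0.5, seed=None):
    """Return a list of n iid Bernoulli(p) outcomes (1 = heads, 0 = tails)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def running_average(xs):
    """Return the sample means (X_1 + ... + X_n) / n for n = 1, ..., len(xs)."""
    total = 0
    averages = []
    for n, x in enumerate(xs, start=1):
        total += x
        averages.append(total / n)
    return averages


flips = flip_coins(100_000, seed=1)
averages = running_average(flips)

# Print the sample mean after 10, 1,000, and 100,000 flips.
for n in (10, 1_000, 100_000):
    print(n, averages[n - 1])
```

A single run is only suggestive: the strong law is a statement about almost every sample path, and a finite simulation shows just one path up to a finite time.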