Dividing both sides of (4.27) by λ^n and taking the limit as n → ∞, the right-hand side goes to 0, completing the proof.
Note that for a stochastic matrix [P] > 0, this corollary simplifies to lim_{n→∞} [P]^n = eπ. This means that lim_{n→∞} P^n_ij = π_j, i.e., the probability of being in state j after a long time is π_j, independent of the starting state.
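This limit is easy to check numerically. The following is a minimal sketch, assuming a hypothetical 3-state chain with [P] > 0; it raises [P] to a large power and observes that all rows agree, so P^n_ij no longer depends on the starting state i.

```python
# A minimal numerical sketch of the corollary, assuming a hypothetical
# 3-state chain with [P] > 0: every row of [P]^n converges to the same
# probability vector pi, so P^n_ij becomes independent of the start state i.

def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, n):
    """Compute [P]^n by repeated multiplication (n >= 1)."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result

# Hypothetical positive stochastic matrix: every entry > 0, each row sums to 1.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]

P50 = mat_pow(P, 50)
pi = P50[0]   # all three rows of [P]^50 agree to many decimal places
```

The matrix and state count here are illustrative; any strictly positive stochastic matrix behaves the same way.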
Theorem 4.7. Let [P] be the transition matrix of an ergodic finite-state Markov chain. Then λ = 1 is the largest real eigenvalue of [P], and λ > |µ| for every other eigenvalue µ. Furthermore, lim_{n→∞} [P]^n = eπ, where π > 0 is the unique probability vector satisfying π[P] = π and e = (1, 1, . . . , 1)^T is the unique vector ν (within a scale factor) satisfying [P]ν = ν.
Proof: From Corollary 4.3, λ = 1 is the largest real eigenvalue of [P], e is the unique (within a scale factor) right eigenvector for λ = 1, and there is a unique probability vector π such that π[P] = π. From Theorem 4.4, [P]^m is positive for sufficiently large m. Since [P]^m is also stochastic, λ = 1 is strictly larger than the magnitude of any other eigenvalue of [P]^m. Let µ be any other eigenvalue of [P] and let x be a right eigenvector for µ. Note that x is also a right eigenvector of [P]^m with eigenvalue µ^m. Since λ = 1 is the only eigenvalue of [P]^m of magnitude 1 or more, we either have |µ| < λ or µ^m = λ. If µ^m = λ, then x must be a scalar times e. This is impossible, since x cannot be an eigenvector of [P] with both eigenvalue λ and eigenvalue µ ≠ λ. Thus |µ| < λ. Similarly, π > 0 is the unique left eigenvector of [P]^m with eigenvalue λ = 1, and πe = 1. Corollary 4.6 then asserts that lim_{n→∞} [P]^{mn} = eπ. Multiplying by [P]^i for any i, 1 ≤ i < m, we get lim_{n→∞} [P]^{mn+i} = eπ, so lim_{n→∞} [P]^n = eπ.
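The spectral facts in the theorem can be illustrated numerically. The sketch below uses a hypothetical ergodic 3-state chain: the row sums confirm [P]e = e, and power iteration from an arbitrary starting distribution converges to the unique π with π[P] = π, precisely because every other eigenvalue has magnitude below λ = 1.

```python
# A numerical sketch of the spectral facts in Theorem 4.7, for a
# hypothetical ergodic 3-state chain: [P]e = e (rows sum to 1), and power
# iteration from any starting distribution converges to the unique pi with
# pi[P] = pi, because every other eigenvalue has magnitude below lambda = 1.

def vec_mat(v, p):
    """Row vector times matrix: returns v[P]."""
    n = len(v)
    return [sum(v[i] * p[i][j] for i in range(n)) for j in range(n)]

P = [[0.7, 0.2, 0.1],
     [0.3, 0.4, 0.3],
     [0.2, 0.5, 0.3]]

row_sums = [sum(row) for row in P]   # each equals 1, i.e. [P]e = e

pi = [1.0, 0.0, 0.0]                 # start deterministically in state 0
for _ in range(200):
    pi = vec_mat(pi, P)              # pi converges to the stationary vector

pi_P = vec_mat(pi, P)                # stationarity check: pi[P] = pi
```

The specific matrix is an assumption made for illustration; any ergodic finite-state chain gives the same qualitative picture.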
Theorem 4.7 generalizes easily to an ergodic unichain (see Exercise 4.15). In this case, as one might suspect, π_i = 0 for each transient state i and π_i > 0 within the ergodic class. Theorem 4.7 becomes:
Theorem 4.8. Let [P] be the transition matrix of an ergodic unichain. Then λ = 1 is the largest real eigenvalue of [P], and λ > |µ| for every other eigenvalue µ. Furthermore,

    lim_{m→∞} [P]^m = eπ,        (4.28)

where π ≥ 0 is the unique probability vector satisfying π[P] = π and e = (1, 1, . . . , 1)^T is the unique ν (within a scale factor) satisfying [P]ν = ν.
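A small sketch of the unichain case, with a hypothetical 3-state matrix in which state 0 is transient and states 1 and 2 form the ergodic class; [P]^n still converges to eπ, but now with π_0 = 0, as the theorem predicts.

```python
# A sketch of Theorem 4.8 on a hypothetical ergodic unichain: state 0 is
# transient (it can be left but never re-entered), states 1 and 2 form the
# ergodic class. [P]^n still converges to e pi, but now pi_0 = 0.

def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.5, 0.25, 0.25],   # transient state 0: may stay or leave...
     [0.0, 0.4,  0.6 ],   # ...but states 1 and 2 never lead back to 0
     [0.0, 0.7,  0.3 ]]

Pn = P
for _ in range(99):       # Pn = [P]^100
    Pn = mat_mul(Pn, P)
pi = Pn[1]                # all rows agree; pi_0 = 0.5**100, essentially zero
```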
If a chain has a periodic recurrent class, [P]^n never converges. The existence of a unique probability vector solution to π[P] = π for a periodic recurrent chain is somewhat mystifying at first. If the period is d, then the steady-state vector π assigns probability 1/d to each of the d subsets of Theorem 4.3. If the initial probabilities for the chain are chosen as Pr{X0 = i} = π_i for each i, then for each subsequent time n, Pr{Xn = i} = π_i. What is happening is that this initial probability assignment starts the chain in each of the d subsets with probability 1/d, and subsequent transitions maintain this randomness over subsets. On the other hand, [P]^n cannot converge because P^n_ii, for each i, is zero except when n is a multiple of d. Thus the memory of the starting state never dies out. An ergodic Markov chain does not have this peculiar property, and the memory of the starting state dies out (from Theorem 4.7).
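The simplest periodic case (d = 2, two states that deterministically swap) makes the point concrete: π = (1/2, 1/2) solves π[P] = π and is preserved forever, yet a deterministic start oscillates without converging.

```python
# A sketch of the periodic case with d = 2 (two states that deterministically
# swap): [P]^n never converges, since P^n_ii = 0 for odd n, yet
# pi = (1/2, 1/2) solves pi[P] = pi, and a chain started with pi keeps
# Pr{X_n = i} = pi_i at every step.

def vec_mat(v, p):
    """Row vector times matrix: returns v[P]."""
    n = len(v)
    return [sum(v[i] * p[i][j] for i in range(n)) for j in range(n)]

P = [[0.0, 1.0],
     [1.0, 0.0]]

dist = [0.5, 0.5]                # start with the steady-state vector pi
steady = []
for _ in range(6):
    dist = vec_mat(dist, P)
    steady.append(dist)          # stays (0.5, 0.5) at every step

d = [1.0, 0.0]                   # start deterministically in state 0
oscillating = []
for _ in range(4):
    d = vec_mat(d, P)
    oscillating.append(d)        # alternates (0,1), (1,0), (0,1), (1,0)
```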
The intuition to be associated with the word ergodic is that of a process in which time-averages are equal to ensemble-averages. Using the general definition of ergodicity (which is beyond our scope here), a periodic recurrent Markov chain in steady-state (i.e., with Pr{Xn = i} = π_i for all n and i) is ergodic.
Thus the notion of ergodicity for Markov chains is slightly different from that in the general theory. The difference is that we think of a Markov chain as being specified without specifying the initial state distribution, and thus different initial state distributions really correspond to different stochastic processes. If a periodic Markov chain starts in steady state, then the corresponding stochastic process is stationary, and otherwise not.

4.5 Markov chains with rewards

Suppose that each state i in a Markov chain is associated with some reward, r_i. As the
Markov chain proceeds from state to state, there is an associated sequence of rewards that are not independent, but are related by the statistics of the Markov chain. The situation is similar to, but simpler than, that of renewal-reward processes. As with renewal-reward processes, the reward r_i could equally well be a cost or an arbitrary real-valued function of the state. In this section, the expected value of the aggregate reward over time is analyzed.

The model of Markov chains with rewards is surprisingly broad. We have already seen that almost any stochastic process can be approximated by a Markov chain. Also, as we saw in studying renewal theory, the concept of rewards is quite graphic not only in modeling such things as corporate profits or portfolio performance ...
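As a concrete sketch of the aggregate-reward quantity this section analyzes (the chain and reward vector below are hypothetical), the expected reward accumulated over n steps from state i, v_n(i) = E[r(X_0) + · · · + r(X_{n−1}) | X_0 = i], can be computed by the recursion v_{k+1} = r + [P]v_k with v_0 = 0:

```python
# Hypothetical 2-state chain with per-state rewards r. The recursion
# v_{k+1} = r + [P] v_k (with v_0 = 0) computes the expected aggregate
# reward v_n(i) over n steps starting from state i.

def mat_vec(p, v):
    """Matrix times column vector: returns [P]v."""
    n = len(v)
    return [sum(p[i][j] * v[j] for j in range(n)) for i in range(n)]

P = [[0.6, 0.4],
     [0.3, 0.7]]
r = [2.0, 5.0]           # reward r_i earned each time the chain occupies state i

v = [0.0, 0.0]
for _ in range(200):     # v is now v_200
    v = [r[i] + x for i, x in enumerate(mat_vec(P, v))]

# For large n, v_n(i) grows by the same amount each step: the steady-state
# gain g = pi . r (for this chain pi = (3/7, 4/7), so g = 26/7 per step).
v_next = [r[i] + x for i, x in enumerate(mat_vec(P, v))]
gain = v_next[0] - v[0]
```

The per-step gain g and the constant offsets between states are exactly the quantities that the analysis of aggregate reward goes on to characterize.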
This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R. Srikant during the Spring '09 term at University of Illinois, Urbana-Champaign.