Discrete-time stochastic processes

# This is a simple explicit expression for expected


Dividing both sides of (4.27) by $\lambda^n$ and taking the limit of both sides of (4.27) as $n \to \infty$, the right hand side goes to 0, completing the proof.

Note that for a stochastic matrix $[P] > 0$, this corollary simplifies to $\lim_{n\to\infty} [P]^n = e\pi$. This means that $\lim_{n\to\infty} P_{ij}^n = \pi_j$, which means that the probability of being in state $j$ after a long time is $\pi_j$, independent of the starting state.

Theorem 4.7. Let $[P]$ be the transition matrix of an ergodic finite-state Markov chain. Then $\lambda = 1$ is the largest real eigenvalue of $[P]$, and $\lambda > |\mu|$ for every other eigenvalue $\mu$. Furthermore, $\lim_{n\to\infty} [P]^n = e\pi$, where $\pi > 0$ is the unique probability vector satisfying $\pi[P] = \pi$ and $e = (1, 1, \ldots, 1)^T$ is the unique vector $\nu$ (within a scale factor) satisfying $[P]\nu = \nu$.

Proof: From Corollary 4.3, $\lambda = 1$ is the largest real eigenvalue of $[P]$, $e$ is the unique (within a scale factor) right eigenvector of $\lambda = 1$, and there is a unique probability vector $\pi$ such that $\pi[P] = \pi$. From Theorem 4.4, $[P]^m$ is positive for sufficiently large $m$. Since $[P]^m$ is also stochastic, $\lambda = 1$ is strictly larger than the magnitude of any other eigenvalue of $[P]^m$. Let $\mu$ be any other eigenvalue of $[P]$ and let $x$ be a right eigenvector of $\mu$. Note that $x$ is also a right eigenvector of $[P]^m$ with eigenvalue $\mu^m$. Since $\lambda = 1$ is the only eigenvalue of $[P]^m$ of magnitude 1 or more, we either have $|\mu| < \lambda$ or $\mu^m = \lambda$. If $\mu^m = \lambda$, then $x$ must be a scalar times $e$. This is impossible, since $x$ cannot be an eigenvector of $[P]$ with both eigenvalue $\lambda$ and $\mu$. Thus $|\mu| < \lambda$. Similarly, $\pi > 0$ is the unique left eigenvector of $[P]^m$ with eigenvalue $\lambda = 1$, and $\pi e = 1$. Corollary 4.6 then asserts that $\lim_{n\to\infty} [P]^{mn} = e\pi$. Multiplying by $[P]^i$ for any $i$, $1 \le i < m$, we get $\lim_{n\to\infty} [P]^{mn+i} = e\pi$, so $\lim_{n\to\infty} [P]^n = e\pi$.

Theorem 4.7 generalizes easily to an ergodic unichain (see Exercise 4.15).
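
As a numerical illustration of Theorem 4.7, the following minimal sketch checks that every row of a high power of the transition matrix approaches the steady-state vector. The 3-state matrix `P` and the helper names `mat_mul`, `mat_pow`, and `steady_state` are assumptions made up for this example and do not come from the text. The steady-state vector is approximated by simply iterating the row-vector update from a uniform start, which is one convenient choice among several.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, n):
    """Return the n-th power of a square matrix p for an integer n >= 0."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


def steady_state(p, iterations=1000):
    """Approximate the row vector pi with pi [P] = pi by repeated updates."""
    size = len(p)
    pi = [1.0 / size] * size
    for _ in range(iterations):
        pi = [sum(pi[i] * p[i][j] for i in range(size)) for j in range(size)]
    return pi


# Hypothetical ergodic chain: every entry is positive, so [P] > 0.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.1, 0.5],
]

pi = steady_state(P)
P_50 = mat_pow(P, 50)

# Largest deviation between any row of [P]^50 and pi.
max_row_gap = max(abs(P_50[i][j] - pi[j]) for i in range(3) for j in range(3))

print("pi      =", [round(x, 6) for x in pi])
for row in P_50:
    print("row     =", [round(x, 6) for x in row])
print("max gap =", max_row_gap)
```

Each printed row of the 50th power should agree with `pi` to within floating-point error, which is the statement that the starting state is forgotten.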
For an ergodic unichain, as one might suspect, $\pi_i = 0$ for each transient state $i$ and $\pi_i > 0$ within the ergodic class. Theorem 4.7 becomes:

Theorem 4.8. Let $[P]$ be the transition matrix of an ergodic unichain. Then $\lambda = 1$ is the largest real eigenvalue of $[P]$, and $\lambda > |\mu|$ for every other eigenvalue $\mu$. Furthermore,

$$\lim_{m\to\infty} [P]^m = e\pi, \tag{4.28}$$

where $\pi \ge 0$ is the unique probability vector satisfying $\pi[P] = \pi$ and $e = (1, 1, \ldots, 1)^T$ is the unique $\nu$ (within a scale factor) satisfying $[P]\nu = \nu$.

If a chain has a periodic recurrent class, $[P]^m$ never converges. The existence of a unique probability vector solution to $\pi[P] = \pi$ for a periodic recurrent chain is somewhat mystifying at first. If the period is $d$, then the steady-state vector $\pi$ assigns probability $1/d$ to each of the $d$ subsets of Theorem 4.3. If the initial probabilities for the chain are chosen as $\Pr\{X_0 = i\} = \pi_i$ for each $i$, then for each subsequent time $n$, $\Pr\{X_n = i\} = \pi_i$. What is happening is that this initial probability assignment starts the chain in each of the $d$ subsets with probability $1/d$, and subsequent transitions maintain this randomness over subsets. On the other hand, $[P]^n$ cannot converge because $P_{ii}^n$, for each $i$, is zero except when $n$ is a multiple of $d$. Thus the memory of the starting state never dies out. An ergodic Markov chain does not have this peculiar property, and the memory of the starting state dies out (from Theorem 4.7).

The intuition to be associated with the word ergodic is that of a process in which time-averages are equal to ensemble-averages. Using the general definition of ergodicity (which is beyond our scope here), a periodic recurrent Markov chain in steady-state (i.e., with $\Pr\{X_n = i\} = \pi_i$ for all $n$ and $i$) is ergodic. Thus the notion of ergodicity for Markov chains is slightly different than that in the general theory.
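
The behavior of a periodic recurrent chain described above can be checked numerically. The sketch below uses a hypothetical 4-state chain of period 2 (a walk on a 4-cycle that always moves to a neighbor); the matrix `P`, the helper `step`, and the chosen probabilities 0.7 and 0.3 are assumptions for illustration only. It shows that a chain started with the steady-state probabilities keeps them at every later time, while the probability of returning to the starting state is zero at odd times.

```python
def step(dist, p):
    """One transition: return the row vector dist [P]."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]


# Hypothetical periodic chain with period d = 2: the chain alternates
# between the subsets {0, 2} and {1, 3}.
P = [
    [0.0, 0.7, 0.0, 0.3],
    [0.3, 0.0, 0.7, 0.0],
    [0.0, 0.3, 0.0, 0.7],
    [0.7, 0.0, 0.3, 0.0],
]

# For this particular chain the uniform vector solves pi [P] = pi.
pi = [0.25, 0.25, 0.25, 0.25]

# Started in steady state, the distribution stays at pi.
from_pi = pi
for _ in range(5):
    from_pi = step(from_pi, P)

# Started in state 0, record the probability of being back in state 0
# after n = 1, ..., 8 transitions.
dist = [1.0, 0.0, 0.0, 0.0]
return_probs = []
for n in range(1, 9):
    dist = step(dist, P)
    return_probs.append(dist[0])

print("after 5 steps from pi:", [round(x, 6) for x in from_pi])
print("return probabilities, n = 1..8:", [round(x, 4) for x in return_probs])
print("pi mass on {0, 2}:", pi[0] + pi[2], "and on {1, 3}:", pi[1] + pi[3])
```

The return probabilities are zero for odd `n` and stay well away from zero for even `n`, so they do not settle to a single limit even though `pi` itself is preserved.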
The difference from the general theory is that we think of a Markov chain as being specified without specifying the initial state distribution, and thus different initial state distributions really correspond to different stochastic processes. If a periodic Markov chain starts in steady state, then the corresponding stochastic process is stationary, and otherwise not.

## 4.5 Markov chains with rewards

Suppose that each state $i$ in a Markov chain is associated with some reward, $r_i$. As the Markov chain proceeds from state to state, there is an associated sequence of rewards that are not independent, but are related by the statistics of the Markov chain. The situation is similar to, but simpler than, that of renewal-reward processes. As with renewal-reward processes, the reward $r_i$ could equally well be a cost or an arbitrary real-valued function of the state. In this section, the expected value of the aggregate reward over time is analyzed.

The model of Markov chains with rewards is surprisingly broad. We have already seen that almost any stochastic process can be approximated by a Markov chain. Also, as we saw in studying renewal theory, the concept of rewards is quite graphic not only in modeling such things as corporate profits or portfolio performan...

## This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R.srikant during the Spring '09 term at University of Illinois, Urbana Champaign.
