4. Initial conditions. Show, for a Markov chain, that
$$H(X_0 \mid X_n) \ge H(X_0 \mid X_{n-1}).$$
Thus, initial conditions $X_0$ become more difficult to recover as the future $X_n$ unfolds.

Solution: Initial conditions. For a Markov chain, $X_0 \to X_{n-1} \to X_n$, so by the data processing theorem we have
$$I(X_0; X_{n-1}) \ge I(X_0; X_n).$$
Therefore
$$H(X_0) - H(X_0 \mid X_{n-1}) \ge H(X_0) - H(X_0 \mid X_n),$$
that is, $H(X_0 \mid X_n) \ge H(X_0 \mid X_{n-1})$, so $H(X_0 \mid X_n)$ is nondecreasing in $n$.

5. Stationary processes. Let $\ldots, X_{-1}, X_0, X_1, \ldots$ be a stationary (not necessarily Markov) stochastic process. Which of the following statements are true? Prove or provide a counterexample.
(a) $H(X_n \mid X_0) = H(X_{-n} \mid X_0)$.
(b) $H(X_n \mid X_0) \ge H(X_{n-1} \mid X_0)$.
(c) $H(X_n \mid X_1, X_2, \ldots, X_{n-1}, X_{n+1})$ is nonincreasing in $n$.
(d) $H(X_n \mid X_1, \ldots, X_{n-1}, X_{n+1}, \ldots, X_{2n})$ is nonincreasing in $n$.

Solution: Stationary processes. Throughout, $X_i^j$ denotes $(X_i, X_{i+1}, \ldots, X_j)$.

(a) $H(X_n \mid X_0) = H(X_{-n} \mid X_0)$.
This statement is true, since
$$H(X_n \mid X_0) = H(X_n, X_0) - H(X_0) \tag{1}$$
$$H(X_{-n} \mid X_0) = H(X_{-n}, X_0) - H(X_0) \tag{2}$$
and $H(X_n, X_0) = H(X_{-n}, X_0)$ by stationarity. (Note that $\Pr(X_n = a \mid X_0 = b) \ne \Pr(X_0 = a \mid X_n = b)$ in general.)

(b) $H(X_n \mid X_0) \ge H(X_{n-1} \mid X_0)$.
This statement is not true in general, though it is true for first-order Markov chains. A simple counterexample is a periodic process with period $N \ge 2$. Let $X_0, X_1, X_2, \ldots, X_{N-1}$ be i.i.d. $\mathrm{Bern}(\tfrac{1}{2})$ random variables and let $X_{mN+k} = X_k$ for $k = 0, \ldots, N-1$ and $m = 1, -1, 2, -2, \ldots$. Note that this is a stationary process. In this case, for $n = mN$, $H(X_n \mid X_0) = 0$ and $H(X_{n-1} \mid X_0) = 1$, contradicting the statement $H(X_n \mid X_0) \ge H(X_{n-1} \mid X_0)$.

(c) $H(X_n \mid X_1^{n-1}, X_{n+1})$ is nonincreasing in $n$.
This statement is true, since by stationarity,
$$H(X_n \mid X_1^{n-1}, X_{n+1}) = H(X_{n+1} \mid X_2^n, X_{n+2}) \ge H(X_{n+1} \mid X_1^n, X_{n+2}),$$
where the inequality follows from the fact that conditioning reduces entropy.

(d) $H(X_{n+1} \mid X_1^n, X_{n+2}^{2n+1})$ is nonincreasing in $n$.
This statement is true, since by stationarity,
$$H(X_{n+1} \mid X_1^n, X_{n+2}^{2n+1}) = H(X_{n+2} \mid X_2^{n+1}, X_{n+3}^{2n+2}) \ge H(X_{n+2} \mid X_1^{n+1}, X_{n+3}^{2n+3}),$$
where the inequality follows from the fact that conditioning reduces entropy.

6. Recurrence times are insensitive to distributions. Let $X_0, X_1, X_2, \ldots$ be drawn i.i.d. $\sim p(x)$, $x \in \mathcal{X} = \{1, 2, \ldots, m\}$, and let $N$ be the waiting time to the next occurrence of $X_0$. Thus $N = \min\{n \ge 1 : X_n = X_0\}$.
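The waiting time $N$ defined above can be sampled directly, which is a convenient way to experiment with different distributions $p(x)$. The following is a minimal Python sketch and not part of the original problem set: the function names `sample_waiting_time` and `mean_waiting_time` are illustrative assumptions, and symbols are indexed from 0 rather than 1.

```python
import random


def sample_waiting_time(p, rng):
    """Draw X0 ~ p, then return N = min{n >= 1 : Xn = X0} for i.i.d. draws."""
    symbols = range(len(p))  # symbols 0..m-1 stand in for 1..m
    x0 = rng.choices(symbols, weights=p)[0]
    n = 1
    # Keep drawing X1, X2, ... until the initial symbol reappears.
    while rng.choices(symbols, weights=p)[0] != x0:
        n += 1
    return n


def mean_waiting_time(p, trials=100_000, seed=0):
    """Empirical average of N over independent repetitions."""
    rng = random.Random(seed)
    total = sum(sample_waiting_time(p, rng) for _ in range(trials))
    return total / trials
```

Calling `mean_waiting_time([0.5, 0.3, 0.2])` and `mean_waiting_time([1/3, 1/3, 1/3])` gives empirical averages that can be compared with each other and with the alphabet size $m = 3$.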

(a) Given $X_0 = i$, the expected time until we see it again is $1/p(i)$. Therefore,
$$EN = E[E(N \mid X_0)] = \sum_i p(X_0 = i) \left( \frac{1}{p(i)} \right) = m. \tag{3}$$

(b) By the same argument, given $X_0 = i$, $N$ has a geometric distribution with parameter $p(i)$, so
$$E(N \mid X_0 = i) = \frac{1}{p(i)}. \tag{4}$$
Then, using Jensen's inequality, we have
$$E \log N = \sum_i p(X_0 = i)\, E(\log N \mid X_0 = i) \tag{5}$$
$$\le \sum_i p(X_0 = i) \log E(N \mid X_0 = i) \tag{6}$$
$$= \sum_i p(i) \log \frac{1}{p(i)} \tag{7}$$
$$= H(X). \tag{8}$$

(c) The property that $EN = m$ is essentially a combinatorial property rather than a statement about expectations. We prove this for stationary ergodic sources. In essence, we will calculate the empirical average of the waiting time, and show that this converges to $m$. Since the process is ergodic, the empirical average converges to the expected value, and thus the expected value must be $m$.
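The identities in (a) and (b) can be checked numerically for a specific distribution. The following is a minimal Python sketch under the i.i.d. assumption of the problem; the helper names are hypothetical, logarithms are taken base 2, and the infinite series for $E \log N$ is truncated once the geometric tail probability drops below a tolerance, so that value is a slight underestimate.

```python
import math


def entropy(p):
    """H(X) in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def expected_N(p):
    """E N = sum_i p(i) * (1 / p(i)), as in equation (3)."""
    return sum(pi * (1.0 / pi) for pi in p if pi > 0)


def expected_log_N(p, tol=1e-12):
    """E log2 N, using N | X0 = i ~ geometric with parameter p(i)."""
    total = 0.0
    for pi in p:
        if pi <= 0 or pi >= 1:
            continue  # p(i) = 1 gives N = 1 and log N = 0
        q = 1.0 - pi
        tail = 1.0  # q^(n-1), so P(N = n | X0 = i) = tail * pi
        n = 1
        inner = 0.0
        while tail > tol:  # truncate the series once the tail is negligible
            inner += tail * pi * math.log2(n)
            tail *= q
            n += 1
        total += pi * inner
    return total
```

For example, with `p = [0.5, 0.25, 0.25]`, `expected_N(p)` returns the alphabet size 3 and `expected_log_N(p)` stays below `entropy(p)`, which is 1.5 bits, consistent with (3) and with the chain (5) to (8).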
