# Solution_3[1] - Chapter 3 The Asymptotic Equipartition Property

Chapter 3 The Asymptotic Equipartition Property

3.1. Markov's inequality and Chebyshev's inequality.

(a) (Markov's inequality.) For any non-negative random variable $X$ and any $t > 0$, show that
$$\Pr\{X \ge t\} \le \frac{EX}{t}. \qquad (3.1)$$
Exhibit a random variable that achieves this inequality with equality.

(b) (Chebyshev's inequality.) Let $Y$ be a random variable with mean $\mu$ and variance $\sigma^2$. By letting $X = (Y - \mu)^2$, show that for any $\epsilon > 0$,
$$\Pr\{|Y - \mu| > \epsilon\} \le \frac{\sigma^2}{\epsilon^2}. \qquad (3.2)$$

(c) (The weak law of large numbers.) Let $Z_1, Z_2, \ldots, Z_n$ be a sequence of i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. Let $\bar{Z}_n = \frac{1}{n} \sum_{i=1}^n Z_i$ be the sample mean. Show that
$$\Pr\left\{\left|\bar{Z}_n - \mu\right| > \epsilon\right\} \le \frac{\sigma^2}{n\epsilon^2}. \qquad (3.3)$$
Thus $\Pr\{|\bar{Z}_n - \mu| > \epsilon\} \to 0$ as $n \to \infty$. This is known as the weak law of large numbers.

Solution: Markov's inequality and Chebyshev's inequality.

(a) If $X$ has distribution $F(x)$,
$$EX = \int_0^\infty x \, dF = \int_0^\delta x \, dF + \int_\delta^\infty x \, dF \ge \int_\delta^\infty x \, dF \ge \int_\delta^\infty \delta \, dF = \delta \Pr\{X \ge \delta\}.$$
Rearranging sides and dividing by $\delta$ we get
$$\Pr\{X \ge \delta\} \le \frac{EX}{\delta}. \qquad (3.4)$$
One student gave a proof based on conditional expectations. It goes
$$EX = E(X \mid X \ge \delta) \Pr\{X \ge \delta\} + E(X \mid X < \delta) \Pr\{X < \delta\} \ge E(X \mid X \ge \delta) \Pr\{X \ge \delta\} \ge \delta \Pr\{X \ge \delta\},$$
which leads to (3.4) as well. Given $\delta$, the distribution achieving $\Pr\{X \ge \delta\} = \frac{EX}{\delta}$ is
$$X = \begin{cases} \delta & \text{with probability } \frac{\mu}{\delta} \\ 0 & \text{with probability } 1 - \frac{\mu}{\delta}, \end{cases}$$
where $\mu \le \delta$.

(b) Letting $X = (Y - \mu)^2$ in Markov's inequality,
$$\Pr\{(Y - \mu)^2 > \epsilon^2\} \le \Pr\{(Y - \mu)^2 \ge \epsilon^2\} \le \frac{E(Y - \mu)^2}{\epsilon^2} = \frac{\sigma^2}{\epsilon^2},$$
and noticing that $\Pr\{(Y - \mu)^2 > \epsilon^2\} = \Pr\{|Y - \mu| > \epsilon\}$, we get
$$\Pr\{|Y - \mu| > \epsilon\} \le \frac{\sigma^2}{\epsilon^2}.$$

(c) Letting $Y$ in Chebyshev's inequality from part (b) equal $\bar{Z}_n$, and noticing that $E\bar{Z}_n = \mu$ and $\operatorname{Var}(\bar{Z}_n) = \frac{\sigma^2}{n}$ (i.e., $\bar{Z}_n$ is the sum of $n$ i.i.d. r.v.'s $\frac{Z_i}{n}$, each with variance $\frac{\sigma^2}{n^2}$), we have
$$\Pr\left\{\left|\bar{Z}_n - \mu\right| > \epsilon\right\} \le \frac{\sigma^2}{n\epsilon^2}.$$
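The bound in part (c) is easy to check numerically. The sketch below (a Monte Carlo estimate; the helper name `chebyshev_wlln_demo` and the choice of Bernoulli(1/2) variables are just for illustration) compares the empirical probability $\Pr\{|\bar{Z}_n - \mu| > \epsilon\}$ with the Chebyshev bound $\sigma^2 / (n\epsilon^2)$:

```python
import random

def chebyshev_wlln_demo(n, eps, trials=20000, seed=0):
    """Empirically compare Pr{|Zbar_n - mu| > eps} with the Chebyshev
    bound sigma^2 / (n * eps^2), for i.i.d. Bernoulli(1/2) Z_i."""
    rng = random.Random(seed)
    mu, var = 0.5, 0.25  # mean and variance of a Bernoulli(1/2) variable
    exceed = 0
    for _ in range(trials):
        zbar = sum(rng.random() < 0.5 for _ in range(n)) / n  # sample mean
        if abs(zbar - mu) > eps:
            exceed += 1
    empirical = exceed / trials
    bound = var / (n * eps ** 2)
    return empirical, bound
```

For $n = 100$ and $\epsilon = 0.1$ the bound is $0.25/(100 \cdot 0.01) = 0.25$, while the empirical frequency is far smaller, consistent with Chebyshev being a loose but valid bound.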
3.2. AEP and mutual information. Let $(X_i, Y_i)$ be i.i.d. $\sim p(x, y)$. We form the log likelihood ratio of the hypothesis that $X$ and $Y$ are independent vs. the hypothesis that $X$ and $Y$ are dependent. What is the limit of
$$\frac{1}{n} \log \frac{p(X^n) p(Y^n)}{p(X^n, Y^n)}?$$

Solution:
$$\frac{1}{n} \log \frac{p(X^n) p(Y^n)}{p(X^n, Y^n)} = \frac{1}{n} \log \prod_{i=1}^n \frac{p(X_i) p(Y_i)}{p(X_i, Y_i)} = \frac{1}{n} \sum_{i=1}^n \log \frac{p(X_i) p(Y_i)}{p(X_i, Y_i)} \to E \log \frac{p(X) p(Y)}{p(X, Y)} = -I(X; Y).$$
Thus $\frac{p(X^n) p(Y^n)}{p(X^n, Y^n)} \to 2^{-nI(X;Y)}$, which will converge to 1 if $X$ and $Y$ are indeed independent.

3.6. An AEP-like limit. Let $X_1, X_2, \ldots$ be i.i.d. drawn according to probability mass function $p(x)$. Find
$$\lim_{n \to \infty} \left[ p(X_1, X_2, \ldots, X_n) \right]^{\frac{1}{n}}.$$

Solution: An AEP-like limit. $X_1, X_2, \ldots$ i.i.d. $\sim p(x)$. Hence $\log p(X_i)$ are also i.i.d. and
$$\lim \left( p(X_1, X_2, \ldots, X_n) \right)^{\frac{1}{n}} = \lim 2^{\frac{1}{n} \log p(X_1, X_2, \ldots, X_n)} = 2^{\lim \frac{1}{n} \sum \log p(X_i)} = 2^{E(\log p(X))} = 2^{-H(X)} \quad \text{a.e.}$$
by the strong law of large numbers (assuming of course that $H(X)$ exists).

3.7. The AEP and source coding. A discrete memoryless source emits a sequence of statistically independent binary digits with probabilities $p(1) = 0.005$ and $p(0) = 0.995$. The digits are taken 100 at a time and a binary codeword is provided for every sequence of 100 digits containing three or fewer ones.
(a) Assuming that all codewords are the same length, find the minimum length required to provide codewords for all sequences with three or fewer ones.

(b) Calculate the probability of observing a source sequence for which no codeword has been assigned.

(c) Use Chebyshev's inequality to bound the probability of observing a source sequence for which no codeword has been assigned. Compare this bound with the actual probability computed in part (b).

Solution: The AEP and source coding.

(a) The number of 100-bit binary sequences with three or fewer ones is
$$\binom{100}{0} + \binom{100}{1} + \binom{100}{2} + \binom{100}{3} = 1 + 100 + 4950 + 161700 = 166751.$$
The required codeword length is $\lceil \log_2 166751 \rceil = 18$. (Note that $H(0.005) = 0.0454$, so 18 is quite a bit larger than the 4.5 bits of entropy.)

(b) The probability that a 100-bit sequence has three or fewer ones is
$$\sum_{i=0}^{3} \binom{100}{i} (0.005)^i (0.995)^{100-i} = 0.60577 + 0.30441 + 0.07572 + 0.01243 = 0.99833.$$
Thus the probability that the sequence that is generated cannot be encoded is $1 - 0.99833 = 0.00167$.

(c) In the case of a random variable $S_n$ that is the sum of $n$ i.i.d. random variables $X_1, X_2, \ldots, X_n$, Chebyshev's inequality states that
$$\Pr\{|S_n - n\mu| \ge \epsilon\} \le \frac{n\sigma^2}{\epsilon^2},$$
where $\mu$ and $\sigma^2$ are the mean and variance of $X_i$. (Therefore $n\mu$ and $n\sigma^2$ are the mean and variance of $S_n$.) In this problem, $n = 100$, $\mu = 0.005$, and $\sigma^2 = (0.005)(0.995)$. Note that $S_{100} \ge 4$ if and only if $|S_{100} - 100(0.005)| \ge 3.5$, so we should choose $\epsilon = 3.5$. Then
$$\Pr(S_{100} \ge 4) \le \frac{100 (0.005)(0.995)}{(3.5)^2} \approx 0.04061.$$
This bound is much larger than the actual probability 0.00167.

4.3. Shuffles increase entropy. Argue that for any distribution on shuffles $T$ and any distribution on card positions $X$,
$$H(TX) \ge H(TX \mid T) \qquad (4.11)$$
$$= H(T^{-1} T X \mid T) \qquad (4.12)$$
$$= H(X \mid T) \qquad (4.13)$$
$$= H(X), \qquad (4.14)$$
if $X$ and $T$ are independent.

Solution: Shuffles increase entropy.
$$H(TX) \ge H(TX \mid T) \qquad (4.15)$$
$$= H(T^{-1} T X \mid T) \qquad (4.16)$$
$$= H(X \mid T) \qquad (4.17)$$
$$= H(X). \qquad (4.18)$$
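The arithmetic in parts (a)–(c) of the source-coding problem can be reproduced directly. A small sketch (the helper name `source_coding_numbers` is just for this illustration):

```python
import math

def source_coding_numbers(n=100, p=0.005, kmax=3):
    """Reproduce the counts and probabilities of the AEP source-coding
    problem: codewords are assigned only to n-bit sequences with at
    most kmax ones."""
    # (a) Number of sequences that receive a codeword, and the fixed
    # codeword length needed to index them all.
    n_encoded = sum(math.comb(n, k) for k in range(kmax + 1))
    length = math.ceil(math.log2(n_encoded))
    # (b) Probability that the emitted sequence has a codeword.
    p_encoded = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
                    for k in range(kmax + 1))
    # (c) Chebyshev bound on Pr{S_n >= kmax + 1}: the event is
    # |S_n - n*p| >= (kmax + 1) - n*p, so eps = 3.5 here.
    var = n * p * (1 - p)
    eps = (kmax + 1) - n * p
    bound = var / eps**2
    return n_encoded, length, p_encoded, bound
```

Running it confirms the numbers in the solution: 166751 encodable sequences, 18-bit codewords, probability 0.99833 of a codeword existing, and a Chebyshev bound of about 0.04061.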
The inequality follows from the fact that conditioning reduces entropy, and the first equality follows from the fact that given $T$, we can reverse the shuffle.

4.6. Monotonicity of entropy per element. For a stationary stochastic process $X_1, X_2, \ldots, X_n$, show that

(a)
$$\frac{H(X_1, X_2, \ldots, X_n)}{n} \le \frac{H(X_1, X_2, \ldots, X_{n-1})}{n-1}. \qquad (4.51)$$

(b)
$$\frac{H(X_1, X_2, \ldots, X_n)}{n} \ge H(X_n \mid X_{n-1}, \ldots, X_1). \qquad (4.52)$$

Solution: Monotonicity of entropy per element.

(a) By the chain rule for entropy,
$$\frac{H(X_1, X_2, \ldots, X_n)}{n} = \frac{\sum_{i=1}^n H(X_i \mid X^{i-1})}{n} \qquad (4.53)$$
$$= \frac{H(X_n \mid X^{n-1}) + \sum_{i=1}^{n-1} H(X_i \mid X^{i-1})}{n} \qquad (4.54)$$
$$= \frac{H(X_n \mid X^{n-1}) + H(X_1, X_2, \ldots, X_{n-1})}{n}. \qquad (4.55)$$
From stationarity it follows that for all $1 \le i \le n$,
$$H(X_n \mid X^{n-1}) \le H(X_i \mid X^{i-1}),$$
which further implies, by averaging both sides, that
$$H(X_n \mid X^{n-1}) \le \frac{\sum_{i=1}^{n-1} H(X_i \mid X^{i-1})}{n-1} \qquad (4.56)$$
$$= \frac{H(X_1, X_2, \ldots, X_{n-1})}{n-1}. \qquad (4.57)$$
Combining (4.55) and (4.57) yields
$$\frac{H(X_1, X_2, \ldots, X_n)}{n} \le \frac{1}{n} \left[ \frac{H(X_1, X_2, \ldots, X_{n-1})}{n-1} + H(X_1, X_2, \ldots, X_{n-1}) \right] = \frac{H(X_1, X_2, \ldots, X_{n-1})}{n-1}. \qquad (4.58)$$

(b) By stationarity we have for all $1 \le i \le n$,
$$H(X_n \mid X^{n-1}) \le H(X_i \mid X^{i-1}),$$
which implies that
$$H(X_n \mid X^{n-1}) = \frac{\sum_{i=1}^n H(X_n \mid X^{n-1})}{n} \qquad (4.59)$$
$$\le \frac{\sum_{i=1}^n H(X_i \mid X^{i-1})}{n} \qquad (4.60)$$
$$= \frac{H(X_1, X_2, \ldots, X_n)}{n}. \qquad (4.61)$$

4.7. Entropy rates of Markov chains.

(a) Find the entropy rate of the two-state Markov chain with transition matrix
$$P = \begin{bmatrix} 1 - p_{01} & p_{01} \\ p_{10} & 1 - p_{10} \end{bmatrix}.$$

(b) What values of $p_{01}, p_{10}$ maximize the rate of part (a)?

(c) Find the entropy rate of the two-state Markov chain with transition matrix
$$P = \begin{bmatrix} 1 - p & p \\ 1 & 0 \end{bmatrix}.$$

(d) Find the maximum value of the entropy rate of the Markov chain of part (c). We expect that the maximizing value of $p$ should be less than $1/2$, since the 0 state permits more information to be generated than the 1 state.

(e) Let $N(t)$ be the number of allowable state sequences of length $t$ for the Markov chain of part (c). Find $N(t)$ and calculate
$$H_0 = \lim_{t \to \infty} \frac{1}{t} \log N(t).$$
Hint: Find a linear recurrence that expresses $N(t)$ in terms of $N(t-1)$ and $N(t-2)$. Why is $H_0$ an upper bound on the entropy rate of the Markov chain?
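For a stationary first-order Markov chain the chain rule collapses to $H(X_1, \ldots, X_n) = H(X_1) + (n-1) H(X_2 \mid X_1)$, so both claims of the monotonicity problem can be checked numerically for a concrete chain. A sketch (the two-state chain and its parameters are an arbitrary choice):

```python
import math

def H2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def per_element_entropies(p01, p10, nmax=20):
    """Return H(X1,...,Xn)/n for n = 1..nmax for a stationary two-state
    Markov chain, plus the entropy rate H(X2|X1). For a first-order
    chain, H(X1,...,Xn) = H(X1) + (n-1) * H(X2|X1)."""
    mu0 = p10 / (p01 + p10)                   # stationary Pr{X = 0}
    h1 = H2(mu0)                              # marginal entropy H(X1)
    hc = mu0 * H2(p01) + (1 - mu0) * H2(p10)  # conditional entropy H(X2|X1)
    return [(h1 + (n - 1) * hc) / n for n in range(1, nmax + 1)], hc
```

The returned list is nonincreasing (part (a)), and every entry dominates the conditional entropy $H(X_n \mid X^{n-1}) = H(X_2 \mid X_1)$ (part (b), using the Markov property).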
Compare $H_0$ with the maximum entropy found in part (d).

Solution: Entropy rates of Markov chains.

(a) The stationary distribution is easily calculated. (See EIT pp. 62-63.)
$$\mu_0 = \frac{p_{10}}{p_{01} + p_{10}}, \qquad \mu_1 = \frac{p_{01}}{p_{01} + p_{10}}.$$
Therefore the entropy rate is
$$H(X_2 \mid X_1) = \mu_0 H(p_{01}) + \mu_1 H(p_{10}) = \frac{p_{10} H(p_{01}) + p_{01} H(p_{10})}{p_{01} + p_{10}}.$$

(b) The entropy rate is at most 1 bit because the process has only two states. This rate can be achieved if (and only if) $p_{01} = p_{10} = 1/2$, in which case the process is actually i.i.d. with $\Pr(X_i = 0) = \Pr(X_i = 1) = 1/2$.

(c) As a special case of the general two-state Markov chain, the entropy rate is
$$H(X_2 \mid X_1) = \mu_0 H(p) + \mu_1 H(1) = \frac{H(p)}{p + 1}.$$

(d) By straightforward calculus, we find that the maximum value of $H(\mathcal{X})$ of part (c) occurs for $p = (3 - \sqrt{5})/2 = 0.382$. The maximum value is
$$H(p) = H(1 - p) = H\left( \frac{\sqrt{5} - 1}{2} \right) = 0.694 \text{ bits}.$$
Note that $(\sqrt{5} - 1)/2 = 0.618$ is (the reciprocal of) the Golden Ratio.

(e) The Markov chain of part (c) forbids consecutive ones. Consider any allowable sequence of symbols of length $t$. If the first symbol is 1, then the next symbol must be 0; the remaining $N(t-2)$ symbols can form any allowable sequence. If the first symbol is 0, then the remaining $N(t-1)$ symbols can be any allowable sequence. So the number of allowable sequences of length $t$ satisfies the recurrence
$$N(t) = N(t-1) + N(t-2), \qquad N(1) = 2, \quad N(2) = 3.$$
(The initial conditions are obtained by observing that for $t = 2$ only the sequence 11 is not allowed. We could also choose $N(0) = 1$ as an initial condition, since there is exactly one allowable sequence of length 0, namely, the empty sequence.) The sequence $N(t)$ grows exponentially, that is, $N(t) \approx c \lambda^t$, where $\lambda$ is the maximum magnitude solution of the characteristic equation
$$1 = z^{-1} + z^{-2}.$$
Solving the characteristic equation yields $\lambda = (1 + \sqrt{5})/2$, the Golden Ratio. (The sequence $\{N(t)\}$ is the sequence of Fibonacci numbers.) Therefore
$$H_0 = \lim_{t \to \infty} \frac{1}{t} \log N(t) = \log \frac{1 + \sqrt{5}}{2} = 0.694 \text{ bits}.$$
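Parts (d) and (e) can be checked numerically: a grid search over $H(p)/(p+1)$ should land near $p^* = (3 - \sqrt{5})/2$, and $\frac{1}{t} \log_2 N(t)$ for the Fibonacci-like recurrence should approach $\log_2$ of the Golden Ratio. A sketch (the grid step and the $t = 200$ horizon are arbitrary choices):

```python
import math

def H2(p):
    """Binary entropy in bits (0 < p < 1)."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def rate(p):
    """Entropy rate H(p)/(p+1) of the no-consecutive-ones chain."""
    return H2(p) / (p + 1)

# Part (d): grid search for the maximizing p.
grid = [k / 100000 for k in range(1, 100000)]
p_star = max(grid, key=rate)

# Part (e): N(t) = N(t-1) + N(t-2), N(1) = 2, N(2) = 3 (Fibonacci numbers).
def N(t):
    a, b = 2, 3
    for _ in range(t - 1):
        a, b = b, a + b
    return a

phi = (1 + math.sqrt(5)) / 2  # Golden Ratio
```

The search recovers $p^* \approx 0.382$ with maximum rate $\log_2 \varphi \approx 0.694$ bits, and the combinatorial growth rate $\frac{1}{t}\log_2 N(t)$ approaches the same value, matching the claim that the bound $H_0$ is achieved.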
Since there are only $N(t)$ possible outcomes for $X_1, \ldots, X_t$, an upper bound on $H(X_1, \ldots, X_t)$ is $\log N(t)$, and so the entropy rate of the Markov chain of part (c) is at most $H_0$. In fact, we saw in part (d) that this upper bound can be achieved.

4.15. Entropy rate. Let $\{X_i\}$ be a discrete stationary stochastic process with entropy rate $H(\mathcal{X})$. Show that
$$\frac{1}{n} H(X_n, \ldots, X_1 \mid X_0, X_{-1}, \ldots, X_{-k}) \to H(\mathcal{X}) \qquad (4.89)$$
for $k = 1, 2, \ldots$.

Solution: Entropy rate of a stationary process. By the Cesàro mean theorem, the running average of the terms tends to the same limit as the limit of the terms. Hence
$$\frac{1}{n} H(X_1, X_2, \ldots, X_n \mid X_0, X_{-1}, \ldots, X_{-k}) = \frac{1}{n} \sum_{i=1}^n H(X_i \mid X_{i-1}, X_{i-2}, \ldots, X_{-k}) \qquad (4.90)$$
$$\to \lim_n H(X_n \mid X_{n-1}, X_{n-2}, \ldots, X_{-k}) \qquad (4.91)$$
$$= H(\mathcal{X}), \qquad (4.92)$$
the entropy rate of the process.
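The Cesàro mean theorem invoked here is purely a statement about sequences: if $a_n \to a$, then the running averages $\frac{1}{n}\sum_{i=1}^n a_i \to a$ as well. A tiny numeric illustration, with an arbitrary stand-in sequence $a_i = H + 1/i$ playing the role of the conditional entropies converging to the entropy rate $H$:

```python
# Cesaro mean check: terms a_i = H + 1/i converge to H, and so does
# their running average. H = 0.694 is an arbitrary stand-in value.
H = 0.694
n = 100000
terms = [H + 1.0 / i for i in range(1, n + 1)]  # a_i -> H as i grows
cesaro = sum(terms) / n                          # running average at n
```

Both `terms[-1]` and `cesaro` sit close to `H`; the average converges more slowly (its error is of order $\log n / n$), which is exactly why the theorem, rather than a direct limit, is needed in the proof.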