EE596A Introduction to Information Theory                University of Washington
Winter 2004                                       Dept. of Electrical Engineering

Handout 5: Problem Set 2: Solutions

Prof: Jeff A. Bilmes <[email protected]>               Lecture 10, Feb 8, 2004

5.1 Problems from Text Book

Do problems 2.22, 2.28, 3.2, 3.3, 3.5, 12.3 in C&T.

Problem 2.22 Bottleneck. Suppose a (non-stationary) Markov chain starts in one of $n$ states, necks down to $k < n$ states, and then fans back to $m > k$ states. Thus $X_1 \to X_2 \to X_3$, with $X_1 \in \{1, 2, \ldots, n\}$, $X_2 \in \{1, 2, \ldots, k\}$, $X_3 \in \{1, 2, \ldots, m\}$.

1. Show that the dependence of $X_1$ and $X_3$ is limited by the bottleneck by proving that $I(X_1; X_3) \le \log k$.

2. Evaluate $I(X_1; X_3)$ for $k = 1$, and conclude that no dependence can survive such a bottleneck.

Solution 2.22 Bottleneck.

1. From the data processing inequality, and the fact that entropy is maximized by the uniform distribution, we get
\[
I(X_1; X_3) \le I(X_1; X_2) = H(X_2) - H(X_2 \mid X_1) \le H(X_2) \le \log k.
\]
Thus the dependence between $X_1$ and $X_3$ is limited by the size of the bottleneck; that is, $I(X_1; X_3) \le \log k$.

2. For $k = 1$, $I(X_1; X_3) \le \log 1 = 0$, and since $I(X_1; X_3) \ge 0$, we have $I(X_1; X_3) = 0$. Thus, for $k = 1$, $X_1$ and $X_3$ are independent.

Problem 2.28 Mixing increases entropy. Show that the entropy of the probability distribution $(p_1, \ldots, p_i, \ldots, p_j, \ldots, p_m)$ is less than the entropy of the distribution $(p_1, \ldots, \frac{p_i + p_j}{2}, \ldots, \frac{p_i + p_j}{2}, \ldots, p_m)$. Show that in general any transfer of probability that makes the distribution more uniform increases the entropy.

Solution 2.28 Mixing increases entropy. This problem depends on the log sum inequality (equivalently, on the convexity of $t \log t$). Let
\[
P_1 = (p_1, \ldots, p_i, \ldots, p_j, \ldots, p_m), \qquad
P_2 = \Bigl(p_1, \ldots, \frac{p_i + p_j}{2}, \ldots, \frac{p_i + p_j}{2}, \ldots, p_m\Bigr).
\]
Then, by the log sum inequality,
\begin{align*}
H(P_2) - H(P_1) &= -2 \Bigl(\frac{p_i + p_j}{2}\Bigr) \log \Bigl(\frac{p_i + p_j}{2}\Bigr) + p_i \log p_i + p_j \log p_j \\
&= -(p_i + p_j) \log \Bigl(\frac{p_i + p_j}{2}\Bigr) + p_i \log p_i + p_j \log p_j \\
&\ge 0.
\end{align*}
Thus, $H(P_2) \ge H(P_1)$.

Problem 3.2 An AEP-like limit. Let $X_1, X_2, \ldots$ be i.i.d. drawn according to probability mass function $p(x)$. Find
\[
\lim_{n \to \infty} \bigl[p(X_1, X_2, \ldots, X_n)\bigr]^{\frac{1}{n}}.
\]

Solution 3.2 An AEP-like limit. $X_1, X_2, \ldots$ are i.i.d. $\sim p(x)$. Hence the $\log p(X_i)$ are also i.i.d., and
\begin{align*}
\lim \bigl(p(X_1, X_2, \ldots, X_n)\bigr)^{\frac{1}{n}}
&= \lim 2^{\frac{1}{n} \log p(X_1, X_2, \ldots, X_n)} \\
&= 2^{\lim \frac{1}{n} \sum \log p(X_i)} \quad \text{a.e.} \\
&= 2^{E[\log p(X)]} \quad \text{a.e.} \\
&= 2^{-H(X)} \quad \text{a.e.}
\end{align*}
by the strong law of large numbers (assuming, of course, that $H(X)$ exists).

Problem 3.3 The AEP and source coding. A discrete memoryless source emits a sequence of statistically independent binary digits with probabilities $p(1) = 0.005$ and $p(0) = 0.995$. The digits are taken 100 at a time and a binary codeword is provided for every sequence of 100 digits containing three or fewer ones. ...
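The bound in Solution 2.22 is easy to check numerically. The sketch below builds a three-stage chain with randomly chosen transition matrices (the sizes $n=8$, $k=3$, $m=6$ and the matrices themselves are arbitrary choices for illustration, not from the handout), computes $I(X_1; X_3)$ in bits, and confirms it never exceeds $\log_2 k$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 8, 3, 6  # hypothetical sizes: n states -> bottleneck k -> m states

def random_stochastic(rows, cols):
    """A random row-stochastic matrix (each row sums to 1)."""
    A = rng.random((rows, cols))
    return A / A.sum(axis=1, keepdims=True)

p1 = rng.random(n); p1 /= p1.sum()   # distribution of X1
A = random_stochastic(n, k)          # P(X2 = x2 | X1 = x1)
B = random_stochastic(k, m)          # P(X3 = x3 | X2 = x2)

# Joint of (X1, X3): p(x1, x3) = p(x1) * sum_{x2} A[x1, x2] B[x2, x3]
joint13 = p1[:, None] * (A @ B)
px1 = joint13.sum(axis=1)
px3 = joint13.sum(axis=0)

# Mutual information I(X1; X3) in bits.
mask = joint13 > 0
I13 = np.sum(joint13[mask]
             * np.log2(joint13[mask] / (px1[:, None] * px3[None, :])[mask]))

print(I13, "<=", np.log2(k))
```

Re-running with different seeds (or different stochastic matrices) keeps $I(X_1;X_3)$ below $\log_2 3 \approx 1.585$ bits, as the data processing inequality guarantees.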
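The mixing argument in Solution 2.28 can likewise be spot-checked: average two entries of a distribution and compare entropies. The particular distribution and indices below are illustrative choices, not from the problem:

```python
import numpy as np

def H(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p = np.array([0.5, 0.3, 0.15, 0.05])  # hypothetical distribution P1
i, j = 1, 3                            # entries to mix

q = p.copy()                           # P2: replace p_i, p_j by their average
q[i] = q[j] = (p[i] + p[j]) / 2

print(H(p), "<=", H(q))
```

Since mixing moves the pair $(p_i, p_j)$ toward uniformity while leaving the rest of the distribution untouched, $H(P_2) \ge H(P_1)$ for any valid choice of indices.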
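The a.e. convergence in Solution 3.2 also shows up clearly in simulation: for a long i.i.d. sequence, the per-symbol geometric mean of the probability approaches $2^{-H(X)}$. The pmf and sample size below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
p = np.array([0.6, 0.3, 0.1])        # hypothetical pmf
Hx = -np.sum(p * np.log2(p))         # entropy H(X) in bits

n = 200_000
xs = rng.choice(len(p), size=n, p=p)  # X_1, ..., X_n i.i.d. ~ p(x)

# (p(X_1,...,X_n))^(1/n) = 2^{(1/n) * sum_i log2 p(X_i)}
geo = 2 ** np.mean(np.log2(p[xs]))

print(geo, "vs", 2 ** (-Hx))
```

By the strong law of large numbers the exponent $\frac{1}{n}\sum_i \log_2 p(X_i)$ concentrates around $E[\log_2 p(X)] = -H(X)$, so the two printed values agree to a few decimal places.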
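The preview cuts off before the solution to Problem 3.3, but the count the problem statement sets up is a direct computation: how many length-100 binary sequences contain three or fewer ones, and how many bits are needed to index them (reserving, as is conventional in this kind of scheme, one extra codeword for all remaining sequences):

```python
from math import comb, ceil, log2

# Number of length-100 binary sequences with three or fewer ones:
# C(100,0) + C(100,1) + C(100,2) + C(100,3)
count = sum(comb(100, t) for t in range(4))
print(count)                    # -> 166751

# Bits needed to give each such sequence (plus one spare codeword)
# a distinct fixed-length binary index.
bits = ceil(log2(count + 1))
print(bits)                     # -> 18
```

This only evaluates the combinatorics implied by the statement; the handout's actual solution (error probability, expected length, etc.) is not in the preview.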