hw2sol - EE 376A Information Theory, Prof. T. Weissman


EE 376A Information Theory                                    Prof. T. Weissman
Thursday, January 21, 2010

Homework Set #2
(Due: Thursday, January 28, 2010)

1. Prove that

   (a) Data processing decreases entropy: if Y = f(X), then H(Y) ≤ H(X).
       [Hint: expand H(f(X), X) in two different ways.]

   (b) Data processing on side information increases entropy: if Y = f(X), then
       H(Z|X) ≤ H(Z|Y).

   (c) Assume Y and Z are conditionally independent given X, denoted Y − X − Z.
       In other words, P{Y = y | X = x, Z = z} = P{Y = y | X = x} for all
       x ∈ 𝒳, y ∈ 𝒴, z ∈ 𝒵. Prove that H(Z|X) ≤ H(Z|Y).

   Solution:

   (a) Expand H(X, f(X)) in two different ways:

          H(X, f(X)) = H(X) + H(f(X)|X) = H(X),
          H(X, f(X)) = H(f(X)) + H(X|f(X)).

       Here H(f(X)|X) = 0, since for any particular value of X, f(X) is fixed; hence
       H(f(X)|X) = Σ_x p(x) H(f(X)|X = x) = Σ_x 0 = 0. Since H(X|f(X)) ≥ 0, we must
       have H(f(X)) ≤ H(X).

   (b) First we show that conditioning reduces conditional entropy:

          0 ≤ I(X₁; X₂|X₃) = H(X₁|X₃) − H(X₁|X₂, X₃),

       so H(X₁|X₃) ≥ H(X₁|X₂, X₃). For part (b),

          H(Z|X) = H(Z|X, Y) + I(Z; Y|X)
                 = H(Z|X, Y)                  (i)
                 ≤ H(Z|Y),                    (ii)

       where (i) follows because Y = f(X), so I(Z; Y|X) = 0, and (ii) follows because
       conditioning reduces conditional entropy.

   (c) Y − X − Z implies that p(y, z|x) = p(y|x) p(z|x). Therefore

          I(Z; Y|X) = Σ_{x,y,z} p(x, y, z) log [ p(y, z|x) / ( p(y|x) p(z|x) ) ] = 0.

       Now we can follow the steps in (b) to complete the proof.

2. Entropy of a disjoint mixture. Let X₁ and X₂ be discrete random variables drawn
   according to probability mass functions p₁(·) and p₂(·) over the respective alphabets
   𝒳₁ = {1, 2, ..., m} and 𝒳₂ = {m+1, ..., n}. Let Θ be the result of a biased coin flip,
   i.e., P{Θ = 1} = α and P{Θ = 0} = 1 − α, where X₁, X₂ and Θ are mutually independent.
   Define

          X = { X₁,  if Θ = 1,
                X₂,  if Θ = 0.

   (a) Find H(X) in terms of H(X₁), H(X₂) and α.

   (b) Maximize over α to show that 2^{H(X)} ≤ 2^{H(X₁)} + 2^{H(X₂)}, and interpret this
       using the notion that 2^{H(X)} is the effective alphabet size.

   Solution:

   (a) We could do this problem by writing down the definition of entropy and expanding
       the various terms. Instead, we use the algebra of entropies for a simpler proof.
       Since X₁ and X₂ have disjoint support sets, Θ is a function of X:

          Θ = f(X) = { 1  when X ∈ 𝒳₁,
                       0  when X ∈ 𝒳₂.

       Then we have

          H(X) = H(X, f(X))
               = H(Θ) + H(X|Θ)
               = H(Θ) + P(Θ = 1) H(X|Θ = 1) + P(Θ = 0) H(X|Θ = 0)
               = H(α) + α H(X₁) + (1 − α) H(X₂),

       where H(α) = −α log α − (1 − α) log(1 − α). ...
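The inequalities in Problem 1 are easy to sanity-check numerically. The following Python sketch is an added illustration (not part of the original handout): it builds a small joint distribution in which Y = f(X) and Z is drawn from p(z|x), so the chain Y − X − Z holds, and checks that H(f(X)) ≤ H(X) and H(Z|X) ≤ H(Z|Y). The alphabet sizes, the map f, and the random seed are arbitrary choices made only for the example.

```python
# Numerical sanity check of Problem 1 (illustrative; arbitrary alphabets and seed).
import numpy as np

rng = np.random.default_rng(0)

def H(p):
    """Entropy in bits of a pmf given as a 1-D array."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Alphabets: X in {0,1,2,3}, Z in {0,1,2}, and Y = f(X) with f(x) = x mod 2.
px = rng.dirichlet(np.ones(4))                   # random pmf of X
pz_given_x = rng.dirichlet(np.ones(3), size=4)   # rows are p(z | x)
f = np.array([0, 1, 0, 1])                       # deterministic map Y = f(X)

# Joint p(x, z) and the marginals/conditionals needed for the inequalities.
pxz = px[:, None] * pz_given_x                               # p(x, z)
py = np.array([px[f == y].sum() for y in (0, 1)])            # p(y)
pyz = np.array([pxz[f == y].sum(axis=0) for y in (0, 1)])    # p(y, z)

H_X = H(px)
H_Y = H(py)
H_Z_given_X = sum(px[x] * H(pz_given_x[x]) for x in range(4))
H_Z_given_Y = sum(py[y] * H(pyz[y] / py[y]) for y in range(2))

print(f"H(Y) = {H_Y:.4f} <= H(X) = {H_X:.4f}:", H_Y <= H_X + 1e-12)
print(f"H(Z|X) = {H_Z_given_X:.4f} <= H(Z|Y) = {H_Z_given_Y:.4f}:",
      H_Z_given_X <= H_Z_given_Y + 1e-12)
```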
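The mixture identity in Problem 2(a) can be checked the same way. The sketch below is again an added illustration (the particular values of m, n, and α are arbitrary): it forms the pmf of X by placing α·p₁ and (1 − α)·p₂ on disjoint supports and compares H(X) with H(α) + αH(X₁) + (1 − α)H(X₂).

```python
# Numerical check of H(X) = H(alpha) + alpha*H(X1) + (1-alpha)*H(X2) (illustrative values).
import numpy as np

rng = np.random.default_rng(1)

def H(p):
    """Entropy in bits of a pmf given as a 1-D array or list."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

m, n, alpha = 3, 7, 0.3
p1 = rng.dirichlet(np.ones(m))       # pmf of X1 on {1, ..., m}
p2 = rng.dirichlet(np.ones(n - m))   # pmf of X2 on {m+1, ..., n}

# Mixture pmf of X on {1, ..., n}: the two supports are disjoint.
px = np.concatenate([alpha * p1, (1 - alpha) * p2])

lhs = H(px)
rhs = H([alpha, 1 - alpha]) + alpha * H(p1) + (1 - alpha) * H(p2)
print(f"H(X) = {lhs:.6f},  H(a) + a*H(X1) + (1-a)*H(X2) = {rhs:.6f}")
assert abs(lhs - rhs) < 1e-10
```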