EE 376A Information Theory
Prof. T. Weissman
Thursday, January 21, 2010

Homework Set #2 (Due: Thursday, January 28, 2010)

1. Prove that

(a) Data processing decreases entropy: if Y = f(X), then H(Y) ≤ H(X). [Hint: expand H(f(X), X) in two different ways.]

(b) Data processing on side information increases entropy: if Y = f(X), then H(Z | X) ≤ H(Z | Y).

(c) Assume Y and Z are conditionally independent given X, denoted Y − X − Z. In other words, P{Y = y | X = x, Z = z} = P{Y = y | X = x} for all x ∈ 𝒳, y ∈ 𝒴, z ∈ 𝒵. Prove that H(Z | X) ≤ H(Z | Y).

Solution:

(a) Expand H(X, f(X)) in two ways:

    H(X, f(X)) = H(X) + H(f(X) | X) = H(X)
    H(X, f(X)) = H(f(X)) + H(X | f(X))

Here H(f(X) | X) = 0, since for any particular value of X, f(X) is fixed; hence H(f(X) | X) = ∑_x p(x) H(f(X) | X = x) = ∑_x 0 = 0. Since H(X | f(X)) ≥ 0, we must have H(f(X)) ≤ H(X).

(b) First, let us show that conditioning reduces conditional entropy:

    0 ≤ I(X1; X2 | X3) = H(X1 | X3) − H(X1 | X2, X3),

so H(X1 | X3) ≥ H(X1 | X2, X3). For part (b),

    H(Z | X) = H(Z | X, Y) + I(Z; Y | X)
             = H(Z | X, Y)          (i)
             ≤ H(Z | Y)             (ii)

where (i) follows because Y = f(X), so I(Z; Y | X) = 0, and (ii) follows because conditioning reduces conditional entropy.

(c) Y − X − Z implies p(y, z | x) = p(y | x) p(z | x), so

    I(Z; Y | X) = ∑_{x,y,z} p(x, y, z) log [ p(y, z | x) / ( p(y | x) p(z | x) ) ] = 0.

Now we can follow the steps in (b) to complete the proof.

2. Entropy of a disjoint mixture. Let X1 and X2 be discrete random variables drawn according to probability mass functions p1(·) and p2(·) over the respective alphabets 𝒳1 = {1, 2, . . . , m} and 𝒳2 = {m + 1, . . . , n}. Let Θ be the result of a biased coin flip, i.e., P{Θ = 1} = α and P{Θ = 0} = 1 − α, with X1, X2 and Θ mutually independent. Define

    X = { X1, if Θ = 1,
        { X2, if Θ = 0.
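As a numerical sanity check on the inequalities proved in Problem 1 (a sketch, not part of the original solution), the short Python script below builds a random pmf for X, applies a non-injective deterministic map f, and confirms H(f(X)) ≤ H(X). The alphabet sizes, the particular map f, and the `entropy` helper are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    """Shannon entropy (bits) of a probability vector, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Random pmf for X over a 6-symbol alphabet (arbitrary choice).
p_x = rng.dirichlet(np.ones(6))

# A non-injective f: merges pairs of symbols, so Y = f(X) has 3 symbols.
f = np.array([0, 0, 1, 1, 2, 2])

# pmf of Y = f(X): sum the probabilities of symbols mapped to the same value.
p_y = np.zeros(3)
for x, px in enumerate(p_x):
    p_y[f[x]] += px

# Part (a): H(f(X)) <= H(X), up to floating-point tolerance.
assert entropy(p_y) <= entropy(p_x) + 1e-9
print(f"H(X) = {entropy(p_x):.4f} bits, H(f(X)) = {entropy(p_y):.4f} bits")
```

Merging symbols can only destroy information, which is exactly why the entropy of f(X) never exceeds that of X.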
(a) Find H(X) in terms of H(X1), H(X2), and α.

(b) Maximize over α to show that 2^{H(X)} ≤ 2^{H(X1)} + 2^{H(X2)}, and interpret this using the notion that 2^{H(X)} is the effective alphabet size.

Solution:

(a) We could do this problem by writing down the definition of entropy and expanding the various terms. Instead, we will use the algebra of entropies for a simpler proof. Since X1 and X2 have disjoint support sets, Θ is a function of X:

    Θ = f(X) = { 1 when X ∈ 𝒳1,
               { 0 when X ∈ 𝒳2.

Then we have

    H(X) = H(X, f(X))
         = H(Θ) + H(X | Θ)
         = H(Θ) + P{Θ = 1} H(X | Θ = 1) + P{Θ = 0} H(X | Θ = 0)
         = H(α) + α H(X1) + (1 − α) H(X2),

where H(α) = −α log α − (1 − α) log(1 − α) is the binary entropy function. ...
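The closed form from part (a), and the bound stated in part (b), can both be checked numerically. The sketch below uses made-up pmfs p1 and p2 on disjoint alphabets and an assumed `entropy` helper; none of the specific numbers come from the original problem.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a probability vector, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical pmfs on disjoint alphabets {1,2,3} and {4,5}.
p1 = np.array([0.5, 0.25, 0.25])   # pmf of X1
p2 = np.array([0.6, 0.4])          # pmf of X2
alpha = 0.3

# pmf of the mixture X on {1,...,5}: scale each branch by its probability.
p_x = np.concatenate([alpha * p1, (1 - alpha) * p2])

# Part (a): H(X) = H(alpha) + alpha*H(X1) + (1-alpha)*H(X2)
h_alpha = -alpha * np.log2(alpha) - (1 - alpha) * np.log2(1 - alpha)
lhs = entropy(p_x)
rhs = h_alpha + alpha * entropy(p1) + (1 - alpha) * entropy(p2)
assert abs(lhs - rhs) < 1e-9

# Part (b): 2**H(X) <= 2**H(X1) + 2**H(X2) should hold for every alpha.
for a in np.linspace(0.01, 0.99, 50):
    px = np.concatenate([a * p1, (1 - a) * p2])
    assert 2 ** entropy(px) <= 2 ** entropy(p1) + 2 ** entropy(p2) + 1e-9

print(f"H(X) = {lhs:.4f} bits; decomposition holds for all tested alpha")
```

Equality in part (b) is approached when α is tuned to its maximizing value, matching the "effective alphabet size" interpretation: the mixture's effective alphabet can be at most the sum of the two branches' effective alphabets.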