Harvard SEAS ES250 – Information Theory

Entropy, relative entropy, and mutual information*

1 Entropy

1.1 Entropy of a random variable

Definition The entropy of a discrete random variable $X$ with pmf $p_X(x)$ is

$$H(X) = -\sum_{x} p(x) \log p(x)$$

The entropy measures the expected uncertainty in $X$. It has the following properties:

• $H(X) \ge 0$: entropy is always non-negative. $H(X) = 0$ iff $X$ is deterministic.
• Since $H_b(X) = \log_b(a) \, H_a(X)$, changing the base of the logarithm only rescales the entropy by a constant, so we don't need to specify the base.

1.2 Joint entropy and conditional entropy

Definition Joint entropy between two random variables $X$ and $Y$ is

$$H(X, Y) \triangleq -E_{p(x,y)}[\log p(X, Y)] = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log p(x, y)$$

Definition Given a random variable $X$, the conditional entropy of $Y$ (averaged over $X$) is

$$H(Y|X) \triangleq E_{p(x)}[H(Y|X = x)] = \sum_{x \in \mathcal{X}} p(x) \, H(Y|X = x) = -E_{p(x)} E_{p(y|x)}[\log p(Y|X)] = -E_{p(x,y)}[\log p(Y|X)]$$

Note: in general, $H(X|Y) \neq H(Y|X)$.

1.3 Chain rule

Joint and conditional entropy provide a natural calculus:

Theorem (Chain rule)

$$H(X, Y) = H(X) + H(Y|X)$$

Corollary

$$H(X, Y|Z) = H(X|Z) + H(Y|X, Z)$$

* Based on Cover & Thomas, Chapter 2

2 Relative Entropy and Mutual Information

2.1 Entropy and Mutual Information

• Entropy $H(X)$ is the uncertainty ("self-information") of a single random variable.
• Conditional entropy $H(X|Y)$ is the entropy of one random variable conditional upon knowledge of another.
• We call the reduction in uncertainty mutual information:

$$I(X; Y) = H(X) - H(X|Y)$$

• Eventually we will show that the maximum rate of transmission over a given channel $p(Y|X)$, such that the error probability goes to zero, is given by the channel capacity:

$$C = \max_{p(X)} I(X; Y)$$

Theorem (Relationship between mutual information and entropy)

$$I(X; Y) = H(X) - H(X|Y)$$

$$I(X; Y) = H(Y) - H(Y|X)$$

$$I(X; Y) = H(X) + H(Y) - H(X, Y)$$

I(X;...
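The definitions of entropy, conditional entropy, and mutual information given in these notes can be checked numerically on a small example. The sketch below is illustrative only: the joint pmf `joint`, the helper names (`entropy`, `marginal`, `conditional_entropy`), and the choice of base-2 logarithms (bits) are assumptions made for this example and do not come from the notes.

```python
import math


def entropy(pmf):
    """Shannon entropy in bits of an iterable of probabilities (0 log 0 taken as 0)."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def marginal(joint, axis):
    """Marginal pmf of one coordinate (axis 0 for X, axis 1 for Y) of a joint pmf."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def conditional_entropy(joint, given_axis):
    """H(other | given) computed as sum over g of p(g) * H(other | given = g)."""
    total = 0.0
    for g, p_g in marginal(joint, given_axis).items():
        if p_g == 0:
            continue  # outcomes with zero probability contribute nothing
        cond = [p / p_g for key, p in joint.items() if key[given_axis] == g]
        total += p_g * entropy(cond)
    return total


# Hypothetical joint pmf p(x, y) on {0, 1} x {0, 1}, chosen only for illustration.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}

h_x = entropy(marginal(joint, 0).values())
h_y = entropy(marginal(joint, 1).values())
h_xy = entropy(joint.values())
h_y_given_x = conditional_entropy(joint, given_axis=0)
h_x_given_y = conditional_entropy(joint, given_axis=1)
mi = h_x - h_x_given_y  # I(X;Y) = H(X) - H(X|Y)

print("H(X,Y)          =", h_xy)
print("H(X) + H(Y|X)   =", h_x + h_y_given_x)  # chain rule
print("H(X|Y), H(Y|X)  =", h_x_given_y, h_y_given_x)
print("I(X;Y)          =", mi)
print("H(Y) - H(Y|X)   =", h_y - h_y_given_x)
print("H(X)+H(Y)-H(X,Y)=", h_x + h_y - h_xy)
```

For this particular pmf the two conditional entropies differ, and the three expressions for $I(X; Y)$ agree up to floating-point rounding, which matches the chain rule and the relationships stated above.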
This note was uploaded on 12/01/2010 for the course ADLAC 1023 at Stanford.