Harvard SEAS ES250 Information Theory

Entropy, relative entropy, and mutual information *

1 Entropy

1.1 Entropy of a random variable

Definition. The entropy of a discrete random variable $X$ with pmf $p_X(x)$ is

$$H(X) = -\sum_{x} p(x) \log p(x)$$

The entropy measures the expected uncertainty in $X$. It has the following properties:

- $H(X) \ge 0$: entropy is always non-negative.
- $H(X) = 0$ iff $X$ is deterministic.
- Since $H_b(X) = \log_b(a)\, H_a(X)$, changing the base of the logarithm only rescales the entropy, so we don't need to specify the base.

1.2 Joint entropy and conditional entropy

Definition. The joint entropy of two random variables $X$ and $Y$ is

$$H(X, Y) \triangleq -E_{p(x,y)}[\log p(X, Y)] = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log p(x, y)$$

Definition. Given a random variable $X$, the conditional entropy of $Y$ (averaged over $X$) is

$$H(Y \mid X) \triangleq E_{p(x)}[H(Y \mid X = x)] = \sum_{x \in \mathcal{X}} p(x)\, H(Y \mid X = x) = -E_{p(x)} E_{p(y \mid x)}[\log p(Y \mid X)] = -E_{p(x,y)}[\log p(Y \mid X)]$$

Note: in general, $H(X \mid Y) \neq H(Y \mid X)$.

1.3 Chain rule

Joint and conditional entropy provide a natural calculus:

Theorem (Chain rule)

$$H(X, Y) = H(X) + H(Y \mid X)$$

Corollary

$$H(X, Y \mid Z) = H(X \mid Z) + H(Y \mid X, Z)$$

* Based on Cover & Thomas, Chapter 2

2 Relative Entropy and Mutual Information

2.1 Entropy and Mutual Information

Entropy $H(X)$ is the uncertainty (self-information) of a single random variable.

Conditional entropy $H(X \mid Y)$ is the entropy of one random variable conditional upon knowledge of another.
We call the reduction in uncertainty mutual information:

$$I(X; Y) = H(X) - H(X \mid Y)$$

Eventually we will show that the maximum rate of transmission over a given channel $p(Y \mid X)$, such that the error probability goes to zero, is given by the channel capacity:

$$C = \max_{p(X)} I(X; Y)$$

Theorem (Relationship between mutual information and entropy)

$$I(X; Y) = H(X) - H(X \mid Y)$$

$$I(X; Y) = H(Y) - H(Y \mid X)$$

$$I(X; Y) = H(X) + H(Y) - H(X, Y)$$

I(X; ...
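
The definitions above (entropy, joint and conditional entropy, the chain rule, and the first three mutual-information identities) can be checked numerically. The following is a minimal sketch in Python, not part of the original notes: the joint pmf `joint` is a made-up example, and the helper name `entropy` is an assumption chosen for illustration. Logarithms are taken base 2, so all quantities are in bits.

```python
import math

def entropy(pmf, base=2.0):
    """H = -sum p log p over a list of probabilities, with 0 log 0 taken as 0."""
    return -sum(p * math.log(p, base) for p in pmf if p > 0)

# Hypothetical joint pmf p(x, y): rows index x, columns index y.
joint = [[0.5, 0.25],
         [0.0, 0.25]]

p_x = [sum(row) for row in joint]        # marginal p(x)
p_y = [sum(col) for col in zip(*joint)]  # marginal p(y)

h_x = entropy(p_x)
h_y = entropy(p_y)
h_xy = entropy([p for row in joint for p in row])  # joint entropy H(X, Y)

# H(Y|X) = sum_x p(x) H(Y | X = x), straight from the definition.
h_y_given_x = sum(
    px * entropy([p / px for p in row])
    for px, row in zip(p_x, joint) if px > 0
)

# H(X|Y) = sum_y p(y) H(X | Y = y), using the columns of the joint pmf.
h_x_given_y = sum(
    py * entropy([p / py for p in col])
    for py, col in zip(p_y, zip(*joint)) if py > 0
)

# Three expressions for I(X; Y) from the theorem.
mi_from_x = h_x - h_x_given_y
mi_from_y = h_y - h_y_given_x
mi_from_joint = h_x + h_y - h_xy

print("H(X), H(Y), H(X,Y):", h_x, h_y, h_xy)
print("H(Y|X), H(X|Y):", h_y_given_x, h_x_given_y)
print("I(X;Y):", mi_from_x, mi_from_y, mi_from_joint)
```

For this particular pmf, $H(X, Y) = 1.5$ bits equals $H(X) + H(Y \mid X)$, as the chain rule requires, and the three expressions for $I(X; Y)$ agree at roughly 0.311 bits. Note also that $H(X \mid Y) = 0.5$ differs from $H(Y \mid X) \approx 0.689$, illustrating that conditional entropy is not symmetric.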