
# information - Introduction to Information Theory...


## 1 Introduction to Information Theory

Information must not be confused with meaning: “The semantic aspects of communications are irrelevant to the engineering aspects” [Sh48]. Information is a measure of one’s freedom of choice and is measured by the logarithm of the number of choices. Tossing a coin gives two choices. If the logarithm is taken to base 2, the unit of information is called a “bit”. Each doubling of the number of choices adds one bit of information; thus 4, 8, and 16 choices correspond to 2, 3, and 4 bits of information, respectively. In general, if you have $N$ choices, the information content of the situation is $\lceil \log_2 N \rceil$, which is the number of binary digits needed to label each of the $N$ choices.

The same situation can be captured using probability. The events are assumed to be independent and equally probable, so the probability of the $i$-th event ($1 \le i \le N$) is $p_i = 1/N$. The amount of information associated with the occurrence of this event, called its self-information, is $-\log_2 p_i$. If $p_i = 1$ the information is zero (certainty), and if $p_i = 0$ it is infinite; if $p_i = 0.5$ it is one bit, corresponding to $N = 2$. If $N = 4$, then $p_i = 0.25$ and the information is 2 bits, and so on.

Note that in the case of tossing a coin there are two possible events: head or tail. If you consider the tossing of the coin to be an “experiment”, the question is how much total information this experiment carries. This can be quantified if we can describe the outcome of the experiment in some reasonable fashion. Let us “encode” the outcome ‘head’ as the bit 1 and the outcome ‘tail’ as the bit 0. A minimal description of this experiment then needs only one bit. Note that the experiment is the sum total of all the events. If we take the self-information of each event, multiply it by the probability of that event, and sum over all the events, the result intuitively measures the information content, or average information, of the experiment.
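The self-information values quoted above can be checked with a short Python sketch. The function names `self_information` and `bits_to_label` are illustrative choices, not names from the notes, and the sketch assumes probabilities strictly greater than zero so that the logarithm is finite.

```python
import math


def self_information(p: float) -> float:
    """Self-information, in bits, of an event with probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    # log2(1/p) equals -log2(p); this form avoids returning -0.0 when p == 1.
    return math.log2(1.0 / p)


def bits_to_label(n: int) -> int:
    """Binary digits needed to label one of n choices, i.e. ceil(log2(n))."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # (n - 1).bit_length() is ceil(log2(n)) computed in exact integer arithmetic.
    return (n - 1).bit_length()


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        # With n equally probable events, each has probability 1/n.
        print(n, bits_to_label(n), self_information(1.0 / n))
```

For 2, 4, 8, and 16 equally probable choices, both functions give 1, 2, 3, and 4 bits. The two quantities differ only when the number of choices is not a power of two, since `bits_to_label` rounds up to a whole number of binary digits.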
For the coin toss, the average information of the experiment also comes out to exactly one bit, since the probability of either head or tail is 0.5 and the self-information of each event is 1 bit. This 1 bit also expresses how uncertain we are of the outcome.

How do we generalize the definition? Suppose we have a set of $N$ events whose probabilities of occurrence are $p_1, p_2, \ldots, p_N$. Can we measure how much “choice” is involved, or how uncertain we are of the outcome? Such a measure is precisely the entropy of the experiment or “source”, denoted $H(p_1, p_2, \ldots, p_N)$. [More precisely, it is called the first order entropy. Higher order entropies depend on contextual information. The true entropy is the infinite order entropy. By popular use, however, entropy most often refers to first order entropy unless stated otherwise. Read the discussion in Sayood, pp. 14-16.]
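As a minimal sketch of the first order entropy, the average described in the text (self-information of each event, weighted by its probability and summed) can be written as $H(p_1, \ldots, p_N) = -\sum_{i=1}^{N} p_i \log_2 p_i$. The function name `entropy` is an illustrative choice. The sketch also assumes the usual convention that an event with probability zero contributes nothing to the sum, which the notes do not state explicitly.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """First order entropy in bits: the sum of p_i * log2(1 / p_i)."""
    if any(p < 0.0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Assumed convention: terms with p == 0 are skipped (treated as 0).
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0.0)


if __name__ == "__main__":
    print(entropy([0.5, 0.5]))    # fair coin: 1.0 bit
    print(entropy([0.25] * 4))    # four equally probable events: 2.0 bits
    print(entropy([1.0]))         # a certain outcome: 0.0 bits
    print(entropy([0.9, 0.1]))    # biased coin: about 0.469 bits
```

The fair coin reproduces the one bit found above, while the biased coin gives less than one bit, which matches the reading of entropy as uncertainty about the outcome.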


## This note was uploaded on 06/09/2011 for the course CAP 5015 taught by Professor Mukherjee during the Spring '11 term at University of Central Florida.
