Information-Theoretic Strategies for Quantifying Variability and Model-Reality Comparison in the Climate System
J. W. Larson 1,2,3

1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Avenue, Argonne, IL 60439, USA
E-mail: [email protected]
2 Computation Institute, University of Chicago, Chicago, IL, USA
3 Department of Computer Science, The Australian National University, Canberra ACT 0200, Australia
Keywords: Information Theory; Statistics; Climate Data Analysis
Model-reality comparison can be viewed in a communications context. In this analogy, the observed "real" data are the sent message, and the model output is the received message; the model plays the role of a noisy channel over which the message is transmitted (Figure 1). Information theory offers a way to assess literally the "information content" of any system, and thus a means of objectively quantifying model-observational data fidelity.
The Shannon entropy (SE) H(X) is a measure of the amount of uncertainty, variability, or "surprise" present in a system variable X, while the mutual information (MI) I(X, Y) measures the amount of shared information, or redundancy, between two variables X and Y.
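As a minimal illustration (not code from this paper), the SE and MI of discrete samples can be estimated with plug-in frequency counts; the function names below are my own:

```python
from collections import Counter
from math import log2

def shannon_entropy(samples):
    """Plug-in estimate of the Shannon entropy H(X) in bits
    from a sequence of discrete observations."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_information(xs, ys):
    """Mutual information I(X, Y) = H(X) + H(Y) - H(X, Y) in bits,
    estimated from paired discrete samples."""
    return (shannon_entropy(xs) + shannon_entropy(ys)
            - shannon_entropy(list(zip(xs, ys))))
```

For identical sequences the MI equals the SE of either one; for independent sequences it is (up to sampling error) zero.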
Information theory's roots lie in the analysis of data communication across a noisy channel (Figure 1), and it offers a scheme for quantifying how well a message X coming from a transmitter arrives as Y at the receiver. A more general information-theoretic measure of message degradation is the Kullback-Leibler divergence (KLD), which quantifies the lack of agreement between the probability density functions associated with X and Y.
The ratio of MI to SE yields the amount of information shared by two datasets relative to the information content of one alone. Alas, the aforementioned information-theoretic techniques work best for discrete rather than continuous systems. This is because the evaluation of the SE for continuous systems, the differential entropy, does not constitute the continuum limit of the SE. Relative quantities such as the MI and KLD are always valid in the continuum case, and are the continuum limits of their discrete counterparts, but are just that: relative.
This raises the question: is there some continuum surrogate for the SE against which these relative quantities can be benchmarked? Thus one faces a choice when using information theory for model validation and intercomparison: (1) adopt coarse-graining strategies that are physically relevant, always aware that computed SE results are specific to a given discretisation; or (2) treat the data as continuous and use the MI combined with some benchmark quantity. In this paper, I adopt strategy (1) and restrict the scope to a variable that has well-agreed-upon discretisations: total cloud cover, which by observational convention is frequently coarse-grained into oktas, tenths, or percent.
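As a sketch of strategy (1), a cloud-cover fraction can be coarse-grained onto the nine-level okta scale before the SE is computed; mapping by simple rounding is my own illustrative choice, and the helper names are hypothetical:

```python
from collections import Counter
from math import log2

def to_oktas(fraction):
    """Coarse-grain a cloud-cover fraction in [0, 1] onto the
    0-8 okta scale by rounding (an illustrative convention)."""
    return min(8, max(0, round(8 * fraction)))

def entropy_bits(samples):
    """Plug-in Shannon entropy in bits of a discrete sample."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

# Hypothetical cloud-fraction observations, discretised into oktas.
fractions = [0.0, 0.12, 0.5, 0.51, 0.9, 1.0]
oktas = [to_oktas(f) for f in fractions]
```

Any SE computed this way is specific to the okta discretisation; a tenths or percent coarse-graining would generally yield a different value.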
I review basic concepts from information theory.
This note was uploaded on 02/05/2012 for the course EE EE308 taught by Professor B.k.dey during the Spring '09 term at IIT Bombay.