17 Undirected Graphical Models
17.1 Introduction
A graph consists of a set of vertices (nodes), along with a set of edges joining some pairs of the vertices. In graphical models, each vertex represents a random variable, and the graph gives a visual way of understanding the joint distribution of the entire set of random variables. They can be useful for either unsupervised or supervised learning. In an undirected graph, the edges have no directional arrows. We restrict our discussion to undirected graphical models, also known as Markov random fields or Markov networks. In these graphs, the absence of an edge between two vertices has a special meaning: the corresponding random variables are conditionally independent, given the other variables.
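The missing-edge property can be checked numerically on a small case. The following is a minimal sketch, assuming a Gaussian model on a three-vertex chain X1 - X2 - X3 with no edge between X1 and X3. The matrix `theta` is a hypothetical inverse covariance chosen for illustration (it is not taken from any dataset in this chapter), and `invert` is a helper written for this example. The code computes the covariance of X1 and X3 both marginally and after conditioning on X2.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with exact fractions."""
    n = len(matrix)
    # Augment the matrix with the identity.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap a row with a nonzero pivot into place, then normalize it.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical inverse covariance for the chain X1 - X2 - X3.
# The zero in position (1, 3) corresponds to the missing X1 - X3 edge.
theta = [[2, -1, 0],
         [-1, 2, -1],
         [0, -1, 2]]
sigma = invert(theta)  # covariance matrix

# Marginal covariance of X1 and X3: nonzero, so they are dependent.
marginal_cov_13 = sigma[0][2]

# Covariance of X1 and X3 given X2 (Schur complement of the X2 block).
conditional_cov_13 = sigma[0][2] - sigma[0][1] * sigma[1][2] / sigma[1][1]

print(marginal_cov_13)     # 1/4
print(conditional_cov_13)  # 0
```

In this example X1 and X3 are correlated through X2, yet their covariance vanishes once X2 is held fixed, which for Gaussian variables is conditional independence.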
Figure 17.1 shows an example of a graphical model for a flow-cytometry dataset with p = 11 proteins measured on N = 7466 cells, from Sachs et al. (2003). Each vertex in the graph corresponds to the real-valued expression level of a protein. The network structure was estimated assuming a multivariate Gaussian distribution, using the graphical lasso procedure discussed later in this chapter.
Sparse graphs have a relatively small number of edges, and are convenient for interpretation. They are useful in a variety of domains, including genomics and proteomics, where they provide rough models of cell pathways. Much work has been done in defining and understanding the structure of graphical models; see the Bibliographic Notes for references.
T. Hastie et al., The Elements of Statistical Learning, Second Edition, DOI: 10.1007/b94608_17, © Springer Science+Business Media, LLC 2009
[Figure 17.1: undirected graph on the vertices Raf, Mek, Plcg, PIP2, PIP3, Erk, Akt, PKA, PKC, P38, Jnk]
FIGURE 17.1. Example of a sparse undirected graph, estimated from a flow-cytometry dataset, with p = 11 proteins measured on N = 7466 cells. The network structure was estimated using the graphical lasso procedure discussed in this chapter.
As we will see, the edges in a graph are parametrized by values or potentials that encode the strength of the conditional dependence between the random variables at the corresponding vertices. The main challenges in working with graphical models are model selection (choosing the structure of the graph), estimation of the edge parameters from data, and computation of marginal vertex probabilities and expectations, from their joint distribution. The last two tasks are sometimes called learning and inference in the computer science literature.
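As a rough illustration of edge potentials and of inference, the sketch below builds a joint distribution over three binary variables on the chain X1 - X2 - X3. It assumes, as is standard for pairwise Markov graphs, that the joint probability is proportional to the product of the edge potentials. The potential functions `psi_12` and `psi_23` and their values are hypothetical, and the brute-force enumeration is only practical for very small graphs.

```python
from itertools import product

# Hypothetical edge potentials: larger values favor agreement across the edge.
def psi_12(x1, x2):
    return 3.0 if x1 == x2 else 1.0

def psi_23(x2, x3):
    return 2.0 if x2 == x3 else 1.0

# Enumerate all 2^3 configurations and normalize the product of potentials.
states = list(product([0, 1], repeat=3))
unnormalized = {s: psi_12(s[0], s[1]) * psi_23(s[1], s[2]) for s in states}
Z = sum(unnormalized.values())  # normalizing constant
joint = {s: w / Z for s, w in unnormalized.items()}

def marginal(fixed):
    """Probability that the variables at the given indices take the given values."""
    return sum(p for s, p in joint.items()
               if all(s[i] == v for i, v in fixed.items()))

# Inference: a marginal vertex probability, P(X1 = 1).
p_x1 = marginal({0: 1})

# X1 and X3 share no edge, so P(x1, x3 | x2) should factor as
# P(x1 | x2) * P(x3 | x2) for every configuration.
max_gap = 0.0
for x1, x2, x3 in states:
    p2 = marginal({1: x2})
    lhs = marginal({0: x1, 1: x2, 2: x3}) / p2
    rhs = (marginal({0: x1, 1: x2}) / p2) * (marginal({1: x2, 2: x3}) / p2)
    max_gap = max(max_gap, abs(lhs - rhs))

# Marginally, X1 and X3 are still dependent: P(X1=1, X3=1) differs from 1/4.
p_x1_x3 = marginal({0: 1, 2: 1})
```

Here `max_gap` is zero up to floating-point error, while `p_x1_x3` equals 7/24 rather than the 1/4 that independence would give. Learning, by contrast, would mean estimating the potential values from observed data, which this sketch does not attempt.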
We do not attempt a comprehensive treatment of this interesting area. Instead, we introduce some basic concepts, and then discuss a few simple methods for estimation of the parameters and structure of undirected graphical models; methods that relate to the techniques already discussed in this book. The estimation approaches that we present for continuous and discrete-valued vertices are different, so we treat them separately. Sections 17.3.1 and 17.3.2 may be of particular interest, as they describe new, regression-based procedures for estimating graphical models.