Network Motifs: Simple Building Blocks of Complex Networks
Milo et al., Science, 2002.

Beyond Degree Distribution & Diameter

Network Motifs: Consider all possible ways to connect 3 nodes with directed edges. (Milo et al., Science, 2002)

Finding Overrepresented Subgraphs
For each possible motif M:
    Let cM be the number of times M occurs in graph G.
    Estimate pM = Pr[# occurrences ≥ cM] when edges are shuffled.
    Output M if pM < 0.01 and cM > 4.

To generate a random graph, single and double edges are swapped separately.
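The test above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure from Milo et al.: the graph is a set of directed `(u, v)` edge tuples, the only motif tested is the feed-forward loop, and the names `count_ffl`, `shuffle_edges`, `motif_test` and the choice of 10 swaps per edge are illustrative. For brevity the shuffle does not treat single and double edges separately.

```python
import random

def count_ffl(edges):
    """Count feed-forward loops a->b, b->c, a->c in a set of directed edges."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    count = 0
    for a, outs in succ.items():
        for b in outs:
            for c in succ.get(b, ()):
                if c != a and c in outs:
                    count += 1
    return count

def shuffle_edges(edges, n_swaps, rng):
    """Degree-preserving shuffle: turn a->b, c->d into a->d, c->b."""
    edge_list = sorted(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_set

def motif_test(edges, n_random=200, seed=0):
    """Return (cM, estimated pM) for the feed-forward loop."""
    rng = random.Random(seed)
    c_real = count_ffl(edges)
    hits = sum(
        count_ffl(shuffle_edges(edges, 10 * len(edges), rng)) >= c_real
        for _ in range(n_random)
    )
    return c_real, hits / n_random

# Toy graph (an assumption for illustration): hub 0 feeds every node, i -> i+1 is a chain.
toy = {(0, i) for i in range(1, 8)} | {(i, i + 1) for i in range(1, 7)}
c_m, p_m = motif_test(toy)
print(c_m, p_m, "report motif" if p_m < 0.01 and c_m > 4 else "not significant")
```

Each swap keeps every node's in-degree and out-degree unchanged, so the shuffled graphs share the degree sequence of the original.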
[Figure: an edge swap for the 3-node motifs, drawn on nodes a, b, c, d]

To Generate Random Graphs With a Given Distribution of (n−1)-node subgraphs:
Define an “energy” on a vector of occurrences of motifs:

    Energy(V_{rand}) = \sum_{M} \frac{|V_{real,M} - V_{rand,M}|}{V_{real,M} + V_{rand,M}}

When V_rand = V_real, the energy is 0.
Start with a randomized network.
Until Energy is small:
    Make a random swap.
    If the swap reduces the energy, keep it.
    Otherwise, keep it with probability exp(−ΔE/T).
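The loop above is a Metropolis-style acceptance rule, and can be sketched as follows. This is a minimal sketch under several assumptions: the energy is a sum over motifs of |V_real − V_rand| / (V_real + V_rand), the occurrence vector is a toy one (`count_vector` counts only feed-forward loops and directed 3-cycles, where the real procedure would count every (n−1)-node subgraph type), the temperature T is held fixed, and all function names are illustrative.

```python
import math
import random

def energy(v_real, v_rand):
    """Sum over motifs M of |V_real,M - V_rand,M| / (V_real,M + V_rand,M)."""
    total = 0.0
    for m, real in v_real.items():
        rand = v_rand.get(m, 0)
        if real + rand > 0:  # treat 0/0 as no mismatch
            total += abs(real - rand) / (real + rand)
    return total

def count_vector(edges):
    """Toy occurrence vector: feed-forward loops and directed 3-cycles."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    ffl = cyc = 0
    for a, outs in succ.items():
        for b in outs:
            for c in succ.get(b, ()):
                if c == a:
                    continue
                if c in outs:
                    ffl += 1
                if a in succ.get(c, ()):
                    cyc += 1  # each 3-cycle is seen from all three start nodes
    return {"ffl": ffl, "cycle": cyc // 3}

def propose_swap(edges, rng):
    """Return a copy with a->b, c->d replaced by a->d, c->b, or None if invalid."""
    (a, b), (c, d) = rng.sample(sorted(edges), 2)
    if a == d or c == b or (a, d) in edges or (c, b) in edges:
        return None
    return (edges - {(a, b), (c, d)}) | {(a, d), (c, b)}

def anneal(v_real, start_edges, temperature=0.05, tol=1e-9, max_steps=2000, seed=0):
    rng = random.Random(seed)
    edges = set(start_edges)
    e = energy(v_real, count_vector(edges))
    for _ in range(max_steps):
        if e <= tol:
            break
        candidate = propose_swap(edges, rng)
        if candidate is None:
            continue
        e_new = energy(v_real, count_vector(candidate))
        delta = e_new - e
        # Keep improving swaps; keep the others with probability exp(-delta / T).
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            edges, e = candidate, e_new
    return edges, e

# Hypothetical "real" network, then a randomized start with the same degrees.
real = {(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 0)}
rng = random.Random(1)
start = set(real)
for _ in range(50):
    start = propose_swap(start, rng) or start
final_edges, final_energy = anneal(count_vector(real), start)
print(count_vector(real), count_vector(final_edges), final_energy)
```

Because a swap that raises the energy is sometimes kept, the search can leave a local minimum; lowering `temperature` makes such uphill moves rarer.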
“Information processing” networks tend to use the same motifs.
Other networks each had their own distinct collection of motifs.
Feed forward, e.g.: filter out transient signals.
(Milo et al., Science, 2002)

Quickly Finding Motifs
858L

Network Motif Discovery Using Subgraph Enumeration and Symmetry-Breaking
Grochow & Kellis, RECOMB 2007

Backtracking (Recursive) Algorithm to Find Network Motifs
H = (small query graph), G = (large network), f : VH → VG
Def. Node g supports node h if the degrees of g and h are compatible.

Basic Algorithm:
For each node g ∈ G:
    For each node h ∈ H:                  (every possible mapping of a single node of H to a node of G)
        If g can't support h: continue
        Let f = {(h→g)}
        L = Extend(f, G, H)
        For q in L:
            Output image of q
    Remove g from G

f is a partial map that maps h to g. Extend grows this partial map into many full maps q : VH → VG.
After g is removed there is no need to consider it again (since we tried all its possible matches already).

Extend(f, G, H):
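    If domain(f) = H: return [f]                 (base case)
    Let m = some node in N(domain(f))            (choose a node in H)
    L = []
    For each node u ∈ N(f(domain(f))):           (try to map it to G)
        If adding (m→u) to f keeps f as a valid isomorphism then:
            Append the maps returned by Extend(f∪{(m→u)}, G, H) to L
    Return L

[Figure: on the H side, h lies in domain(f) and m in N(domain(f)); on the G side, g lies in f(domain(f)) and u in N(f(domain(f)))]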
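The basic algorithm and Extend can be sketched together in Python. This is a minimal sketch, not the authors' implementation. It assumes undirected graphs stored as dicts from node to neighbor set, a connected query graph H, "compatible degrees" meaning deg(g) ≥ deg(h), and a "valid isomorphism" meaning that edges and non-edges among mapped nodes must agree. The names `supports`, `extend` and `find_maps` are illustrative.

```python
def supports(G, g, H, h):
    """Degree check: g can play the role of h only if it has enough neighbors."""
    return len(G[g]) >= len(H[h])

def extend(f, G, H):
    """Grow the partial map f (H-node -> G-node) into all full maps."""
    if len(f) == len(H):
        return [dict(f)]
    used = set(f.values())
    # m: an unmapped H-node adjacent to the mapped part (H is assumed connected).
    m = next(x for x in H if x not in f and any(y in f for y in H[x]))
    # Candidates: unused G-nodes adjacent to the image of the mapped part.
    candidates = {u for g in used for u in G[g]} - used
    results = []
    for u in candidates:
        if not supports(G, u, H, m):
            continue
        # (m -> u) is valid if edges and non-edges to mapped nodes agree.
        if all((y in H[m]) == (f[y] in G[u]) for y in f):
            f[m] = u
            results.extend(extend(f, G, H))
            del f[m]
    return results

def find_maps(G, H):
    G = {v: set(nbrs) for v, nbrs in G.items()}  # private copy; nodes get removed
    found = []
    for g in list(G):
        for h in H:
            if supports(G, g, H, h):
                found.extend(extend({h: g}, G, H))
        # g has been tried in every role, so drop it from G.
        for nbr in G.pop(g):
            G[nbr].discard(g)
    return found

triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
maps = find_maps(G, triangle)
print(len(maps), {frozenset(q.values()) for q in maps})
```

Without symmetry breaking, each occurrence of the triangle is reported once per automorphism of H, so the two triangles in this toy G produce twelve maps.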
Speedup #1
• Every time we can choose a node, we pick the one that is “most constrained”:
    • Pick the node that already has the most mapped neighbors.
    • If there are ties, choose the node with the highest degree.
    • If there are still ties, choose the node with the highest 2nd-order degree (total degree of the neighbors).
• Just a heuristic: it doesn't hurt because we can pick the nodes in any order we want.
• If a map that we are building can't be completed, we want to know sooner rather than later.
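The tie-breaking order above maps naturally onto a tuple sort key. A minimal sketch, assuming H is a dict from node to neighbor set and `mapped` is the set of H-nodes already in domain(f); the function name `pick_most_constrained` and the example graph are illustrative.

```python
def pick_most_constrained(H, mapped):
    """Return the unmapped node of H that is most constrained."""
    def key(x):
        mapped_neighbors = sum(1 for y in H[x] if y in mapped)
        degree = len(H[x])
        second_order = sum(len(H[y]) for y in H[x])  # total degree of neighbors
        return (mapped_neighbors, degree, second_order)
    return max((x for x in H if x not in mapped), key=key)

H = {
    "X": {"P", "R"},
    "P": {"X", "Q"},
    "R": {"X", "S"},
    "Q": {"P", "T", "U"},
    "S": {"R"},
    "T": {"Q"},
    "U": {"Q"},
}
# P and R tie on mapped neighbors (1) and degree (2); P wins on 2nd-order degree (5 vs 3).
print(pick_most_constrained(H, mapped={"X"}))  # prints P
```

Tuples compare element by element, so the later criteria are consulted only when the earlier ones tie.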
Automorphisms & Orbits
Def. An automorphism is an isomorphism from a graph to itself.
[Figure: two drawings of a graph on nodes A, B, C, D, E, F]
Def. The orbit of a node u is the set of nodes that u is mapped to under some automorphism.
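Both definitions can be checked by brute force on a small query graph. A minimal sketch, assuming an undirected graph stored as a dict from node to neighbor set; it tries every permutation of the nodes, so it is only practical for small graphs, and the function names are illustrative.

```python
from itertools import permutations

def automorphisms(H):
    """All node permutations that preserve adjacency and non-adjacency."""
    nodes = sorted(H)
    autos = []
    for image in permutations(nodes):
        p = dict(zip(nodes, image))
        if all((p[v] in H[p[u]]) == (v in H[u]) for u in nodes for v in nodes):
            autos.append(p)
    return autos

def orbits(H):
    """Orbit of u = set of nodes that u is sent to by some automorphism."""
    autos = automorphisms(H)
    return {frozenset(p[u] for p in autos) for u in H}

triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
print(len(automorphisms(triangle)), orbits(triangle))
print(len(automorphisms(path)), orbits(path))
```

In the triangle all three nodes fall in one orbit, while in the path the two endpoints share an orbit and the middle node is alone in its own.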
Main Speedup (#2)
Number each node of G (here 1 to 6).
A mapping f : VH → VG induces a numbering on H (here A = 3, B = 5, C = 4).
If we add these constraints, we get only one possible mapping:
    A < min{B, C}
    C < min{B}
[Figure: query graph H with nodes A, B, C labeled 3, 5, 4; network G with nodes numbered 1 to 6]

Adding Constraints, Larger Example
(Figure from Grochow & Kellis, 2007)

Basic Algorithm, differences for symmetry breaking
For each node g ∈ G:
    For each node h ∈ H s.t. we haven't already considered another node in Orbit(h):
        If g can't support h: continue
        Let f = {(h→g)}
        L = Extend(f, G, H, CH)
        For q in L:
            Output image of q
    Remove g from G

(q : VH → VG; CH denotes the constraints on H.)

Extend(f, G, H, CH), symmetry breaking differences:
    If domain(f) = H: return [f]
    Let m = some node in N(domain(f))
    L = []
    For each node u ∈ N(f(domain(f))):
        If adding (m→u) to f keeps f as a valid isomorphism
           and (m→u) obeys the constraints
        then:
            Append the maps returned by Extend(f∪{(m→u)}, G, H, CH) to L
    Return L
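The effect of the extra constraint check can be seen on a triangle query. This is a minimal sketch under stated assumptions, not the authors' implementation: graphs are dicts from node to neighbor set, G's node labels are comparable integers, and the constraints are written by hand as pairs `(x, y)` meaning "the label of f(x) is less than the label of f(y)". For simplicity the outer loop tries every h instead of one node per orbit. All names are illustrative.

```python
def obeys(f, m, u, constraints):
    """Check every constraint 'label of x < label of y' that involves m."""
    for x, y in constraints:
        if x == m and y in f and not u < f[y]:
            return False
        if y == m and x in f and not f[x] < u:
            return False
    return True

def extend(f, G, H, constraints):
    if len(f) == len(H):
        return [dict(f)]
    used = set(f.values())
    # m: an unmapped H-node adjacent to the mapped part (H is assumed connected).
    m = next(x for x in H if x not in f and any(y in f for y in H[x]))
    results = []
    for u in {v for g in used for v in G[g]} - used:
        valid = all((y in H[m]) == (f[y] in G[u]) for y in f)
        if valid and obeys(f, m, u, constraints):
            f[m] = u
            results.extend(extend(f, G, H, constraints))
            del f[m]
    return results

def find_instances(G, H, constraints=()):
    G = {v: set(nbrs) for v, nbrs in G.items()}  # private copy; nodes get removed
    found = []
    for g in list(G):
        for h in H:
            found.extend(extend({h: g}, G, H, constraints))
        for nbr in G.pop(g):  # g has been tried in every role
            G[nbr].discard(g)
    return found

triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
G = {1: {3}, 2: {4}, 3: {1, 4, 5}, 4: {2, 3, 5}, 5: {3, 4, 6}, 6: {5}}
# A < min{B, C} and C < min{B}, written as pairwise constraints.
conditions = [("A", "B"), ("A", "C"), ("C", "B")]
print(len(find_instances(G, triangle)))
print(find_instances(G, triangle, conditions))
```

With no constraints the single triangle on nodes 3, 4, 5 is reported six times; with the constraints only the map A→3, C→4, B→5 survives.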
Results: Running Time
(Figure from Grochow & Kellis, 2007)

Results: Benefit of Symmetry Breaking

Really Large “motifs”? Meaningful?
(Figure from Grochow & Kellis, 2007)
Occurred 27,720 times in the real yeast PPI network (but rarely in a random network).
Really just a subgraph of this part of the yeast PPI: choose 4 nodes from the clique and 3 nodes from the oval.

Other Advantages
• Since symmetry breaking ensures each match is output only once, they don't need to keep track of which graphs they've already output
    • saves a lot of space
• Can be parallelized better

Spiritual Similarity to Color Coding
• Color Coding: make distinguishable things look the same
• Symmetry Breaking: make indistinguishable things look different.
...