…prediction accuracy. Because AGV is constructed to be
an interpolation between LPA and a ring lattice, the AGV, LPA,
and SMW mechanisms are equivalent in specific parameter
regimes and correspondingly show a nonnegligible overlap.
Nevertheless, the overall prediction accuracy on the test sets still …

(Example subgraphs)

Notes on procedure
• Similar to techniques in social sciences (p∗, exponential
random graph models).
• Network “motifs”, Milo et al., Science, 2002. But motifs only up
to n = 3 or n = 4 nodes.
• Note the term “clustering” here refers to machine learning
technique to categorize data, not “clustering coefficient”
(transitivity).

Build classifier from the training data (Learning Algorithm)
• Alternating Decision Tree (ADT), (Freund and Schapire, 1997).

Which subgraphs best distinguish the models?

Validating classifier
• Slight overlap in models which are variations on one another.
• (a) DMC and RDG produce similar statistical distributions.
• (b) Classifier can discriminate between the two models.

After classifier built, use it to characterize individual
network realizations
(Walk the Drosophila data through the ADT)
• A given network’s subgraph counts determine paths in the ADT
(decision nodes are rectangles)
• The ADT outputs a real-valued prediction score, which is the
sum of all weights over all paths.
• The final weight for a model is related to the probability that
the particular network realization was generated by that model.
• Model with the highest weight wins (best describes that
particular network realization).
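The scoring procedure in the bullets above can be sketched in a few lines of Python. This is only a minimal illustration, assuming one ADT per model and a "count < threshold" test at each decision node. The tree shapes, thresholds, weights, and the subgraph labels `S1` and `S2` are made up for the example and are not taken from the trained classifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PredictionNode:
    """Carries a real-valued weight plus any decision nodes hanging below it."""
    weight: float
    splitters: List["DecisionNode"] = field(default_factory=list)


@dataclass
class DecisionNode:
    """Tests 'count of subgraph < threshold' and routes to one of two children."""
    subgraph: str
    threshold: float
    if_true: PredictionNode
    if_false: PredictionNode


def adt_score(node: PredictionNode, counts: Dict[str, int]) -> float:
    """Sum the weights of every prediction node reachable for these counts."""
    total = node.weight
    for decision in node.splitters:
        # The subgraph count decides which branch of each decision node is taken.
        if counts[decision.subgraph] < decision.threshold:
            child = decision.if_true
        else:
            child = decision.if_false
        total += adt_score(child, counts)
    return total


def best_model(trees: Dict[str, PredictionNode], counts: Dict[str, int]) -> str:
    """Return the model whose tree gives the highest prediction score."""
    scores = {name: adt_score(tree, counts) for name, tree in trees.items()}
    return max(scores, key=scores.get)


# Hypothetical toy trees (all numbers invented for illustration).
toy_trees = {
    "DMC": PredictionNode(0.5, [
        DecisionNode("S1", 10, PredictionNode(-1.0), PredictionNode(2.0, [
            DecisionNode("S2", 3, PredictionNode(0.5), PredictionNode(-0.5)),
        ])),
    ]),
    "RDG": PredictionNode(-0.25, [
        DecisionNode("S1", 10, PredictionNode(1.5), PredictionNode(-1.0)),
        DecisionNode("S2", 3, PredictionNode(0.25), PredictionNode(-0.75)),
    ]),
}

# Hypothetical subgraph counts for one network realization.
toy_counts = {"S1": 12, "S2": 1}

if __name__ == "__main__":
    for name, tree in toy_trees.items():
        print(name, adt_score(tree, toy_counts))
    print("winner:", best_model(toy_trees, toy_counts))
```

Because several decision nodes can hang below the same prediction node, a single network follows several paths at once, and the score adds up the weights along all of them.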
• DMC wins for Giot Drosophila data!

Comparison by subgraph counts
• Green is best (same median occurrence as in real Drosophila
data).
• 0 means the subgraph is in data, but not in model.

Introducing noise
• Classifier robust.
• Also robust to p = 0.5 and different subgraph counts, n = 7, 8.

Comments
• Model selection not validation. (Relative judgement)
(i.e., which of these 7 models fits...
