prediction accuracy. Because AGV is constructed to be an interpolation between LPA and a ring lattice, the AGV, LPA, and SMW mechanisms are equivalent in specific parameter regimes and correspondingly show a nonnegligible overlap. Nevertheless, the overall prediction accuracy on the test sets still

(Example subgraphs)

Notes on procedure
• Similar to techniques in social sciences (p∗, exponential random graph models).
• Network “motifs”, Milo et al., Science, 2002. But motifs only up to n = 3 or n = 4 nodes.
• Note the term “clustering” here refers to a machine learning technique to categorize data, not “clustering coefficient” (transitivity).

Build classifier from the training data (Learning Algorithm)
• Alternating Decision Tree (ADT) (Freund and Schapire, 1997).

Which subgraphs best distinguish the models?

Validating classifier
• Slight overlap in models which are variations on one another.
• (a) DMC and RDG produce similar statistical distributions.
• (b) Classifier can discriminate between the two models.

After classifier built, use it to characterize individual network realizations (Walk the Drosophila data through the ADT)
• A given network’s subgraph counts determine paths in the ADT (decision nodes are rectangles).
• The ADT outputs a real-valued prediction score, which is the sum of all weights over all paths.
• The final weight for a model is related to the probability that the particular network realization was generated by that model.
• Model with the highest weight wins (best describes that particular network realization).
• DMC wins for Giot Drosophila data!

Comparison by subgraph counts
• Green is best (same median occurrence as in real Drosophila data).
• 0 means the subgraph is in data, but not in model.

Introducing noise
• Classifier robust.
• Also robust to p = 0.5 and different subgraph counts, n = 7, 8.

Comments
• Model selection not validation. (Relative judgement) (i.e., which of these 7 models fits...
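
The scoring step described under "Walk the Drosophila data through the ADT" can be sketched as follows. This is only an illustrative sketch, not the classifier from the slides: the tree shape, the subgraph labels (S1, S2, S3), the thresholds, and every weight are made up, and the per-model weight dictionaries are an assumed way of storing one score per candidate model. It shows how a network's subgraph counts select a branch at each decision node, how the weights on all reached prediction nodes are summed, and how the model with the highest total is chosen.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Prediction:
    """Prediction node: per-model weights plus the decision nodes below it."""
    weights: Dict[str, float]
    splitters: List["Splitter"] = field(default_factory=list)


@dataclass
class Splitter:
    """Decision node: compares one subgraph count against a threshold."""
    subgraph: str
    threshold: float
    if_true: Prediction    # followed when counts[subgraph] < threshold
    if_false: Prediction   # followed otherwise


def score(node: Prediction, counts: Dict[str, float]) -> Dict[str, float]:
    """Sum the weights of every prediction node reached by these counts."""
    totals = dict(node.weights)
    # All decision nodes under a reached prediction node are evaluated,
    # so several paths contribute to the final score.
    for s in node.splitters:
        branch = s.if_true if counts[s.subgraph] < s.threshold else s.if_false
        for model, w in score(branch, counts).items():
            totals[model] = totals.get(model, 0.0) + w
    return totals


def best_model(root: Prediction, counts: Dict[str, float]) -> str:
    """Return the model with the highest summed weight."""
    totals = score(root, counts)
    return max(totals, key=totals.get)


# Hypothetical toy tree with two candidate models (all numbers invented).
root = Prediction({"DMC": 0.5, "RDG": -0.5}, splitters=[
    Splitter("S1", 10,
             if_true=Prediction({"DMC": -1.0, "RDG": 1.0}),
             if_false=Prediction({"DMC": 1.0, "RDG": -1.0}, splitters=[
                 Splitter("S2", 3,
                          if_true=Prediction({"DMC": 0.25, "RDG": -0.25}),
                          if_false=Prediction({"DMC": -0.25, "RDG": 0.25})),
             ])),
    Splitter("S3", 5,
             if_true=Prediction({"DMC": 0.5, "RDG": 0.0}),
             if_false=Prediction({"DMC": -0.5, "RDG": 0.0})),
])

# Hypothetical subgraph counts for one network realization.
counts = {"S1": 12, "S2": 1, "S3": 7}
totals = score(root, counts)
winner = best_model(root, counts)
print(totals, winner)
```

In this toy case the counts reach the root, the "false" branch of S1, the "true" branch of S2, and the "false" branch of S3, so the winner is simply whichever model accumulates the larger sum over those four prediction nodes.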
