Modularity CMSC 858L

Module-detection for Function Prediction

Biological networks generally modular (Hartwell+, 1999)

We can try to find the modules within a network.

Once we find modules, we can look at over-represented functions within a module, e.g.:
- If a majority of the proteins within a module have annotation A, predict annotation A for the other proteins in the module.

Graph clustering methods
- Min Multiway Cut, Graph Summarization, VI-Cut: examples we’ve already seen.
- Methods often borrowed from other “community detection” applications.

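The majority rule above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `predict_annotations`, the input layout (modules as sets of protein ids, annotations as a dict of label sets), and the `threshold` parameter are all assumptions made for the example.

```python
from collections import Counter


def predict_annotations(modules, annotations, threshold=0.5):
    """Predict labels for proteins from the majority annotation of their module.

    modules: iterable of sets of protein ids (hypothetical input layout).
    annotations: dict mapping protein id -> set of known labels;
        proteins missing from the dict are treated as unannotated.
    threshold: a label is transferred when more than this fraction of the
        module's proteins carry it (0.5 = simple majority).
    Returns a dict mapping protein id -> set of newly predicted labels.
    """
    predictions = {}
    for module in modules:
        # Count how many proteins in this module carry each label.
        counts = Counter()
        for protein in module:
            counts.update(annotations.get(protein, set()))
        for label, count in counts.items():
            if count / len(module) > threshold:
                # Transfer the label to module members that lack it.
                for protein in module:
                    if label not in annotations.get(protein, set()):
                        predictions.setdefault(protein, set()).add(label)
    return predictions


modules = [{"p1", "p2", "p3", "p4"}]
annotations = {"p1": {"A"}, "p2": {"A"}, "p3": {"A", "B"}}
predicted = predict_annotations(modules, annotations)
```

Here annotation A covers 3 of the 4 proteins, so it is predicted for p4; annotation B covers only 1 of 4 and is not transferred.
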
Modularity

Modularity is:

$$Q = \sum_{i=1}^{k} \left( e_{ii} - a_i^2 \right)$$

where

$$e_{ii} = \frac{|\{(u,v) : u \in V_i,\ v \in V_i,\ (u,v) \in E\}|}{|E|}, \qquad a_i = \frac{|\{(u,v) : u \in V_i,\ (u,v) \in E\}|}{|E|}$$

- $e_{ii}$ = % edges in module i (probability edge is in module i)
- $a_i$ = % edges with at least 1 end in module i
- $a_i^2$ = probability a random edge would fall into module i

High modularity ⇒ more edges within the module than you expect by chance.

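As a sanity check on the definition, Q can be computed directly from an edge list. The sketch below follows the definitions as stated here, with a_i counting edges that have at least one end in module i; a common alternative formulation counts edge ends instead (degree sum over 2|E|) and gives different values. The function name `modularity` and the `module_of` mapping are assumptions made for the example.

```python
def modularity(edges, module_of):
    """Compute Q = sum_i (e_ii - a_i^2) for an undirected graph.

    edges: list of (u, v) pairs, one entry per undirected edge.
    module_of: dict mapping each node to its module id (hypothetical layout).
    e_ii: fraction of edges with both ends in module i.
    a_i: fraction of edges with at least one end in module i.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")
    inside = {}    # module id -> number of edges with both ends in the module
    touching = {}  # module id -> number of edges with at least one end in it
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        touching[mu] = touching.get(mu, 0) + 1
        if mu == mv:
            inside[mu] = inside.get(mu, 0) + 1
        else:
            touching[mv] = touching.get(mv, 0) + 1
    # Modules that touch no edges contribute 0 and can be skipped.
    return sum(inside.get(i, 0) / m - (touching[i] / m) ** 2 for i in touching)


# Two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_modules = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_module = {node: "A" for node in two_modules}

q_split = modularity(edges, two_modules)
q_single = modularity(edges, one_module)
```

For the split into two triangles, each module has e_ii = 3/7 and a_i = 4/7, so Q = 2(3/7 - 16/49) = 10/49. Putting every node in one module gives e_ii = a_i = 1 and Q = 0.
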
Examples

[Figure: communities assigned to a random graph]

[Figure: communities assigned to a small graph]

Note: maximizing modularity will find its own # of clusters.

Modularity Algorithm #1
