Similarity Network Fusion.pdf - Articles Similarity network...

ARTICLES NATURE METHODS | VOL.11 NO.3 | MARCH 2014 | 333

Rapidly evolving technologies are making it progressively easier to collect multiple and diverse genome-scale data sets to address clinical and biological questions. For example, large-scale efforts by The Cancer Genome Atlas (TCGA) have already amassed genome, transcriptome and epigenome information for over 20 cancers from thousands of patients. The availability of such a wealth of data makes integrative methods essential for capturing the heterogeneity of biological processes and phenotypes, leading to, for example, the identification of homogeneous subtypes in breast cancer.

Data-integration methods need to overcome at least three computational challenges: (i) the small number of samples compared to the large number of measurements; (ii) the differences in scale, collection bias and noise in each data set; and (iii) the complementary nature of the information provided by different types of data. Current integration approaches have yet to address all of these challenges together¹⁻⁴.

The simplest way to combine biological data is to concatenate normalized measurements from various biological domains, such as mRNA expression and DNA methylation, for each sample. Unfortunately, concatenation further dilutes the already low signal-to-noise ratio in each data type. To avoid this, a common […]

[…] and focusing only on common patterns can miss valuable complementary information. One recent machine-learning approach, iCluster⁷, uses a joint latent variable model for integrative clustering. Though powerful, iCluster and related machine-learning approaches⁴ do not scale to the full spectrum of available measurements, making the methods sensitive to the gene preselection step.

Our SNF approach is distinct in that it uses networks of samples as a basis for integration. For example, when combining data from patient samples, SNF creates a patient network. Although networks of individuals have been extensively studied in other contexts, most notably in social science⁸ or in relation to disease⁹, to our knowledge patient-similarity networks have not been used specifically for integrating biological data.

SNF consists of two main steps: construction of a sample-similarity network for each data type and integration of these networks into a single similarity network using a nonlinear combination method. The fused network captures both shared and complementary information from different data sources (Supplementary Results and Supplementary Figs. 1–3), offering insight into how informative each data type is to the observed similarity between samples. Because it is based on networks of samples, SNF can derive useful information even from a small number of samples, is robust to noise and data heterogeneity, and scales to a large number of genes. In addition to integrating data, our fused networks can efficiently identify subtypes among existing samples by clustering and predict labels for new samples based on the constructed network.
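The two SNF steps described in the text (one sample-similarity network per data type, then a nonlinear combination of those networks) can be illustrated with a minimal Python sketch. The excerpt does not specify the kernel or the combination rule, so everything concrete here is an assumption made for illustration: the locally scaled Gaussian kernel, the k-nearest-neighbor sparsification, the cross-diffusion update, the function names `affinity_matrix` and `fuse`, and the parameters `k`, `mu` and `iterations`. It should not be read as the authors' implementation.

```python
import numpy as np


def affinity_matrix(X, k=3, mu=0.5):
    """Step 1 (assumed form): sample-by-sample similarity for one data type.

    X has one row per sample and one column per measurement.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Mean distance from each sample to its k nearest neighbors (self excluded).
    knn_mean = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    # Pair-specific scale; this locally scaled kernel is an assumption.
    eps = (knn_mean[:, None] + knn_mean[None, :] + dist) / 3.0
    return np.exp(-dist ** 2 / (mu * eps ** 2 + 1e-12))


def _row_normalize(M):
    return M / M.sum(axis=1, keepdims=True)


def _full_kernel(W):
    # Row-normalize, then symmetrize so the network stays undirected.
    P = _row_normalize(W)
    return (P + P.T) / 2.0


def _local_kernel(W, k):
    # Keep each row's k largest similarities (self included), zero the rest.
    S = np.zeros_like(W)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return _row_normalize(S)


def fuse(affinities, k=3, iterations=10):
    """Step 2 (assumed form): nonlinear fusion of two or more networks.

    Each network is repeatedly diffused through its own sparse local
    kernel using the average of the other networks' current states.
    """
    P = [_full_kernel(W) for W in affinities]
    S = [_local_kernel(W, k) for W in affinities]
    m = len(P)
    for _ in range(iterations):
        updated = []
        for v in range(m):
            others = sum(P[u] for u in range(m) if u != v) / (m - 1)
            updated.append(_full_kernel(S[v] @ others @ S[v].T))
        P = updated
    return sum(P) / m


# Toy usage: two synthetic data types measured on the same 10 samples,
# which fall into two groups (purely illustrative data).
rng = np.random.default_rng(0)
labels = np.array([0] * 5 + [1] * 5)
view_a = rng.normal(size=(10, 20)) + 3.0 * labels[:, None]
view_b = rng.normal(size=(10, 15)) + 3.0 * labels[:, None]
fused = fuse([affinity_matrix(view_a), affinity_matrix(view_b)])
```

The fused matrix has one row and one column per sample regardless of how many measurements each data type contains, which is consistent with the text's point that a network-of-samples representation scales to a large number of genes. A clustering method applied to `fused` would then play the role of subtype identification.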