BIOINFORMATICS Vol. 20 no. 18 2004, pages 3594–3603
doi:10.1093/bioinformatics/bth448

Advances to Bayesian network inference for generating causal networks from observational biological data

Jing Yu 1,2, V. Anne Smith 1, Paul P. Wang 2, Alexander J. Hartemink 3,* and Erich D. Jarvis 1,*

1 Department of Neurobiology, Duke University Medical Center, Box 3209, Durham, NC 27710, USA, 2 Department of Electrical Engineering and 3 Department of Computer Science, Duke University, Durham, NC 27708, USA

Received on March 4, 2004; revised on June 18, 2004; accepted on July 13, 2004
Advance Access publication July 29, 2004

ABSTRACT

Motivation: Network inference algorithms are powerful computational tools for identifying putative causal interactions among variables from observational data. Bayesian network inference algorithms hold particular promise in that they can capture linear, non-linear, combinatorial, stochastic and other types of relationships among variables across multiple levels of biological organization. However, challenges remain when applying these algorithms to limited quantities of experimental data collected from biological systems. Here, we use a simulation approach to make advances in our dynamic Bayesian network (DBN) inference algorithm, especially in the context of limited quantities of biological data.

Results: We test a range of scoring metrics and search heuristics to find an effective algorithm configuration for evaluating our methodological advances. We also identify sampling intervals and levels of data discretization that allow the best recovery of the simulated networks. We develop a novel influence score for DBNs that attempts to estimate both the sign (activation or repression) and relative magnitude of interactions among variables.
When faced with limited quantities of observational data, combining our influence score with moderate data interpolation reduces a significant portion of false positive interactions in the recovered networks. Together, our advances allow DBN inference algorithms to be more effective in recovering biological networks from experimentally collected data.

Availability: Source code and simulated data are available upon request.

Contact: email@example.com; firstname.lastname@example.org; jarvis@neuro.duke.edu

Supplementary information: http://www.jarvislab.net/Bioinformatics/BNAdvances/

*To whom correspondence should be addressed.

1 INTRODUCTION

A variety of network inference algorithms have recently been used to identify gene regulatory networks from observational gene expression data (Akutsu et al., 2000; Arkin et al., 1997; D'haeseleer et al., 1999; Friedman et al., 2000; Gardner et al., 2003; Hartemink et al., 2001; Liang et al., 1998; Weaver et al., 1999; Xu et al., 2002). Bayesian network (BN) inference algorithms have shown particular promise (Hartemink et al., 2002; Husmeier, 2003; Smith et al., 2002, 2003), because unlike most other modeling frameworks, they can capture many types of relationships between variables. Owing to their ...