On the Complexity and Approximation of Non-unique Probe Selection Using d-Disjunct Matrix

My T. Thai *
Taieb Znati

Abstract

In this paper, we study the MINimum-d-Disjunct Submatrix (MIN-d-DS) problem, which can be used to select the minimum number of non-unique probes for virus identification. We prove that MIN-d-DS is NP-hard for any fixed d. Using a d-disjunct matrix, we present an O(log k)-approximation algorithm, where k is an upper bound on the maximum number of targets hybridized to a probe. We also present a (1 + (d + 1) log n)-approximation algorithm to identify at most d targets in the presence of experimental errors. Our approximation algorithms also yield a linear time complexity for the decoding algorithms.

Keywords: non-unique probe, non-adaptive group testing, pooling designs, d-disjunct matrix

* Computer and Information Science and Engineering Department, University of Florida, Gainesville, FL, 32611. Email: email@example.com. Supported in part by the National Science Foundation under grant CCF-0548895.
Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, 15215. Email: firstname.lastname@example.org

1 Introduction

Non-unique probe selection is a fundamental problem in computational molecular biology for target identification (Moret and Shapiro, 1985; Steinfath et al., 2000; Borneman et al., 2001; Wang and Seed, 2003; Rahmann, 2002, 2003; Gao et al., 2006; Thai et al., 2007; Li et al., 2005; Thai, 2007). A probe is a short oligonucleotide of size 8-25, used for identifying targets in a biological sample through hybridizations using DNA microarrays. A probe is called a unique probe if it hybridizes to only one specific target; otherwise, it is called a non-unique probe. Since unique probes have a strong separability of targets, identifying the presence of targets in a sample by using unique probes is straightforward.
However, finding unique probes for every target is a difficult task due to the strong similarity of closely related targets. Considering a set of n targets and a sample containing at most d 1 of these targets, Schliep, Torney, and Rahmann (Schliep et al., 2003) introduced a method using non-unique probes with group testing techniques to identify at most d targets in the following three steps:

1. Find a large set of non-unique probes as candidates and let a binary matrix M represent the probe-target hybridizations, with rows labeled by probes and columns labeled by targets; that is, M[i,j] = 1 if probe p_i hybridizes to target t_j; otherwise, M[i,j] = 0.

2. Select a minimum subset of the probes obtained in Step 1 so that these probes can identify up to d targets. In other words, find a minimum submatrix H of M with the same number of columns....
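
To make the matrix formulation concrete, the following is a minimal Python sketch of the objects involved. It is not the approximation algorithm of this paper. It assumes the standard definition of a d-disjunct matrix (no column is contained in the Boolean union of any d other columns) and the usual naive decoding rule for such matrices (a target is reported present exactly when every probe that hybridizes to it tests positive). The function names, the brute-force check, and the small 4-probe, 6-target matrix are illustrative assumptions, not taken from the source.

```python
from itertools import combinations


def column_supports(M):
    """For each target (column), the set of probes (rows) that hybridize to it."""
    n_cols = len(M[0])
    return [{i for i, row in enumerate(M) if row[j]} for j in range(n_cols)]


def is_d_disjunct(M, d):
    """Brute-force check (exponential in d): no column may be covered by the
    union of d other columns. Illustrative only."""
    cols = column_supports(M)
    for j, col in enumerate(cols):
        others = [c for k, c in enumerate(cols) if k != j]
        # Unions of exactly d others suffice; smaller unions are subsets of them.
        for group in combinations(others, min(d, len(others))):
            if col <= set().union(*group):
                return False
    return True


def probe_outcomes(M, present):
    """Error-free test results: a probe is positive if it hybridizes to
    at least one target that is present in the sample."""
    return [int(any(row[j] for j in present)) for row in M]


def decode(M, outcomes):
    """Naive decoding: keep target j only if all of its probes are positive.
    Runs in time linear in the size of M."""
    n_cols = len(M[0])
    return [j for j in range(n_cols)
            if all(outcomes[i] for i, row in enumerate(M) if row[j])]


# Hypothetical hybridization matrix: 4 probes (rows), 6 targets (columns).
# Each target hybridizes to a distinct pair of probes.
M = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]

print(is_d_disjunct(M, 1))                 # True
print(is_d_disjunct(M, 2))                 # False
print(decode(M, probe_outcomes(M, [3])))   # [3]
```

In this example four non-unique probes suffice to identify any single target among six, whereas unique probes would require one probe per target. The same matrix cannot separate two simultaneous targets, which is why the disjunctness level d has to match the assumed bound on the number of targets in the sample.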
This note was uploaded on 05/20/2011 for the course CAP 5515 taught by Professor Ungor during the Spring '08 term at University of Florida.