1 Multiple String Alignment

Efficient methods for multiple sequence alignment with guaranteed error bounds

Dan Gusfield 1
Computer Science Division
University of California, Davis
July, 1991

1 Research partially supported by grant DE-FG03-90ER60999 from the Department of Energy, and grant CCR-8803704 from the National Science Foundation.

Abstract

Multiple string (sequence) alignment is a difficult and important problem in computational biology, where it is central in two related tasks: finding highly conserved subregions or embedded patterns of a set of biological sequences (strings of DNA, RNA or amino acids), and inferring the evolutionary history of a set of taxa from their associated biological sequences. Several precise measures have been proposed for evaluating the goodness of a multiple alignment, but no efficient methods are known which compute the optimal alignment for any of these measures in any but small cases. In this paper, we consider two previously proposed measures, and give two computationally efficient multiple alignment methods (one for each measure) whose deviation from the optimal value is guaranteed to be less than a factor of two. This is the novel feature of these methods, but the methods have additional virtues as well. For both methods, the guaranteed bounds are much smaller than two when the number of strings is small (1.33 for three strings of any length); for one of the methods we give a related randomized method which is much faster and which gives, with high probability, multiple alignments with fairly small error bounds; and for the other measure, the method given yields a non-obvious lower bound on the value of the optimal alignment.

2 Introduction

Multiple string (sequence) alignment is a difficult problem of great value in computational biology, where it is central in two related tasks: finding highly conserved subregions or embedded patterns of a set of biological sequences (strings of DNA, RNA or amino acids); and inferring the evolutionary history of a set of taxa from their associated biological sequences. In the first case, a conserved pattern may be so dissimilar or dispersed in the strings that it cannot be detected by statistical tests when just two strings of the set are aligned, but the pattern becomes clear and compelling when many strings are simultaneously aligned. Scores of papers have been written on methods for multiple string alignment, and hundreds
of papers have used various multiple alignment methods to find patterns or build evolutionary trees from biological sequence data. The following few papers illustrate this broad literature: [6, ?, 2, 4, ?, 11, 15, ?, 1, 10, 3]. Many of the suggested methods build a multiple alignment by attempting to optimize some explicitly or implicitly stated measure of goodness of the alignment. However, no single measure or objective function has yet been proposed that is widely agreed upon (unlike the case of aligning just two strings), and some proposed methods build alignments without relying (even implicitly) on any measure of goodness.