multiple-alignment

multiple-alignment - Multiple Sequence Alignment BMI/CS 576

Multiple Sequence Alignment
BMI/CS 576
www.biostat.wisc.edu/bmi576.html
Colin Dewey
cdewey@biostat.wisc.edu
Fall 2011

Multiple Sequence Alignment: Task Definition
Given:
  – a set of more than 2 sequences
  – a method for scoring an alignment
Do:
  – determine the correspondences between the sequences such that the alignment score is maximized
Multiple Alignment of SH3 Domain
[Figure from A. Krogh, An Introduction to Hidden Markov Models for Biological Sequences]

Motivation for MSA
• establish input data for phylogenetic analyses
• determine the evolutionary history of a set of sequences
  – At what point in history did certain mutations occur?
• discover a common motif in a set of sequences (e.g. DNA sequences that bind the same protein)
• characterize a set of sequences (e.g. a protein family)
• build profiles for sequence-database searching
  – PSI-BLAST generalizes a query sequence into a profile to search for remote relatives
Scoring a Multiple Alignment
• key issue: how do we assess the quality of a multiple sequence alignment?
• usually, the assumption is made that the individual columns of an alignment are independent:
$$\mathrm{Score}(m) = G + \sum_i S(m_i)$$
  where $G$ is the gap function and $S(m_i)$ is the score of the $i$th column
• we'll discuss two methods for scoring a column
  – sum of pairs (SP)
  – minimum entropy

Scoring an Alignment: Sum of Pairs
• compute the sum of the pairwise scores:
$$S(m_i) = \sum_{k < l} s(m_i^k, m_i^l)$$
  where
  – $m_i^k$ = character of the $k$th sequence in the $i$th column
  – $s$ = substitution matrix
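
A minimal Python sketch of the sum-of-pairs column score may help make the formula concrete. The function `toy_score` is a hypothetical stand-in for a real substitution matrix $s$, and its match, mismatch, and gap values (as well as scoring a gap-gap pair as 0) are assumptions chosen for illustration, not values from the lecture.

```python
from itertools import combinations


def toy_score(a, b, match=1, mismatch=-1, gap=-2):
    """Hypothetical stand-in for a substitution matrix s(a, b)."""
    if a == "-" and b == "-":
        return 0  # assumption: a gap aligned with a gap contributes nothing
    if a == "-" or b == "-":
        return gap  # assumption: fixed penalty for a character aligned with a gap
    return match if a == b else mismatch


def sp_column_score(column, s=toy_score):
    """S(m_i): sum of s over all pairs k < l of characters in one column."""
    return sum(s(a, b) for a, b in combinations(column, 2))


def sp_alignment_score(rows, s=toy_score):
    """Sum of column scores over equal-length aligned rows.

    Assumes columns are independent and folds gap costs into s,
    so there is no separate gap function G here.
    """
    return sum(sp_column_score(col, s) for col in zip(*rows))


print(sp_column_score("AAA"))  # three matching pairs -> 3
print(sp_column_score("AAC"))  # one match, two mismatches -> -1
```

Because every pair of rows is visited, scoring one column of N sequences takes N(N-1)/2 lookups in the scoring function.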
Scoring an Alignment: Minimum Entropy
• basic idea: try to minimize the entropy of each column
• another way of thinking about it: columns that can be communicated using few bits are good
• information theory tells us that an optimal code uses $-\log_2 p$ bits to encode a message of probability $p$
  – Frequently sent messages require few bits
  – Rarely sent messages require many bits
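
As a quick numeric illustration of the code-length rule, the sketch below evaluates $-\log_2 p$ for a few probabilities. The function name `code_length_bits` is made up for this example.

```python
import math


def code_length_bits(p):
    """Bits an optimal code uses for a message of probability p: -log2(p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


print(code_length_bits(0.5))       # frequent message: 1.0 bit
print(code_length_bits(1 / 1024))  # rare message: 10.0 bits
```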

Scoring an Alignment: Minimum Entropy
• the messages in this case are the characters in a given column
• the entropy of a column is given by:
$$S(m_i) = -\sum_a c_{ia} \log_2 p_{ia}$$
  where
  – $m_i$ = the $i$th column of an alignment $m$
  – $c_{ia}$ = count of character $a$ in column $i$
  – $p_{ia}$ = probability of character $a$ in column $i$
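
The column entropy can be sketched in a few lines of Python. This example assumes that $p_{ia}$ is estimated as the observed frequency $c_{ia} / N$ within the column (the slides do not say how the probabilities are obtained) and that a gap character, if present, is counted like any other symbol. The name `column_entropy_score` is hypothetical.

```python
import math
from collections import Counter


def column_entropy_score(column):
    """S(m_i) = -sum_a c_ia * log2(p_ia) for one alignment column.

    Assumption: p_ia is the observed frequency c_ia / N in this column.
    """
    counts = Counter(column)  # c_ia for each character a that occurs
    n = len(column)
    return -sum(c * math.log2(c / n) for c in counts.values())


print(column_entropy_score("AACC"))  # two characters, evenly split -> 4.0
print(column_entropy_score("ACGT"))  # all characters differ -> 8.0
```

A fully conserved column such as "AAAA" scores 0, the minimum, which matches the idea that well-aligned columns can be communicated with few bits.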