alignment - Homology Sequence alignment how to discover...

Sequence alignment: how to discover similarities between biological sequences
Chapter 6 in Jones and Pevzner

Homology
Genes or characters in organisms A and B that have evolved from the same ancestral gene or character are said to be homologs.
Homology between genes typically indicates conserved function.
Sequence similarity is used to infer homology.

Sequence Comparison: Early Success Story
In 1983 Russell Doolittle and colleagues found similarities between a cancer-causing gene from the Simian Sarcoma virus and a normal growth factor gene (PDGF).
Finding sequence similarities with genes of known function is a common approach to infer a newly sequenced gene's function.

The Drosophila "eyeless" gene
W. Gehring discovered that turning on the "eyeless" gene in Drosophila leads to the growth of ectopic eyes.
"eyeless" is a master control gene for eye formation (transcription factor).

A similar gene in humans
The aniridia gene in humans has a sequence that is similar to the Drosophila eyeless gene.
Eye morphogenesis is under similar genetic control in vertebrates and insects.

  5 HSGVNQLGGVFVNGRPLPDSTRQKIVELAHSGARPCDISRILQVSNGCVS  54
 57 HSGVNQLGGVFVGGRPLPDSTRQKIVELAHSGARPCDISRILQVSNGCVS 106

 55 KILGRYYETGSIRPRAIGGSKPRVATPEVVSKIAQYKRECPSIFAWEIRD 104
107 KILGRYYETGSIRPRAIGGSKPRVATAEVVSKISQYKRECPSIFAWEIRD 156

105 RLLSEGVCTNDNIPSVSSINRVLRNLASEKQQMGA--------------- 139
157 RLLQENVCTNDNIPSVSSINRVLRNLAAQKEQQSTGSGSSSTSAGNSISA 206

155 -----------SWGTR---PGWYPGTSVPGQPTQ---------------- 174
307 NHQALQQHQQQSWPPRHYSGSWYP-TSLSEIPISSAPNIASVTAYASGPS 355

175 ------------------------------------DGCQQQE---GGGE 185
356 LAHSLSPPNDIESLASIGHQRNCPVATEDIHLKKELDGHQSDETGSGEGE 405

186 NTNSISSNGEDSDEAQMRLQLKRKLQRNRTSFTQEQIEALEKEFERTHYP 235
406 NSNGGASNIGNTEDDQARLILKRKLQRNRTSFTNDQIDSLEKEFERTHYP 455

PAX6_HUMAN aligned against PAX6_DRO
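
As a small illustration of how sequence similarity can be quantified, the sketch below counts identical columns in the first block of the PAX6 alignment above (residues 5 to 54 against 57 to 106). The function name `identity_counts` is hypothetical, and plain percent identity is only one simple similarity measure, not necessarily the one used to produce the alignment.

```python
# First block of the PAX6 alignment, copied from the text above.
ROW_A = "HSGVNQLGGVFVNGRPLPDSTRQKIVELAHSGARPCDISRILQVSNGCVS"
ROW_B = "HSGVNQLGGVFVGGRPLPDSTRQKIVELAHSGARPCDISRILQVSNGCVS"


def identity_counts(row_a, row_b):
    """Return (identical columns, total columns) for two aligned rows.

    A column counts as identical only when both rows carry the same
    residue; columns containing a gap ('-') never count as identical.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    identical = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return identical, len(row_a)


identical, total = identity_counts(ROW_A, ROW_B)
print(f"{identical}/{total} identical columns "
      f"({100.0 * identical / total:.1f}% identity)")
```

The two rows differ at a single column (N against G), so this block is 49 out of 50 identical.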
Sequence alignment
AGGCTATCACCTGACCTCCAGGCCGATGCCC
TAGCTATCACGACCGCGGTCGATTTGCCCGAC

-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC

Definition
Given two strings $v = v_1 v_2 \ldots v_m$ and $w = w_1 w_2 \ldots w_n$, an alignment is an assignment of gaps to positions $0, \ldots, m$ in $v$ and $0, \ldots, n$ in $w$, so as to line up each letter in one sequence with either a letter or a gap in the other sequence.

Mutations at the DNA level
…AC GGTG CAGT T ACCA…
…AC ---- CAGT C CACCA…
SEQUENCE EDITS: Substitution, Deletion
REARRANGEMENTS: Inversion, Translocation, Duplication

Scoring an alignment
A simple scoring scheme:
Penalize mismatches by μ
Penalize indels by σ
Reward matches with +1
Resulting score: $\#\text{matches} - \mu\,(\#\text{mismatches}) - \sigma\,(\#\text{indels})$
Objective: Find the best scoring alignment

Number of pairwise alignments
Given sequences of length m and n, the number of alignments is:
$$\sum_{k=0}^{\min(m,n)} \binom{m}{k}\binom{n}{k} = \binom{n+m}{n}$$

Number of pairwise alignments
For two sequences of length n (derived using Stirling's approximation):
$$\binom{2n}{n} = \frac{(2n)!}{(n!)^2} \approx \ldots$$
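
The scoring scheme and the counting formula above can be checked with a short sketch. The names `score_alignment` and `count_alignments` are hypothetical, and the default penalties μ = σ = 1 are an assumption made only for illustration, since the slides do not fix their values. The last lines compare $\binom{2n}{n}$ with $4^n / \sqrt{\pi n}$, the standard estimate that Stirling's approximation yields; that closed form is supplied here as a known result and is not taken from the truncated slide.

```python
from math import comb, pi, sqrt

# Gapped rows of the example alignment, copied from the text above.
V_ROW = "-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---"
W_ROW = "TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC"


def score_alignment(row_v, row_w, mu=1.0, sigma=1.0):
    """Score = #matches - mu * #mismatches - sigma * #indels.

    mu and sigma default to 1.0 purely for illustration.
    Returns the score and the (matches, mismatches, indels) counts.
    """
    if len(row_v) != len(row_w):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = indels = 0
    for a, b in zip(row_v, row_w):
        if a == "-" and b == "-":
            raise ValueError("a column cannot pair a gap with a gap")
        if a == "-" or b == "-":
            indels += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    score = matches - mu * mismatches - sigma * indels
    return score, (matches, mismatches, indels)


def count_alignments(m, n):
    """Evaluate the sum over k of C(m, k) * C(n, k) from the text."""
    return sum(comb(m, k) * comb(n, k) for k in range(min(m, n) + 1))


score, counts = score_alignment(V_ROW, W_ROW)
print(counts, score)  # (24, 2, 11) 11.0

# The sum agrees with the closed form C(n + m, n) for the example lengths.
m = len(V_ROW.replace("-", ""))
n = len(W_ROW.replace("-", ""))
print(count_alignments(m, n) == comb(m + n, n))  # True

# Growth for two sequences of equal length, against the Stirling estimate.
length = 100
ratio = comb(2 * length, length) / (4 ** length / sqrt(pi * length))
print(ratio)  # slightly below 1
```

With μ = σ = 1 the example alignment has 24 matches, 2 mismatches and 11 indels, for a score of 11. The number of possible alignments grows roughly as 4^n, which is why exhaustively scoring every alignment is not practical.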