slides-alignment-1

String Alignment I
Computational Biology, Department Informatik, ETH Zentrum

Organization

What is an alignment?
How do we score an alignment?
    Model (Markovian)
    Scoring function (Mutation Matrices)
What algorithm do we use to align?
    Dynamic Programming

Review of Last Week

Proteins: strings over an alphabet of 20 amino acids
    ACDEFGHIKLMNPQRSTVWY
DNA: strings over an alphabet of 4 nucleotide bases
    ACGT (U replaces T in RNA)
DNA codes for protein
    3 DNA nucleotides together make a "codon", which codes for 1 amino acid
    The genetic code translates 64 codons into amino acids (and stop signals)
    Coding DNA has a "reading frame": the correct place to start reading 3 bases

GCGTGTAAATGA
<A><C><K><stop>
 <R><V><N>
  <V><s><M>

Genetic Code

GGG G Gly    AGG R Arg    CGG R Arg    UGG W Trp
GGA G Gly    AGA R Arg    CGA R Arg    UGA Stop
GGC G Gly    AGC S Ser    CGC R Arg    UGC C Cys
GGU G Gly    AGU S Ser    CGU R Arg    UGU C Cys
GAG E Glu    AAG K Lys    CAG Q Gln    UAG Stop
GAA E Glu    AAA K Lys    CAA Q Gln    UAA Stop
GAC D Asp    AAC N Asn    CAC H His    UAC Y Tyr
GAU D Asp    AAU N Asn    CAU H His    UAU Y Tyr
GCG A Ala    ACG T Thr    CCG P Pro    UCG S Ser
GCA A Ala    ACA T Thr    CCA P Pro    UCA S Ser
GCC A Ala    ACC T Thr    CCC P Pro    UCC S Ser
GCU A Ala    ACU T Thr    CCU P Pro    UCU S Ser
GUG V Val    AUG M Met    CUG L Leu    UUG L Leu
GUA V Val    AUA I Ile    CUA L Leu    UUA L Leu
GUC V Val    AUC I Ile    CUC L Leu    UUC F Phe
GUU V Val    AUU I Ile    CUU L Leu    UUU F Phe

What is an alignment?

[Rice, Mosquito] triosephosphate isomerase
lengths=55,53 simil=117.7, PAM_dist=117.766, offsets=-936190011,-936189 5
identity=36.4%, similarity=18.2%

NGTTDQVDKIVKILNEGQIASTDVVEVVVSPPYVFLPVVKSQLRPEIQVAAQNCW
||....!..!.|!|..|.!.:.  .||||. | .!|.:.!|||...! ||||||!
NGDKASIADLCKVLTTGPLNAD__TEVVVGCPAPYLTLARSQLPDSVCVAAQNCY

An alignment:
    represents an evolutionary relationship
    aligns amino acids that diverged from the same residue in the ancestor
    shows accepted substitutions since divergence
Proteins evolve under functional constraints: destroy function => death
The "correct" alignment represents the actual events: substitutions, indels
    impossible to verify -> take the alignment with the highest probability of being correct under our model

Types of Alignments

Protein-Protein (above)
DNA/protein: align all 6 reading frames (3 forward, 3 backward) against the protein
    must look for gaps and frameshifts within codons
    complicated; done by Lukas Knecht
DNA/DNA alignments
    coding and reading frame known: the influence of the 3 bases is not the same, so align by codons
    coding and reading frame unknown: can be done but very complicated; must look at all reading frames for each sequence, allowing gaps within a codon...
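
The reading-frame example from the review slide, and the 6 frames (3 forward, 3 backward) mentioned for DNA/protein alignment, can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the names `CODON_TABLE`, `translate`, `reverse_complement` and `six_frames` are made up for this sketch, stop codons are printed as `*`, and the standard genetic code is assumed (the same assignments as in the Genetic Code table, written with T in place of U).

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, frame=0):
    """Translate DNA starting at offset `frame` (0, 1 or 2); '*' marks a stop."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    # Only complete codons are translated; trailing 1-2 bases are ignored.
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def reverse_complement(dna):
    """Return the opposite strand, read in its own 5'->3' direction."""
    return dna.upper().replace("U", "T").translate(COMPLEMENT)[::-1]


def six_frames(dna):
    """Translate the 3 forward and the 3 backward reading frames."""
    backward = reverse_complement(dna)
    return [translate(dna, f) for f in range(3)] + \
           [translate(backward, f) for f in range(3)]


if __name__ == "__main__":
    for protein in six_frames("GCGTGTAAATGA"):
        print(protein)
```

For `GCGTGTAAATGA` the three forward frames come out as `ACK*`, `RVN` and `V*M`: the first frame reads GCG TGT AAA TGA, i.e. Ala, Cys, Lys, stop. Aligning DNA against a protein then amounts to aligning each of the six translations, which is the simple case; gaps and frameshifts inside codons are not handled by this sketch.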
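
The identity figure reported for the Rice/Mosquito alignment can be checked directly from the two aligned strings. The sketch below counts columns where both sequences carry the same residue and divides by the number of alignment columns; the function names and the choice of denominator are assumptions of this sketch (other tools divide by the shorter sequence length instead), and `_` is taken to be the gap character as in the slide.

```python
RICE = "NGTTDQVDKIVKILNEGQIASTDVVEVVVSPPYVFLPVVKSQLRPEIQVAAQNCW"
MOSQUITO = "NGDKASIADLCKVLTTGPLNAD__TEVVVGCPAPYLTLARSQLPDSVCVAAQNCY"


def count_identities(aligned_a, aligned_b, gap="_"):
    """Count alignment columns in which both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(a == b and a != gap for a, b in zip(aligned_a, aligned_b))


def percent_identity(aligned_a, aligned_b, gap="_"):
    """Identical columns as a percentage of all alignment columns."""
    return 100.0 * count_identities(aligned_a, aligned_b, gap) / len(aligned_a)


if __name__ == "__main__":
    print(count_identities(RICE, MOSQUITO))             # identical columns
    print(round(percent_identity(RICE, MOSQUITO), 1))   # percent identity
```

With this denominator the sketch gives 20 identical columns out of 55, or 36.4%, which agrees with the `identity=36.4%` line. The `similarity` and `PAM_dist` values depend on a scoring matrix and are not reproduced here.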