Lecture5 - Sequence Alignment: What do we mean? Lecture 5:...

Lecture 5: Sequence Alignments and Scoring Matrices
• What do we mean by sequence alignment?
• Simple sequence alignment method (Dot Plot)
• Dynamic programming (Needleman-Wunsch algorithm)
• Scoring Matrices

Some of the slides were adapted from a website developed by Dr. Przytyzca
Some slides adapted from slides created by Dr. Keith Dunker

Sequence Alignment: What do we mean?
§ Goal: to write one sequence along the other to express any similarity between the sequences
  • Each element of a sequence is placed alongside either a corresponding element in the other sequence or a gap character
§ Example: CATA and GATCA can be aligned as follows:
    -CAT-A
    G-ATCA
§ Problems:
  • How do we efficiently find this alignment?
  • Can we find a better alignment?

Global vs. Local Alignments
§ Global Alignment: aims to align as many characters in each sequence as possible (typically the entire sequence)
§ Local Alignment: focuses on the segments of the sequences with the highest density of matches
  • Generates one or more islands of matches in the aligned sequences

Sequence Alphabets
§ DNA: four nucleotides
    A = Adenine    G = Guanine    C = Cytosine    T = Thymine
§ RNA: four nucleotides
    A = Adenine    G = Guanine    C = Cytosine    U = Uracil
§ Protein: twenty amino acids
    A = Alanine          C = Cysteine    D = Aspartic acid    E = Glutamic acid
    F = Phenylalanine    G = Glycine     H = Histidine        I = Isoleucine
    K = Lysine           L = Leucine     M = Methionine       N = Asparagine
    P = Proline          Q = Glutamine   R = Arginine         S = Serine
    T = Threonine        V = Valine      W = Tryptophan       Y = Tyrosine
§ Which sequence alphabet is more difficult to align? Why?
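To make the CATA / GATCA example concrete, here is a minimal Python sketch that treats an alignment as two equal-length rows and classifies each column as a match, a mismatch, or a gap. The function names `describe_alignment` and `ungap` are hypothetical helpers introduced only for this illustration; they are not part of the lecture.

```python
def describe_alignment(row1, row2, gap="-"):
    """Count match, mismatch, and gap columns in a pairwise alignment."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(row1, row2):
        if a == gap and b == gap:
            raise ValueError("a column cannot pair a gap with a gap")
        if a == gap or b == gap:
            counts["gap"] += 1        # one element sits alongside a gap character
        elif a == b:
            counts["match"] += 1      # identical elements in both sequences
        else:
            counts["mismatch"] += 1   # different elements placed side by side
    return counts


def ungap(row, gap="-"):
    """Recover the original sequence by removing gap characters."""
    return row.replace(gap, "")


row1, row2 = "-CAT-A", "G-ATCA"
# Removing the gaps must give back the two input sequences.
assert ungap(row1) == "CATA" and ungap(row2) == "GATCA"
print(describe_alignment(row1, row2))  # {'match': 3, 'mismatch': 0, 'gap': 3}
```

The alignment shown on the slide therefore has three matching columns and three gap columns, which is one way to start comparing it with alternative alignments of the same two sequences.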
Importance of Sequence Alignments
§ For DNA, RNA, and amino acid sequences, high sequence similarity usually implies significant functional or structural similarity
  • Implied structure from sequence similarity
  • Suggest function for newly discovered genes and promoters
§ Identify and investigate conserved nucleotides and/or amino acids in related sequences

Simple Method: The Dot-Plot
§ Take 2 sequences and write each along one side of a 2D matrix
§ Every place where the sequences match, place a dot
§ To obtain an alignment, find long diagonal runs
§ An example:
    seq #1: ATTGCCCATG
    seq #2: ATGGCCATTG
[Figure: dot-plot matrix with seq #1 along one axis and seq #2 along the other; a * marks every cell where the two nucleotides match]

A Real Dot Plot Comparison
§ Dot plot comparison of yeast HTA1 and HTA2 genes (both code for histone H2A)
§ Window size = 1 (each dot represents a nucleotide match)
Image generated using: http://www.vivo.colostate.edu/molkit/dnadot/
[Figure: dot plot with axes HTA1 and HTA2]
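The dot-plot procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the tool used to generate the images: `dot_plot` and `render` are hypothetical names, and the `window` parameter assumes that a dot is placed only when every position in the window matches exactly, which is one simple reading of "window size".

```python
def dot_plot(seq1, seq2, window=1):
    """Return the set of (i, j) start positions where the length-`window`
    substrings of seq1 (at i) and seq2 (at j) are identical."""
    if window < 1:
        raise ValueError("window must be at least 1")
    dots = set()
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            if seq1[i:i + window] == seq2[j:j + window]:
                dots.add((i, j))
    return dots


def render(seq1, seq2, dots):
    """Draw the matrix as text: seq1 across the top, seq2 down the side."""
    lines = ["  " + " ".join(seq1)]
    for j, base in enumerate(seq2):
        cells = ["*" if (i, j) in dots else "." for i in range(len(seq1))]
        lines.append(base + " " + " ".join(cells))
    return "\n".join(lines)


seq1 = "ATTGCCCATG"
seq2 = "ATGGCCATTG"
dots = dot_plot(seq1, seq2)         # window = 1: every single-nucleotide match
print(render(seq1, seq2, dots))
print(len(dots))                    # 25
print(len(dot_plot(seq1, seq2, window=3)))  # 6
```

With a window of 1 the two example sequences give 25 dots, many of them chance matches; raising the window to 3 leaves only 6, which is why larger windows make the diagonal runs easier to see.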
A Real Dot Plot Comparison
§ Dot plot comparison of yeast HTA1 and HTA2 genes (both code for histone H2A)
§ Window size = 5 (each dot represents a window of 5 nucleotides that match)
Image generated using: http://www.vivo.colostate.edu/molkit/dnadot/
[Figure: dot plot with axes HTA1 and HTA2]

A Real Dot Plot Comparison
§ Human chromosome 21 centromere compared to itself
§ Window size = 9 (each dot represents a window of 9 nucleotides that match)
Image generated using: http://www.vivo.colostate.edu/molkit/dnadot/
[Figure: dot plot with axes CEN21 and CEN21]

Adding a Score to Alignments
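As a minimal sketch of what scoring an alignment can look like, the Python below sums a per-column score over an alignment. The scoring values (match = +1, mismatch = -1, gap = -1) and the function name `score_alignment` are assumptions chosen for illustration only; they are not taken from the lecture, whose own scoring scheme may differ.

```python
def score_alignment(row1, row2, match=1, mismatch=-1, gap=-1):
    """Sum column scores for a pairwise alignment.

    The default scores are illustrative assumptions, not the lecture's values.
    """
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            total += gap         # column containing a gap character
        elif a == b:
            total += match       # identical elements
        else:
            total += mismatch    # different elements
    return total


# The alignment of CATA and GATCA shown earlier in the lecture.
print(score_alignment("-CAT-A", "G-ATCA"))  # 0

# An alternative alignment of the same two sequences.
print(score_alignment("CAT-A", "GATCA"))    # 1
```

Under these assumed scores the second alignment rates higher than the first, which shows how a score turns the question "Can we find a better alignment?" into a numerical comparison.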