# Lecture 5 - Sequence Alignment

Lecture 5: Sequence Alignments and Scoring Matrices

- What do we mean by sequence alignment?
- Simple sequence alignment method (Dot Plot)
- Dynamic programming (Needleman-Wunsch algorithm)
- Scoring Matrices

Some of the slides were adapted from a website developed by Dr. Przytyzca. Some slides were adapted from slides created by Dr. Keith Dunker.

## Sequence Alignment: What do we mean?

- Goal: to write one sequence along the other so as to express any similarity between the sequences
  - Each element of a sequence is placed alongside either a corresponding element in the other sequence or a gap character
- Example: CATA and GATCA can be aligned as follows:

      -CAT-A
      G-ATCA

- Problems:
  - How do we efficiently find this alignment?
  - Can we find a better alignment?

## Global vs. Local Alignments

- Global alignment: aims to align as many characters in each sequence as possible (typically the entire sequence)
- Local alignment: focuses on the segments of sequence with the highest density of matches
  - Generates one or more islands of matches in the aligned sequences

## Sequence Alphabets

- DNA, four nucleotides: A = Adenine, G = Guanine, C = Cytosine, T = Thymine
- RNA, four nucleotides: A = Adenine, G = Guanine, C = Cytosine, U = Uracil
- Protein, twenty amino acids:

| Code | Amino acid | Code | Amino acid |
|------|---------------|------|------------|
| A | Alanine | M | Methionine |
| C | Cysteine | N | Asparagine |
| D | Aspartic acid | P | Proline |
| E | Glutamic acid | Q | Glutamine |
| F | Phenylalanine | R | Arginine |
| G | Glycine | S | Serine |
| H | Histidine | T | Threonine |
| I | Isoleucine | V | Valine |
| K | Lysine | W | Tryptophan |
| L | Leucine | Y | Tyrosine |

- Which sequence alphabet is more difficult to align? Why?
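Returning to the CATA / GATCA example, the definition of an alignment can be checked mechanically. The sketch below is illustrative only: the function names `check_alignment` and `count_columns` are hypothetical, and the rule that no column may pair a gap with a gap is a common convention assumed here rather than something stated in the slides.

```python
GAP = "-"

def check_alignment(row_a, row_b, seq_a, seq_b):
    """Return True if the gapped rows form a valid alignment of seq_a and seq_b."""
    # Both rows must have the same number of columns.
    if len(row_a) != len(row_b):
        return False
    # Assumed convention: a column may not pair a gap with a gap.
    if any(x == GAP and y == GAP for x, y in zip(row_a, row_b)):
        return False
    # Removing the gaps must give back the original sequences.
    return row_a.replace(GAP, "") == seq_a and row_b.replace(GAP, "") == seq_b

def count_columns(row_a, row_b):
    """Count (matches, mismatches, gap columns) in an alignment."""
    matches = mismatches = gaps = 0
    for x, y in zip(row_a, row_b):
        if GAP in (x, y):
            gaps += 1
        elif x == y:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps

if __name__ == "__main__":
    # The alignment from the slide, and one shorter alternative.
    for a, b in [("-CAT-A", "G-ATCA"), ("CAT-A", "GATCA")]:
        print(a, b, check_alignment(a, b, "CATA", "GATCA"), count_columns(a, b))
```

Both alignments contain three matching columns; the second trades two gap columns for one mismatch. Which of the two counts as "better" depends on how matches, mismatches, and gaps are scored.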

## Importance of Sequence Alignments

- For DNA, RNA, and amino acid sequences, high sequence similarity usually implies significant functional or structural similarity
  - Implied structure from sequence similarity
  - Suggest function for newly discovered genes and promoters
- Identify and investigate conserved nucleotides and/or amino acids in related sequences

## Simple Method: The Dot-Plot

- Take 2 sequences and write each along one side of a 2D matrix
- Every place where the sequences match, place a dot
- To obtain an alignment, find long diagonal runs
- An example:
  - seq #1: ATTGCCCATG
  - seq #2: ATGGCCATTG

Dot plot for the example, drawn here with seq #1 across the top and seq #2 down the left side (an asterisk marks a match):

|   | A | T | T | G | C | C | C | A | T | G |
|---|---|---|---|---|---|---|---|---|---|---|
| A | * |   |   |   |   |   |   | * |   |   |
| T |   | * | * |   |   |   |   |   | * |   |
| G |   |   |   | * |   |   |   |   |   | * |
| G |   |   |   | * |   |   |   |   |   | * |
| C |   |   |   |   | * | * | * |   |   |   |
| C |   |   |   |   | * | * | * |   |   |   |
| A | * |   |   |   |   |   |   | * |   |   |
| T |   | * | * |   |   |   |   |   | * |   |
| T |   | * | * |   |   |   |   |   | * |   |
| G |   |   |   | * |   |   |   |   |   | * |

## A Real Dot Plot Comparison

- Dot plot comparison of yeast HTA1 and HTA2 genes (both code for histone H2A)
- Window size = 1 (each dot represents a nucleotide match)
- Axes: HTA1, HTA2

Image generated using: http://www.vivo.colostate.edu/molkit/dnadot/
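The dot-plot procedure, including the window size used in the real comparisons, can be sketched in a few lines of Python. This is a minimal illustration, not the method used by the tool that generated the slide images: the function names are hypothetical, and a window is assumed to "match" only when every position in it is identical (real tools may instead accept a threshold number of matches per window).

```python
def dot_plot(seq1, seq2, window=1):
    """Return the set of (i, j) where seq1[i:i+window] equals seq2[j:j+window]."""
    dots = set()
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            # Assumption: a window counts as a match only if it is identical.
            if seq1[i:i + window] == seq2[j:j + window]:
                dots.add((i, j))
    return dots

def longest_diagonal_run(dots):
    """Length of the longest run of dots along any top-left to bottom-right diagonal."""
    best = 0
    for i, j in dots:
        if (i - 1, j - 1) in dots:
            continue  # this dot continues a run that starts earlier
        length = 1
        while (i + length, j + length) in dots:
            length += 1
        best = max(best, length)
    return best

def render(seq1, seq2, dots):
    """Draw the plot as text: seq1 across the top, seq2 down the left side."""
    lines = ["  " + " ".join(seq1)]
    for j, base in enumerate(seq2):
        cells = ("*" if (i, j) in dots else "." for i in range(len(seq1)))
        lines.append(base + " " + " ".join(cells))
    return "\n".join(lines)

if __name__ == "__main__":
    s1, s2 = "ATTGCCCATG", "ATGGCCATTG"
    print(render(s1, s2, dot_plot(s1, s2)))             # window size 1
    print(longest_diagonal_run(dot_plot(s1, s2)))
    print(render(s1, s2, dot_plot(s1, s2, window=3)))   # fewer, more specific dots
```

With a window larger than 1, a dot is drawn at the start position of each matching window, so isolated single-base matches disappear and only the longer shared stretches remain. The nested loops compare every pair of positions, which is fine for short sequences but slow for whole genes or chromosomes.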
## A Real Dot Plot Comparison

- Dot plot comparison of yeast HTA1 and HTA2 genes (both code for histone H2A)
- Window size = 5 (each dot represents a window of 5 nucleotides that match)
- Axes: HTA1, HTA2

Image generated using: http://www.vivo.colostate.edu/molkit/dnadot/

## A Real Dot Plot Comparison

- Human chromosome 21 centromere compared to itself
- Window size = 9 (each dot represents a window of 9 nucleotides that match)
- Axes: CEN21, CEN21

Image generated using: http://www.vivo.colostate.edu/molkit/dnadot/

## Adding a Score to Alignments

## This note was uploaded on 01/20/2012 for the course MBIOS 478 taught by Professor Staff during the Fall '11 term at Washington State University.
