tools_1 - 1 Bioinformatics Tools for Sequence homology and...

This preview shows pages 1–23.


1 Bioinformatics Tools for Sequence homology and alignment

2 Homology
Similarity between characters due to a common ancestry.

3 Sequence homology
Similarity between sequences that results from a common ancestor.

    VLSPAVKWAKVGAHAAGHG
    VLSEAVLWAKVEADVAGHG

Basic assumption: sequence homology implies similar structure/function.

4 Sequence alignment
Alignment: comparing two (pairwise) or more (multiple) sequences by searching for a series of identical or similar characters in the sequences.

5 Homology
Ortholog: a homolog with similar function (via speciation).
Paralog: a homolog which arose from gene duplication.
Common use:
Orthologs: 2 homologs from different species.
Paralogs: 2 homologs within the same species.

6 How close?
Rule of thumb:
Proteins are homologous if at least 25% identical (length > 100).
DNA sequences are homologous if at least 70% identical.

7 Twilight zone
Proteins with < 20% identity may or may not be homologous. (Note that 5% identity will be obtained completely by chance!)

8 Local vs. Global
Global alignment finds the best alignment across the entire two sequences.
Local alignment finds regions of similarity in parts of the sequences.

Global alignment: forces alignment in regions which differ.

    ADLGAVFALCDRYFQ
    ||||     |||| |
    ADLGRTQN-CDRYYQ

Local alignment: will return only regions of good alignment.

    ADLG    CDRYFQ
    ||||    |||| |
    ADLG    CDRYYQ

9 When global and when local?

10 Global alignment
PTK2 (protein tyrosine kinase 2) of human and rhesus monkey.

11 Protein tyrosine kinase domain

12 Protein tyrosine kinase domain
Human PTK2 and leukocyte tyrosine kinase. Both function as tyrosine kinases, in completely different contexts. Ancient duplication.

13 Global alignment of PTK and LTK

14 Local alignment of PTK and LTK

15 Searching databases

16 Searching a database
Using a sequence as the query to find homologous sequences in the database.

17 DNA or protein?
For coding sequences, we can use the DNA sequence or the protein sequence to search for similar sequences. Which is preferable?

18 Protein is better!
Selection (and hence conservation) works on the protein level:

    CTT TCA = Leu-Ser
    TTG AGT = Leu-Ser

19 Query type
Nucleotides: a four-letter alphabet.
Amino acids: a twenty-letter alphabet.
Two random DNA sequences will share on average 25% identity.
Two random protein sequences will share on average 5% identity.

20 Conclusions
Using the amino acid sequence is preferable for homology search.
Why use a nucleotide sequence after all?
No ORF found, e.g. newly sequenced genome.
No similar protein sequences were found.
Specific DNA databases are available (EST).

21 Some terminology
Query sequence: the sequence with which we are searching.
Hit: a sequence found in the database, suspected as homologous.

22 How do we search a database?...
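The identity figures quoted in the slides (the 25% and 70% rules of thumb, and the 25% and 5% chance levels) can be checked with a few lines of Python. This is only a sketch, not part of the original slides: the function names are hypothetical, the chance-level estimate assumes every letter of the alphabet is equally likely, and `CODON_TABLE` is a four-entry subset of the standard genetic code covering just the slide 18 example.

```python
from typing import List, Sequence

# Subset of the standard genetic code, limited to the codons on slide 18.
CODON_TABLE = {"CTT": "Leu", "TTG": "Leu", "TCA": "Ser", "AGT": "Ser"}


def percent_identity(a: Sequence[str], b: Sequence[str]) -> float:
    """Percent of aligned positions that hold the same symbol.

    Both inputs must already be aligned, i.e. have equal, non-zero length.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def expected_random_identity(alphabet_size: int) -> float:
    """Chance identity per position, assuming uniform, independent letters."""
    return 100.0 / alphabet_size


def translate(dna: str) -> List[str]:
    """Translate a DNA string codon by codon using the small table above."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


if __name__ == "__main__":
    # Slide 3: the two aligned protein fragments.
    print(percent_identity("VLSPAVKWAKVGAHAAGHG", "VLSEAVLWAKVEADVAGHG"))
    # Slide 19: chance identity for DNA (4 letters) and protein (20 letters).
    print(expected_random_identity(4), expected_random_identity(20))
    # Slide 18: low DNA identity, yet identical protein.
    print(percent_identity("CTTTCA", "TTGAGT"))
    print(percent_identity(translate("CTTTCA"), translate("TTGAGT")))
```

The last two calls illustrate the point of slide 18: the two DNA strings agree at only one of six positions, while their translations are identical, which is why protein queries tend to be more sensitive for coding sequences.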
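The difference between global and local alignment on slide 8 can be made concrete with a small dynamic-programming sketch. The slides in this preview do not name an algorithm, so the code below swaps in the standard textbook approaches (Needleman-Wunsch style for global, Smith-Waterman style for local), computing scores only. The scoring values (match +1, mismatch -1, linear gap -1) and the function names are assumptions chosen for illustration, not taken from the slides.

```python
def global_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best score for aligning ALL of a against ALL of b (Needleman-Wunsch style)."""
    # prev[j] = best score for aligning the first i-1 letters of a with the first j of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best score over all pairs of substrings of a and b (Smith-Waterman style)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere, which is what
            # allows poorly matching regions to be dropped.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    seq1 = "ADLGAVFALCDRYFQ"  # slide 8, first sequence
    seq2 = "ADLGRTQNCDRYYQ"   # slide 8, second sequence (gap removed)
    print("global:", global_score(seq1, seq2))
    print("local: ", local_score(seq1, seq2))
```

Under this scoring, the global score is pulled down by the stretch in the middle where the sequences differ, while the local score reflects only the best-matching region. Real tools use substitution matrices and affine gap penalties rather than these flat values.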

This note was uploaded on 06/12/2011 for the course CAP 5510 taught by Professor Staff during the Spring '08 term at University of Central Florida.
