Lect6 Protein sequence analysis

An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)


Fa 06 CSE182

CSE182-L6: Protein sequence analysis

Possible domain queries
Case 1: You have a collection of sequences that belong to a family (that is, they contain a functional domain). Given an orphan sequence, does it belong to the family? The solution depends on how the domain is represented (patterns, alignments, HMMs, or profiles).
Case 2: You have an orphan sequence from an uncharacterized family. Can you identify other members of the family and create a representation of them? (This is the harder problem.)

Example: Innexins
The Macagno lab is studying gap junction proteins called innexins (invertebrate analogs of connexins) in Hirudo.
Innexins have been found in C. elegans and Drosophila.
In C. elegans, 25 members of this family have been found and partially categorized.

Innexins in Hirudo
Knocking out certain innexins causes serious defects in cells in the ganglia.
The EST database (partial gene sequences) contains a number of putative innexins, discovered via BLAST.
Project:
Q: Can you confirm that these are innexins? Can you find more members? (this lecture)
Q: Can you characterize them with respect to known innexins in C. elegans and Drosophila?
Q: Use your method for other families of interest: netrins and their receptors.

Not all features (residues) are important
Skin patterns
Facial features

Protein sequence motifs
Premise: The sequence of a protein gives clues about its structure and function.
Not all residues are equally important in determining function.
Suppose we knew the key residues of a family. If our query matches in those residues, it is a member. Otherwise, it is not.
The key residues can be identified from structural information, or from conserved residues in an alignment of the family.

Representation of domains/families
We will consider a number of representations that describe the key residues characteristic of a family:
Patterns (regular expressions)
Alignments
Profiles
HMMs
Start with the following:
A collection of sequences with the same function.
Regions/residues known to be significant for maintaining structure and function.
Develop a pattern of conserved residues around the residues of interest.
Iterate for appropriate sensitivity and specificity.

From alignment to patterns
*
ALRDFATHDDF
SMTAEATHDSI
ECDQAATHEAS
ATH-[DE]
Search a database with the resulting pattern.
Refine the pattern to eliminate false positives.
Iterate.
...
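The alignment-to-pattern step can be sketched in Python. This is only an illustration under stated assumptions, not the method used in the course: the column window (positions 5 to 8, 0-based) is picked by hand to cover the conserved region in the three example sequences, the PROSITE-style pattern ATH-[DE] is written as the regular expression ATH[DE], and the function names and the two "orphan" sequences are hypothetical.

```python
import re

# The three aligned family sequences from the slide.
FAMILY = ["ALRDFATHDDF", "SMTAEATHDSI", "ECDQAATHEAS"]


def pattern_from_columns(aligned, start, end):
    """Build a regex from alignment columns start..end (0-based, end exclusive).

    A column holding a single residue becomes a literal; a column holding
    several residues becomes a character class such as [DE].
    """
    parts = []
    for col in range(start, end):
        residues = sorted({seq[col] for seq in aligned})
        if len(residues) == 1:
            parts.append(residues[0])
        else:
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)


def search(pattern, database):
    """Return (name, position) for every match of pattern in the database."""
    regex = re.compile(pattern)
    return [
        (name, match.start())
        for name, seq in database.items()
        for match in regex.finditer(seq)
    ]


# Hand-picked window (an assumption): columns 5..8 cover the conserved A, T, H, [DE].
pattern = pattern_from_columns(FAMILY, 5, 9)

# Hypothetical database of orphan sequences, invented for illustration.
database = {"orphan1": "MKKATHEWL", "orphan2": "MKKATHQWL"}
hits = search(pattern, database)
print(pattern, hits)
```

Running the search against a larger database and inspecting the hits is where the "refine and iterate" step comes in: a pattern this short would be expected to match many unrelated sequences, so in practice the window would be widened or the character classes tightened until false positives drop to an acceptable level.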

This note was uploaded on 02/14/2008 for the course CSE 182 taught by Professor Bafna during the Fall '06 term at UCSD.
