Lecture8 - Lecture 8: Patterns, Profiles, and Motifs...

Lecture 8: Patterns, Profiles, and Motifs

Finding patterns in protein and DNA sequences
Calculating profiles of DNA sequences

Some slides adapted from slides from Dr. Keith Dunker
Some slides adapted from slides created by Dr. Zhiping Weng (Boston University)

Definitions and Resources

Motif: A region of a protein or DNA sequence that may be functionally or structurally significant and/or conserved in other sequences. Motifs usually contain biologically important sequences.

Pattern: Describes a motif using a qualitative consensus sequence (e.g., IUPAC or regular expression).

Profile: Describes a motif using quantitative information captured in a position-specific scoring matrix (weight matrix).

PROSITE is a protein sequence pattern and profile database (http://www.expasy.ch/prosite). It contains >1100 entries describing >1600 patterns and profiles.

DNA pattern and profile databases are more fragmented: JASPAR (http://jaspar.genereg.net/) and the S. cerevisiae Promoter Database (SCPD) (http://rulai.cshl.edu/SCPD/).

Importance of Sequence Patterns in Proteins

Conserved patterns in protein sequences usually have important biological functions.

Conserved sequence patterns may be indicative of, e.g., a protein structural domain, an enzyme active site, or a binding site for another protein or metal ion.

[Figure: Cu,Zn Superoxide Dismutase. Image created using Cn3D from PDB ID 1XSO.]

Steps in the Development of a New PROSITE Pattern

(1) Construct a multiple sequence alignment of a protein family.
(2) Use the alignment to identify conserved or biologically significant residues (e.g., residues in a catalytic/active site, binding domain, or structural feature).
(3) Start by creating a core sequence pattern (approximately 4-5 contiguous amino acids in length).
(4) Expand the pattern to improve its sensitivity and specificity for detecting known protein family members.

Sensitivity: Test the trial pattern against known positive sequences.
Specificity: Test the trial pattern against known negative sequences.

Information from http://ca.expasy.org/prosite/prosuser.html

Finding Patterns in Multiple Sequence Alignments

[Figure: Baeyer-Villiger monooxygenases (BVMOs), taken from Fraaije et al. (2002) FEBS Letters 518:43-47.]

Pattern of Cu/Zn Superoxide Dismutase

Example of a PROSITE pattern (PS00087; SOD_CU_ZN_1):

[GA]-[IMFAT]-H-[LIVF]-H-{S}-x-[GP]-[SDG]-x-[STAGDE]

The two histidines (H) are copper ligands...
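To make the PROSITE notation concrete, the sketch below translates the SOD_CU_ZN_1 pattern into a Python regular expression and then scores it against small positive and negative sets, in the spirit of the sensitivity and specificity tests described in the pattern-development steps. It is only an illustration under stated assumptions: the function names `prosite_to_regex` and `hit_rate` are hypothetical helpers, the peptide sequences are made up for the example and are not real superoxide dismutase sequences, and only the common syntax elements are handled (`x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repeats, `<` and `>` for terminal anchors).

```python
import re

# One pattern element: optional '<', a core (x, a letter, [..] or {..}),
# an optional repeat count such as (2) or (2,4), and an optional '>'.
ELEMENT = re.compile(
    r"^(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)$"
)

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a regular expression string."""
    pieces = []
    for element in pattern.replace(" ", "").split("-"):
        match = ELEMENT.match(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = match.groups()
        if core == "x":
            regex = "."                      # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"  # any residue except those listed
        else:
            regex = core                     # single residue or [allowed set]
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        pieces.append(("^" if start else "") + regex + ("$" if end else ""))
    return "".join(pieces)

def hit_rate(regex, sequences):
    """Fraction of sequences that contain at least one match."""
    hits = sum(1 for seq in sequences if re.search(regex, seq))
    return hits / len(sequences)

# Pattern text as given for PS00087 (SOD_CU_ZN_1).
sod_regex = prosite_to_regex(
    "[GA]-[IMFAT]-H-[LIVF]-H-{S}-x-[GP]-[SDG]-x-[STAGDE]"
)

# Made-up toy sequences (assumptions for illustration only).
known_positives = ["MKGFHVHEKGDLGNV", "TTAIHLHAQPGTEKK", "GFHVHSKGDLG"]
known_negatives = ["MKTAYIAKQRQISFVK", "GSHMLEDPVDAF"]

# Sensitivity: fraction of known positives that the pattern detects.
sensitivity = hit_rate(sod_regex, known_positives)
# Specificity: fraction of known negatives that the pattern rejects.
specificity = 1.0 - hit_rate(sod_regex, known_negatives)

print(sod_regex)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In this toy set the third "positive" has a serine at the position where the pattern forbids it, so it is missed and lowers the sensitivity. That is the kind of feedback step (4) uses when deciding whether to relax or tighten a trial pattern.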