Computational Biology, Part 13
Retrieving and Displaying Macromolecular Structures
Robert F. Murphy
Copyright © 1996, 1999, 2001-2006. All rights reserved.

Retrieving 3D structures
- Protein Data Bank (PDB)
    - home page = http://www.rcsb.org/pdb/
- NCBI
    - via Structure Database
- BLAST
    - via links following sequence similarity searches

Displaying Structures
- Most web page displays relating to sequences are two-dimensional and easily interpreted by visual inspection. Appreciating molecular structures requires viewing them from various directions and modifying the display to emphasize different portions of the molecule.
- To display 3D structures locally, we can use programs such as Cn3D or RasMol, public domain programs available for a wide range of computers, including MacOS, Windows and Unix computers.

PDB files
- In order to optimally display, rotate and color the 3D structure, we need to download a copy of the coordinates for each atom in the molecule to our local computer.
- The most common format for storage and exchange of atomic coordinates for biological molecules is the PDB file format.

PDB files (continued)
- The PDB file format is a text (ASCII) format, with an extensive header that can be read and interpreted either by programs or by people.
- We can request either the header only or the entire file.

Example PDB file

HEADER    SYNTHETIC PROTEIN MODEL                 02-JUL-90   1AL1      1AL1   2
COMPND    ALPHA - 1 (AMPHIPHILIC ALPHA HELIX)                           1AL1   3
SOURCE    SYNTHETIC                                                     1AL1   4
AUTHOR    C.P.HILL,D.H.ANDERSON,L.WESSON,W.F.DE*GRADO,D.EISENBERG       1AL1   5
REVDAT   2   15-JAN-95 1AL1A   1       HET                              1AL1A  1
REVDAT   1   15-OCT-91 1AL1    0                                        1AL1   6
JRNL        AUTH   C.P.HILL,D.H.ANDERSON,L.WESSON,W.F.DE*GRADO,         1AL1   7
JRNL        AUTH 2 D.EISENBERG                                          1AL1   8
JRNL        TITL   CRYSTAL STRUCTURE OF ALPHA=1=: IMPLICATIONS FOR      1AL1   9
JRNL        TITL 2 PROTEIN DESIGN                                       1AL1  10
JRNL        REF    SCIENCE                       V. 249   543 1990      1AL1  11
JRNL        REFN   ASTM SCIEAS  US ISSN 0036-8075                  038  1AL1  12
REMARK   1                                                              1AL1  13
REMARK   1 REFERENCE 1                                                  1AL1  14
REMARK   1  AUTH   D.EISENBERG,W.WILCOX,S.M.ESHITA,P.M.PRYCIAK,S.P.HO   1AL1  15
REMARK   1  TITL   THE DESIGN, SYNTHESIS, AND CRYSTALLIZATION OF AN     1AL1  16
REMARK   1  TITL 2 ALPHA-*HELICAL PEPTIDE                               1AL1  17
REMARK   1  REF    PROTEINS.STRUCT.,FUNCT.,      V.   1    16 1986      1AL1  18
REMARK   1  REF  2 GENET.                                               1AL1  19
REMARK   1  REFN   ASTM PSFGEY  US ISSN 0887-3585                  867  1AL1  20
REMARK   2                                                              1AL1  21
REMARK   2 RESOLUTION. 2.7 ANGSTROMS.                                   1AL1  22
REMARK   3                                                              1AL1  23
REMARK   3 REFINEMENT. BY THE RESTRAINED LEAST SQUARES PROCEDURE OF J.  1AL1  24
REMARK   3  KONNERT AND W. HENDRICKSON (PROGRAM *PROLSQ*). THE R        1AL1  25
REMARK   3  VALUE IS 0.255 FOR ALL DATA. THE R VALUE IS 0.211 FOR ALL   1AL1  26
REMARK   3  REFLECTIONS IN THE RESOLUTION RANGE 10.0 TO 2.7 ANGSTROMS   1AL1  27
REMARK   3  WITH FOBS .GT. 2*SIGMA(FOBS). THE RMS DEVIATION FROM        1AL1  28
REMARK   3  IDEALITY OF THE BOND LENGTHS IS 0.013 ANGSTROMS. THE RMS    1AL1  29
REMARK   3  DEVIATION FROM IDEALITY OF THE BOND ANGLE DISTANCES IS      1AL1  30

Example PDB file (continued)

SEQRES   1     13  ACE GLU LEU LEU LYS LYS LEU LEU GLU GLU LEU LYS GLY  1AL1  39
HET    SO4     13       5     SULFATE ION                               1AL1A  5
FORMUL   2  SO4    O4 S1                                                1AL1  41
HELIX    1 HL1 ACE      0  LEU     10  1                                1AL1  42
CRYST1   62.350   62.350   62.350  90.00  90.00  90.00 I 41 3 2     48  1AL1  43
ORIGX1      1.000000  0.000000  0.000000        0.00000                 1AL1  44
ORIGX2      0.000000  1.000000  0.000000        0.00000                 1AL1  45
ORIGX3      0.000000  0.000000  1.000000        0.00000                 1AL1  46
SCALE1      0.016038  0.000000  0.000000        0.00000                 1AL1  47
SCALE2      0.000000  0.016038  0.000000        0.00000                 1AL1  48
SCALE3      0.000000  0.000000  0.016038        0.00000                 1AL1  49
ATOM      1  C   ACE     0      31.227  38.585  11.521  1.00 25.00      1AL1  50
ATOM      2  O   ACE     0      30.433  37.878  10.859  1.00 25.00      1AL1  51
ATOM      3  CH3 ACE     0      30.894  39.978  11.951  1.00 25.00      1AL1  52
ATOM      4  N   GLU     1      32.153  37.943  12.252  1.00 25.00      1AL1  53
ATOM      5  CA  GLU     1      32.594  36.639  11.811  1.00 25.00      1AL1  54
ATOM      6  C   GLU     1      32.002  35.428  12.514  1.00 25.00      1AL1  55
ATOM      7  O   GLU     1      32.521  34.279  12.454  1.00 25.00      1AL1  56
ATOM      8  CB  GLU     1      34.093  36.609  11.812  1.00 25.00      1AL1  57
…
ATOM    102  OXT GLY    12      20.888  27.022   1.650  1.00 25.00      1AL1 144
TER     103      GLY    12                                              1AL1 145
HETATM  104  S   SO4    13      31.477  38.950  15.821  0.50 25.00      1AL1 146
HETATM  105  O1  SO4    13      31.243  38.502  17.238  0.50 25.00      1AL1 147
HETATM  106  O2  SO4    13      30.616  40.133  15.527  0.50 25.00      1AL1 148
HETATM  107  O3  SO4    13      31.158  37.816  14.905  0.50 25.00      1AL1 149
HETATM  108  O4  SO4    13      32.916  39.343  15.640  0.50 25.00      1AL1 150
CONECT  104  105  106  107  108                                         1AL1 151
CONECT  105  104                                                        1AL1 152
CONECT  106  104                                                        1AL1 153
CONECT  107  104                                                        1AL1 154
CONECT  108  104                                                        1AL1 155
MASTER       29    0    1    1    0    0    0    6  100    1    5    1  1AL1A  6
END                                                                     1AL1 157

Cn3D format
- Cn3D uses a special format that combines the atomic coordinates with sequence information.
- It can also show more than one structure superimposed.
- It is a binary format, so it is difficult to view directly (e.g., via a text editor).

Example structure retrieval session
- (Use the Entrez Structure database to retrieve and view the structure of 1HOC, MHC class I, using Cn3D)
- (Cross to PDB link)
- (Download PDB file and view it with RasMol)

Useful RasMol commands
- show sequence: lists all amino acids in each chain
- select *a: selects all residues in chain A
- colour red: displays the selected residues in red

3HHB - all alpha
- Display: ribbons
- Color: group

1CD8 - all beta
- Display: cartoons

1KFJ - alpha/beta
- Display: cartoons
- Select *a
- Colour violet
- Select *b
- Colour yellow

1AL1 - Amphiphilic Alpha Helix
- select all
- colour white
- ribbons
- select charged and not backbone
- wireframe
- colour red
- select hydrophobic and not backbone
- colour blue

1AL1 - Amphiphilic Alpha Helix (continued)
- select all
- spacefill

Structural homology
- For a new protein whose 3D structure is not known, it is useful to be able to find proteins of known 3D structure that are expected to have a structure similar to the unknown.
- For a protein whose 3D structure is known, it is also useful to be able to find other proteins with similar structures.

Finding proteins with known structures based on sequence homology
- If you want to find known 3D structures of proteins that are similar in primary amino acid sequence to a particular sequence, you can use the BLAST web page and choose the PDB database.
- This is not the PDB database of structures, but rather a database of amino acid sequences for those proteins in the structure database.
- Links are available to retrieve PDB files.

Finding proteins with similar structures to a known protein
- For literature and sequence databases, Entrez allows neighbors to be found for a selected entry based on "homology" in terms (MEDLINE database) or in sequence (protein and nucleic acid sequence databases).
- Entrez also allows neighbors to be chosen for entries in the structure database.

Finding proteins with similar structures to a known protein (continued)
- Proteins with similar structures are termed "VAST Neighbors" or "related structures" by Entrez (VAST refers to the method used to evaluate similarity of structure).
- VAST or structure neighbors may or may not have sequence homology to each other.

Finding proteins with structures similar to tubulin
(Four slides of figures; images not preserved.)
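Returning to the PDB text format: because ATOM and HETATM records keep each field in fixed columns, a program can read the coordinates by slicing each line at known positions. The following is a minimal sketch in Python, not part of the original lecture. It assumes the column positions given in the published PDB format description (serial in columns 7-11, atom name in 13-16, residue name in 18-20, residue number in 23-26, x/y/z in 31-54, occupancy in 55-60, temperature factor in 61-66). The names Atom, parse_atom_line, parse_atoms and distance are hypothetical helpers chosen for illustration, and the sample lines are taken from the 1AL1 example.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Atom:
    record: str      # "ATOM" or "HETATM"
    serial: int      # atom serial number
    name: str        # atom name, e.g. "CA"
    res_name: str    # residue name, e.g. "GLU"
    res_seq: int     # residue sequence number
    x: float         # coordinates in angstroms
    y: float
    z: float
    occupancy: float
    b_factor: float  # temperature factor


def parse_atom_line(line: str) -> Atom:
    """Slice one ATOM/HETATM record at its fixed column positions.

    Python slices are 0-based and end-exclusive, so columns 7-11
    become line[6:11], and so on (assumed layout, see lead-in).
    """
    return Atom(
        record=line[0:6].strip(),
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Return every ATOM and HETATM record; other record types are skipped."""
    return [
        parse_atom_line(line)
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM  ", "HETATM"))
    ]


def distance(a: Atom, b: Atom) -> float:
    """Euclidean distance between two atoms, in angstroms."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


# A few records copied from the 1AL1 example (TER is ignored by the parser).
SAMPLE = """\
ATOM      4  N   GLU     1      32.153  37.943  12.252  1.00 25.00      1AL1  53
ATOM      5  CA  GLU     1      32.594  36.639  11.811  1.00 25.00      1AL1  54
TER     103      GLY    12                                              1AL1 145
HETATM  104  S   SO4    13      31.477  38.950  15.821  0.50 25.00      1AL1 146
"""

atoms = parse_atoms(SAMPLE)
for atom in atoms:
    print(atom.record, atom.serial, atom.name, atom.res_name, atom.x, atom.y, atom.z)

# N to CA distance within GLU 1; a covalent bond, so roughly 1.4 to 1.5 angstroms.
n_ca = distance(atoms[0], atoms[1])
print(f"N-CA distance: {n_ca:.2f} angstroms")
```

Splitting on whitespace would be shorter, but it is not reliable for this format: fields can be blank (the chain identifier here) or run together, which is why the sketch slices by column instead. Viewers such as RasMol do this parsing internally, so the code is only needed when you want to compute with the coordinates yourself.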