LecturesPart24



Computational Biology, Part 24
Automated Interpretation of Subcellular Patterns in Microscope Images I
Robert F. Murphy
Copyright © 1996, 1999, 2000-2006. All rights reserved.

Introduction to Cell and Molecular Biology of Protein Location

Localization motifs: contiguous in sequence or structure?

Open questions
- How many distinct locations can proteins be found in? What are they?
- How many distinct motifs direct proteins to those locations? What are they?

The Omics Revolution: A new paradigm for biology
- The paradigm for biological research for over fifty years was the intensive study by an individual investigator (and his/her students) of a single enzyme, gene, or process, often in a single model system
  - Ion pumping mechanism of Na+/K+-ATPase
  - Transcriptional regulation of bicoid
  - Endocytosis in intestinal cells
- The success of genome sequencing projects suggested a new “omics” paradigm: analysis of a single property or phenomenon across an entire genome, transcriptome, proteome, etc.
  - Identification of all genes in a given genome
  - Identification of all expressed proteins in a given cell type
- Given the large number of proteins, genes, etc., “omics” projects usually require high-throughput data collection and automated data analysis
- Key to progress is
  - identification of a new aspect that needs to be analyzed “ome-wide” and
  - development of assays combined with analysis approaches

Proteomics
- The set of proteins expressed in a given cell type or tissue is called its proteome
- Proteomics projects
  - sequence
  - structure
  - activity
  - partners
  - location

Location information in protein databases: Traditional approach
- Conduct experiments of various types
  - Cell fractionation
  - Electron microscopy
  - Fluorescence microscopy
- Describe the results in unstructured text (first in journal articles and then in summaries in databases)
  - “Protein X is located primarily in protrusions from the early endosomal membrane but is also found in the plasma membrane”

Location information in protein databases: Ontology approach
- Systematic analysis and comparison of these descriptions were made difficult by both the unstructured nature of the text and the variation in terminology used from one laboratory to another
- To address this problem, a restricted vocabulary for cellular components was created by the Gene Ontology consortium

Restricted Vocabulary Approaches
[figure slides; images not preserved]

Use of GO terms
- Databases such as SwissProt use manual curation to assign GO terms to proteins based on reading of relevant literature
- A major problem is consistency of application of terms

Example comparison of GO terms for two proteins

ID   GIAN_HUMAN     STANDARD;      PRT;  3259 AA.
AC   Q14789; Q14398;
GN   GOLGB1.
DR   GO; GO:0000139; C:Golgi membrane; TAS.
DR   GO; GO:0005795; C:Golgi stack; TAS.
DR   GO; GO:0016021; C:integral to membrane; TAS.

ID   O00461         PRELIMINARY;   PRT;   696 AA.
AC   O00461;
GN   GPP130.
DR   GO; GO:0005810; C:endocytotic transport vesicle; TAS.
DR   GO; GO:0005801; C:Golgi cis-face; TAS.
DR   GO; GO:0005796; C:Golgi lumen; TAS.
DR   GO; GO:0016021; C:integral to membrane; TAS.

Words are not enough
- We learned that Giantin and GPP130 are both Golgi proteins, but do we know:
  - What part (i.e., cis, medial, trans) of the Golgi complex they each are found in?
  - If they have the same subcellular distribution?
  - If they also are found in other compartments?

Conclusion
- Current knowledge of subcellular locations of proteins is not sufficiently detailed or systematic
- Systematic descriptions of subcellular locations should be created using a data-driven approach rather than a knowledge-capture approach

Determining protein location
- The primary method used to determine the subcellular location of a protein is to “tag” it with a fluorescent probe and then image its distribution within cells using fluorescence microscopy

Tagging proteins for fluorescence microscopy
- Immunofluorescence
  - “primary” antibody against the target
  - “secondary” antibody against the “primary”, conjugated with a fluorescent probe
  - Fixed cells only
- GFP-tagging
  - Merge DNA coding for a naturally fluorescent protein with the coding sequence of a protein of interest
  - Live-cell imaging possible

Tagging proteins for fluorescence microscopy
- GFP-tagging
  - Can create a fusion between GFP and a cDNA, in which case all regulatory sequences that control expression of the corresponding protein are lost
  - Can create a fusion between GFP and the genomic sequence of a gene, in which case the regulatory sequences are preserved

Example: CD-tagging
Principles of CD-Tagging (Jarvik & Berget) (CD = Central Dogma)
[Figure: Genomic DNA (Exon 1, Intron 1, Exon 2) + CD cassette (Tag) -> Tagged DNA -> Tagged mRNA -> (Epitope) Tagged Protein]

Automated Interpretation
- Traditional analysis of fluorescence microscope images has occurred by visual inspection
- Our goal has been to automate the interpretation, to yield better
  - Objectivity
  - Sensitivity
  - Reproducibility

Initial Goal
- Assign proteins to major subcellular structures using fluorescence microscopy
[Figure: example image, captioned “This is a microtubule pattern”]

The Challenge
- Classification by direct (pixel-by-pixel) comparison of individual images to known patterns is not useful, since
  - different cells have different shapes, sizes, orientations
  - organelles within cells are not found in fixed locations
- Therefore, use a feature-based approach

Successful Classification and Clustering
- The Murphy group has demonstrated classification of ten subcellular patterns in 2D and 3D images of HeLa cells, with single-cell accuracies of 92% and 98%, respectively
- The group has also clustered 90 randomly tagged proteins into 17 statistically distinct patterns in 3T3 cells

Microscope Datasets for Subcellular Location
- We have collected four datasets of fluorescence microscope images depicting the subcellular location patterns of a number of proteins in three different cell lines
- Available at http://murphylab.web.cmu.edu

Microscope Datasets for Subcellular Location
- 2D Chinese hamster ovary cells
  - Widefield microscopy with numerical deconvolution (100x)
  - 5 different probes (classes)
  - 1 color
  - Pixel size = 0.23 µm x 0.23 µm
  - ~80 cell images per class

Example Images: 2D CHO
- Single color staining for specific protein
- Three 2D slices acquired and numerically deconvolved to yield one in-focus 2D slice

Microscope Datasets for Subcellular Location
- 2D HeLa
  - Widefield microscopy with numerical deconvolution (100x)
  - 9 different antibodies plus a DNA stain
  - 2 colors per image
  - Pixel size = 0.23 µm x 0.23 µm
  - ~80 cell images per class

Example Images: 2D HeLa
[Figure panels: ER, giantin, gpp130, LAMP, Mito, Nucleolin, Actin, TfR, Tubulin, DNA]
- Red = DNA, Green = specific protein
- Three 2D slices acquired and numerically deconvolved to yield one in-focus 2D slice
- Red and Green semi-automatically registered

Microscope Datasets for Subcellular Location
- 3D HeLa
  - Confocal microscope (100x)
  - 9 different antibodies plus DNA stain and total protein stain
  - 3 colors per image
  - Voxel size = 0.049 µm x 0.049 µm x 0.2 µm
  - ~50 cell images per class

Example Image: 3D HeLa
[Figure panels: Projections; Single Slice]
- Red = DNA, Blue = Total Protein, Green = specific protein
- Acquired as stack of 2D slices by changing focal position

Example Images: 3D HeLa
[Figure panels: Nuclear, ER, Giantin, gpp130, Lysosomal, Mitoch., Nucleolar, Actin, Endosomal, Tubulin]

3D HeLa
- 2D slices (from bottom to top) for cell labeled for transferrin receptor (primarily in endosomes)

3D HeLa
- 2D slices (from bottom to top) for cell labeled for giantin (primarily in Golgi)

3D HeLa
- 2D slices (from bottom to top) for cell labeled for tubulin (major constituent of microtubules)
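
As a rough illustration of what the 3D HeLa voxel size listed above implies, the following Python sketch computes the axial-to-lateral sampling ratio, the volume of one voxel, and the physical extent of a stack. The function names and the stack shape (20 x 512 x 512) are hypothetical; the slides give only the voxel size, not the image dimensions.

```python
# Voxel size of the 3D HeLa dataset as stated on the slide, in micrometres (x, y, z).
HELA_3D_VOXEL_UM = (0.049, 0.049, 0.2)


def anisotropy(voxel_um):
    """Ratio of axial (z) to lateral (x) sampling distance."""
    x, _, z = voxel_um
    return z / x


def voxel_volume_um3(voxel_um):
    """Volume of a single voxel in cubic micrometres."""
    x, y, z = voxel_um
    return x * y * z


def physical_extent_um(shape_zyx, voxel_um):
    """Physical size (z, y, x) in micrometres of a stack with the given shape."""
    nz, ny, nx = shape_zyx
    x, y, z = voxel_um
    return (nz * z, ny * y, nx * x)


if __name__ == "__main__":
    # HYPOTHETICAL stack shape (slices, rows, columns); not taken from the slides.
    shape = (20, 512, 512)
    print("anisotropy (z/x):", anisotropy(HELA_3D_VOXEL_UM))
    print("voxel volume (um^3):", voxel_volume_um3(HELA_3D_VOXEL_UM))
    print("extent (z, y, x) in um:", physical_extent_um(shape, HELA_3D_VOXEL_UM))
```

Each voxel is roughly four times taller than it is wide, so distances measured in voxel units along z are not directly comparable to those along x and y unless they are rescaled to micrometres first.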
Microscope Datasets for Subcellular Location
- 3D 3T3
  - Spinning disk confocal microscope (60x)
  - GFP for a specific protein
  - Images collected for 90 different clones
  - 1 color
  - Voxel size = 0.11 µm x 0.11 µm x 0.5 µm
  - ~30 cell images per class
  - Also have some 2D time series images

Example Images: 3D 3T3
- Thirty slices acquired by spinning disk confocal microscope

3D 3T3
- 2D slices over time for cell labeled for Glut1 (membrane protein on surface and in vesicles)

...
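
The argument made under "The Challenge" (pixel-by-pixel comparison fails because cells differ in position and orientation, so features should be used instead) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the three features below (fraction of non-zero pixels and the two eigenvalues of the intensity-weighted covariance matrix) and all function names are hypothetical choices made for this example, not the feature sets used by the Murphy group. On a pixel grid the invariance is exact only for translations and rotations by multiples of 90 degrees.

```python
import math


def invariant_features(img):
    """Summarise a 2D image (list of rows) by three numbers that are unchanged
    when the pattern is translated or rotated by a multiple of 90 degrees."""
    total = sum(sum(row) for row in img)
    # Intensity-weighted centroid.
    cy = sum(y * v for y, row in enumerate(img) for v in row) / total
    cx = sum(x * v for row in img for x, v in enumerate(row)) / total
    # Intensity-weighted second central moments.
    syy = sxx = sxy = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            syy += v * (y - cy) ** 2
            sxx += v * (x - cx) ** 2
            sxy += v * (y - cy) * (x - cx)
    syy, sxx, sxy = syy / total, sxx / total, sxy / total
    # Closed-form eigenvalues of the 2x2 matrix [[syy, sxy], [sxy, sxx]].
    mean = (syy + sxx) / 2
    half_diff = math.hypot((syy - sxx) / 2, sxy)
    n_pixels = sum(len(row) for row in img)
    frac = sum(v > 0 for row in img for v in row) / n_pixels
    return (frac, mean + half_diff, mean - half_diff)


def rotate90(img):
    """Rotate a square image by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*img)][::-1]


def shift(img, dy, dx):
    """Translate an image by (dy, dx) pixels, filling vacated pixels with 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if 0 <= y + dy < h and 0 <= x + dx < w:
                out[y + dy][x + dx] = img[y][x]
    return out


def pixel_overlap(a, b):
    """Sum of pixel-wise products: a crude direct (pixel-by-pixel) comparison."""
    return sum(va * vb for ra, rb in zip(a, b) for va, vb in zip(ra, rb))


def bar_image(size=32):
    """Synthetic 'cell': a 4 x 16 pixel bright bar on a dark background."""
    img = [[0.0] * size for _ in range(size)]
    for y in range(5, 9):
        for x in range(4, 20):
            img[y][x] = 1.0
    return img


if __name__ == "__main__":
    original = bar_image()
    moved = shift(rotate90(original), 3, 2)  # same pattern, new pose
    print("pixel overlap:", pixel_overlap(original, moved))
    print("features (original):", invariant_features(original))
    print("features (moved):   ", invariant_features(moved))
```

In this toy case the rotated and shifted bar shares no bright pixels with the original, so a direct comparison sees two unrelated images, while the three summary features are the same for both.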

This note was uploaded on 01/13/2012 for the course BIO 101 taught by Professor Staff during the Fall '10 term at DePaul.
