LecturesPart24 - Computational Biology Part 24 Clustering...

Computational Biology, Part 24
Clustering and Unmixing of Subcellular Patterns
Robert F. Murphy
Copyright © 1996, 1999, 2000-2009. All rights reserved.

Unsupervised Learning to Identify High-Resolution Protein Patterns

Location Proteomics
- Tag many proteins
  - cDNA tagging
    - Put individual cDNAs into GFP tagging vector (puts GFP coding at end)
    - Transfect individual clones with each tagged cDNA
  - CD-tagging (developed by Jonathan Jarvik and Peter Berget):
    - Infect a population of cells with a retrovirus carrying a DNA sequence that will “tag” a random gene in each cell
    - Isolate separate clones, each of which expresses one tagged protein
    - Use RT-PCR to identify the tagged gene in each clone
- Collect many live cell images for each clone using spinning disk confocal fluorescence microscopy or automated high-throughput microscopy

[Figure: Images of CD-tagged 3T3 cells. Chen et al 2003; Chen and Murphy 2005]

- SLF features can be used to measure the similarity of protein patterns
- This allows us, for the first time, to create a systematic, objective framework for describing subcellular locations: a Subcellular Location Tree
- Start by grouping the two proteins whose patterns are most similar, then keep adding branches for less and less similar patterns

[Figure: Subcellular Location Tree, http://murphylab.web.cmu.edu/services/PSLID/tree.html. Labels: protein name; human description; from databases]

[Figure: groups in the tree: Nucleolar Proteins; Punctate Nuclear Proteins; Predominantly Nuclear Proteins with Some Punctate Cytoplasmic Staining; Nuclear and Cytoplasmic Proteins with Some Punctate Staining; Uniform]

[Figure: Subcellular Location Tree, http://murphylab.web.cmu.edu/services/PSLID/tree.html. Labels: protein name. Top: automated grouping and assignment. Bottom: visual assignment to “known” locations]

Decomposing (unmixing) complex patterns

Decomposing mixture patterns
- Clustering or classifying whole cell patterns treats each combination of two or more “basic” patterns as a unique new pattern
- It is desirable to have a way to decompose mixtures instead
- One approach is to assume that each basic pattern has a recognizable combination of different types of objects

Object type determination
- Rather than specifying object types, we can choose to learn them from the data
- Use a subset of SLFs to describe objects
- Perform k-means clustering for k from 2 to 40
- Evaluate goodness of clustering using the Akaike Information Criterion
- Choose the k that gives the lowest AIC

Cluster Number Selection
- Akaike Information Criterion (AIC) = 2k - 2 ln(L)
- k = number of clusters
- L = likelihood of model given data

Example of Object Types
[Figure: example objects of Type A, Type B, Type C, and Type D]

Unmixing: Learning strategy
- Once object types are known, each cell in the training (pure) set can be represented as a vector of the amount of fluorescence for each object type
- Learn a probability model for these vectors for each class
- Mixed images can then be represented using mixture fractions times the probability distribution of objects for each class

[Figure: bar charts of amount of fluorescence (Amt fluor.) versus object type (1 to 8) for the Golgi, lysosomal, and nuclear classes. Panels: Pure Lysosomal Pattern; Pure Golgi Pattern; 50% mix of each]

Two-stage Strategy for unmixing unknown image
- Find objects in the unknown (test) image and classify each object into one of the object types, using the learned object type classifier built with all objects from the training images
- For each test image, make a list of how often each object type is found
- Find the fractions of each class that give the “best” match to this list

Test samples
- How do we test a subcellular pattern unmixing algorithm?
- Need images of known mixtures of pure patterns, which are difficult to obtain “naturally”
- Created a test set by mixing different proportions of two probes that localize to different cell parts (lysosomes and mitochondria)

Tao Peng, Ghislain Bonamy, Estelle Glory, Sumit Chanda, Dan Rines (Genome Research Institute of Novartis Foundation)

[Figure: example images of cells labeled with Lysotracker, with Mitotracker, and with a mixture of Lysotracker and Mitotracker]

Pattern unmixing results ...
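The rule on the "Cluster Number Selection" slide (compute AIC for each k and keep the lowest) can be sketched in a few lines of Python. This is only an illustration: the slides do not say how ln(L) is obtained from a k-means result, so the log-likelihood values below are made-up numbers, and the names `aic` and `choose_k` are hypothetical. Following the slide, k is the number of clusters; textbook AIC counts free model parameters instead.

```python
def aic(k, log_likelihood):
    """AIC = 2k - 2 ln(L), with k taken as the number of clusters (as on the slide)."""
    return 2 * k - 2 * log_likelihood


def choose_k(log_likelihood_by_k):
    """Return the k whose AIC is lowest, given a mapping k -> ln(L)."""
    return min(log_likelihood_by_k,
               key=lambda k: aic(k, log_likelihood_by_k[k]))


# Hypothetical ln(L) values for k = 2..5 (illustrative numbers, not real data).
example = {2: -120.0, 3: -100.0, 4: -99.5, 5: -99.4}
scores = {k: aic(k, ll) for k, ll in example.items()}
best_k = choose_k(example)
```

In this toy input the likelihood barely improves beyond k = 3, so the 2k penalty makes k = 3 the minimum. In practice the same loop would run over k from 2 to 40, with ln(L) computed from whatever probability model is attached to the clustering.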
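The two-stage unmixing strategy described above can be sketched as follows. Several pieces are assumptions rather than the method in the slides: objects are assigned to types by nearest centroid, each pure class is summarized by a single mean profile instead of a full probability model, and the "best" match is taken to be the least squared error, found by a plain grid search over fractions. All function names and numbers are hypothetical.

```python
from collections import Counter
from itertools import product


def nearest_type(obj, centroids):
    """Stage 1a: index of the object type whose centroid is closest to obj."""
    def sq_dist(j):
        return sum((a - b) ** 2 for a, b in zip(obj, centroids[j]))
    return min(range(len(centroids)), key=sq_dist)


def type_frequencies(objects, centroids):
    """Stage 1b: fraction of an image's objects assigned to each object type."""
    counts = Counter(nearest_type(o, centroids) for o in objects)
    return [counts[j] / len(objects) for j in range(len(centroids))]


def mix(profiles, fractions):
    """Object-type vector expected for a mixture of pure-class profiles."""
    n_types = len(profiles[0])
    return [sum(f * p[t] for f, p in zip(fractions, profiles))
            for t in range(n_types)]


def unmix(observed, profiles, steps=100):
    """Stage 2: fractions (non-negative, summing to 1) with least squared error."""
    best, best_err = None, float("inf")
    # Enumerate the first n-1 fractions on a grid; the last takes the remainder.
    for counts in product(range(steps + 1), repeat=len(profiles) - 1):
        rest = steps - sum(counts)
        if rest < 0:
            continue
        fractions = [c / steps for c in counts] + [rest / steps]
        predicted = mix(profiles, fractions)
        err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
        if err < best_err:
            best, best_err = fractions, err
    return best


# Made-up numbers for illustration only.
centroids = [(0.0, 0.0), (10.0, 10.0)]           # two hypothetical object types
objects = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0), (11.0, 9.0)]
freqs = type_frequencies(objects, centroids)

lysosomal = [0.5, 0.3, 0.1, 0.1]                 # hypothetical pure-class profiles
golgi = [0.1, 0.1, 0.3, 0.5]
observed = mix([lysosomal, golgi], [0.3, 0.7])   # synthetic mixed image
estimate = unmix(observed, [lysosomal, golgi])
```

The two toy examples are independent of each other: the first shows stage one on four objects, the second shows stage two recovering a known 30/70 mixture. The grid search is chosen for readability; with more than a few classes a constrained least-squares or likelihood-based fit would be a more practical choice.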