MIT6_047f08_lec05_slide05

MIT OpenCourseWare, http://ocw.mit.edu
6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution, Fall 2008
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Classification
Lecture 5, September 18, 2008
Computational Biology: Genomes, Networks, Evolution
Two Different Approaches
• Generative
  – Bayesian Classification and Naïve Bayes
  – Example: Mitochondrial Protein Prediction
• Discriminative
  – Support Vector Machines
  – Example: Tumor Classification
Bayesian Classification
We will pose the classification problem in probabilistic terms.
We will create models for how features are distributed for objects of different classes.
We will use probability calculus to make classification decisions.
Classifying Mitochondrial Proteins
Derive 7 features for all human proteins; predict nuclear-encoded mitochondrial genes (Maestro).
Features: targeting signal, protein domains, mass spec, co-expression, homology, induction, motifs.
First page of article removed due to copyright restrictions: Calvo, S., et al. "Systematic Identification of Human Mitochondrial Disease Genes Through Integrative Genomics." Nature Genetics 38 (2006): 576-582.
Let's Look at Just One Feature
• Each object can be associated with multiple features
• We will look at the case of just one feature for now
We are going to define two key concepts.
[Figure: proteins plotted by two features, co-expression and conservation]
The First Key Concept
Features for each class are drawn from class-conditional probability distributions (CCPDs): P(X|Class1) and P(X|Class2).
Our first goal will be to model these distributions.
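As a concrete sketch of what modeling a class-conditional distribution can look like for a single continuous feature, the snippet below fits one Gaussian per class to made-up co-expression scores. The feature values and the choice of a Gaussian are illustrative assumptions, not the models used in the Maestro study.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical training values of one feature (e.g., a co-expression score)
mito_scores = np.array([0.82, 0.71, 0.90, 0.64, 0.77])    # Class1 (mitochondrial)
other_scores = np.array([0.21, 0.33, 0.12, 0.40, 0.27])   # Class2 (non-mitochondrial)

# Fit one Gaussian per class; these play the role of the CCPDs
mu1, sd1 = mito_scores.mean(), mito_scores.std(ddof=1)
mu2, sd2 = other_scores.mean(), other_scores.std(ddof=1)

def p_x_given_class1(x):
    """P(X|Class1): density of the feature under the mitochondrial model."""
    return norm.pdf(x, mu1, sd1)

def p_x_given_class2(x):
    """P(X|Class2): density of the feature under the non-mitochondrial model."""
    return norm.pdf(x, mu2, sd2)
```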
The Second Key Concept
We model prior probabilities to quantify the expected a priori chance of seeing a class.
P(mito) = how likely the next protein is to be a mitochondrial protein, before I see any features to help me decide.
We expect ~1500 mitochondrial genes out of ~21000 total, so
P(mito) = 1500/21000
P(~mito) = 19500/21000
In general we write these priors as P(Class1) and P(Class2).
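Numerically, these priors work out to P(mito) = 1500/21000 ≈ 0.07 and P(~mito) = 19500/21000 ≈ 0.93.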
But How Do We Classify?
So we have priors defining the a priori probability of a class: P(Class1), P(Class2).
We also have models for the probability of a feature given each class: P(X|Class1), P(X|Class2).
But we want the probability of the class given a feature.
How do we get P(Class1|X)?
Bayes Rule
P(Class | Feature) = P(Feature | Class) × P(Class) / P(Feature)
Belief after evidence (posterior) = evaluate evidence (likelihood) × belief before evidence (prior) / evidence.
Bayes, Thomas (1763). An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London 53: 370-418.
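A minimal numeric sketch of the rule, reusing the ~1500/~21000 priors from the earlier slide; the two likelihood values for the observed feature are assumed for illustration only.

```python
# Priors from the slide: ~1500 mitochondrial genes out of ~21000 total
p_mito = 1500 / 21000
p_other = 19500 / 21000

# Assumed class-conditional likelihoods for one observed feature value X = x
p_x_given_mito = 0.30
p_x_given_other = 0.05

# Evidence: total probability of observing this feature value
p_x = p_x_given_mito * p_mito + p_x_given_other * p_other

# Posterior: belief after evidence
p_mito_given_x = p_x_given_mito * p_mito / p_x
print(f"P(mito | X) = {p_mito_given_x:.3f}")   # ~0.32 with these assumed numbers
```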
Bayes Decision Rule
If we observe an object with feature X, how do we decide if the object is from Class 1?
The Bayes Decision Rule is simply: choose Class1 if
P(Class1 | X) > P(Class2 | X)
P(X | Class1) P(Class1) / P(X) > P(X | Class2) P(Class2) / P(X)
P(X | Class1) P(Class1) > P(X | Class2) P(Class2)
The denominator P(X) is the same number on both sides, so the comparison does not need it.
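Continuing the assumed numbers from the sketch above: P(X|mito) P(mito) ≈ 0.30 × 0.07 ≈ 0.021 and P(X|~mito) P(~mito) ≈ 0.05 × 0.93 ≈ 0.046, so the rule assigns this protein to the non-mitochondrial class; dividing both sides by the same P(X) would not change which side is larger.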
Discriminant Function
We can create a convenient representation of the Bayes Decision Rule:
P(X | Class1) P(Class1) > P(X | Class2) P(Class2)
P(X | Class1) P(Class1) / [P(X | Class2) P(Class2)] > 1
G(X) = log [ P(X | Class1) P(Class1) / (P(X | Class2) P(Class2)) ] > 0
If G(X) > 0, we classify the object as Class 1.
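Under the same assumed likelihoods and priors as in the sketches above, a short illustration of the discriminant (not the lecture's data):

```python
import math

def discriminant(p_x_c1, p_c1, p_x_c2, p_c2):
    """G(X) = log[ P(X|Class1) P(Class1) / (P(X|Class2) P(Class2)) ]."""
    return math.log((p_x_c1 * p_c1) / (p_x_c2 * p_c2))

# Same illustrative numbers as in the Bayes rule sketch
g = discriminant(0.30, 1500 / 21000, 0.05, 19500 / 21000)
print(f"G(X) = {g:.3f}")  # about -0.77 here, so classify as Class 2 (non-mitochondrial)
```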
Stepping Back
What do we have so far?