clustering-expression

Clustering Gene Expression Data
BMI/CS 576
www.biostat.wisc.edu/bmi576/
Mark Craven
craven@biostat.wisc.edu
Fall 2011

Gene expression profiles
we’ll assume we have a 2D matrix of gene expression measurements
– rows represent genes
– columns represent different experiments, time points, individuals, etc.
we’ll refer to individual rows or columns as profiles
– a row is a profile for a gene
– a column is a profile for an experiment, time point, etc.
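The row/column view of profiles can be sketched in a few lines of Python. This is only an illustration: the gene names, experiment labels, measurement values, and the helper names `row_profile` and `column_profile` are hypothetical and do not come from the course material.

```python
# Hypothetical toy data: 3 genes (rows) measured in 4 experiments (columns).
genes = ["geneA", "geneB", "geneC"]
experiments = ["t0", "t1", "t2", "t3"]
matrix = [
    [0.1, 0.5, 1.2, 2.0],
    [2.1, 1.4, 0.3, 0.2],
    [0.0, 0.6, 1.1, 1.9],
]

def row_profile(matrix, i):
    """Profile for gene i: its measurements across all experiments."""
    return list(matrix[i])

def column_profile(matrix, j):
    """Profile for experiment j: its measurements across all genes."""
    return [row[j] for row in matrix]

gene_profile = row_profile(matrix, 0)           # profile of geneA
experiment_profile = column_profile(matrix, 1)  # profile of experiment t1
```

Clustering rows groups genes with similar behavior across experiments; clustering columns groups experiments (or individuals, or time points) with similar behavior across genes.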

Expression profile example
– rows represent genes, columns represent people with leukemia
– rows represent yeast genes, columns represent time points in a given experiment
Task definition: clustering gene expression profiles
given: expression profiles for a set of genes or experiments/individuals/time points (whatever columns represent)
do: organize profiles into clusters such that
– profiles in the same cluster are highly similar to each other
– profiles from different clusters have low similarity to each other

Clustering example
figure from: Hackl et al. Genome Biology 6(13), 2005
pre-adipocyte (fat) cell development over 14-day time course
clustering of 780 genes that are > 2-fold upregulated or downregulated at ≥ 4 time points
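One rough way to check the two conditions in the task definition is to compare the mean distance between profiles in the same cluster with the mean distance between profiles in different clusters. The sketch below assumes Euclidean distance as the measure; the function name `mean_within_and_between` and the toy profiles are hypothetical, and this is not presented as the evaluation used in the course.

```python
import math
from itertools import combinations

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mean_within_and_between(profiles, labels):
    """Return (mean same-cluster distance, mean different-cluster distance)."""
    within, between = [], []
    for i, j in combinations(range(len(profiles)), 2):
        d = euclidean(profiles[i], profiles[j])
        if labels[i] == labels[j]:
            within.append(d)
        else:
            between.append(d)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(within), mean(between)

# Hypothetical toy profiles: two tight groups that are far apart.
profiles = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
good = mean_within_and_between(profiles, [0, 0, 1, 1])  # groups kept together
bad = mean_within_and_between(profiles, [0, 1, 0, 1])   # groups split apart
```

For the labeling that keeps each group together, the within-cluster mean is smaller than the between-cluster mean; for the labeling that splits the groups, the relationship reverses.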

Motivation for clustering
exploratory data analysis
– understanding general characteristics of data
– visualizing data
generalization
– infer something about an object (e.g. a gene) based on how it relates to other objects in the cluster
everyone else is doing it

The clustering landscape
there are many different clustering algorithms
they differ along several dimensions
– hierarchical vs. flat
– hard (no uncertainty about which profiles belong to a cluster) vs. soft clusters
– non-partitional (a profile can belong to multiple clusters) vs. partitional
– deterministic (same clusters produced every time for a given data set) vs. stochastic
– distance (similarity) measure used
Distance/similarity measures
many clustering methods employ a distance (similarity) measure to assess the distance between
– a pair of profiles
– a cluster and a profile
– a pair of clusters
given a distance value, it is straightforward to convert it into a similarity value
not necessarily straightforward to go the other way
we’ll describe our algorithms in terms of distances
$\mathrm{dist}(x, y) = \exp(-a \times \mathrm{sim}(x, y))$
$\mathrm{sim}(x, y) = \frac{1}{1 + \mathrm{dist}(x, y)}$

Distance metrics
properties of metrics
– $\mathrm{dist}(x_i, x_j) \ge 0$
– $\mathrm{dist}(x_i, x_i) = 0$
– $\mathrm{dist}(x_i, x_j) = \mathrm{dist}(x_j, x_i)$
– $\mathrm{dist}(x_i, x_j) \le \mathrm{dist}(x_i, x_k) + \mathrm{dist}(x_k, x_j)$
some distance metrics
– Manhattan: $\mathrm{dist}(x_i, x_j) = \sum_e \left| x_{i,e} - x_{j,e} \right|$
– Euclidean: $\mathrm{dist}(x_i, x_j) = \sqrt{\sum_e \left( x_{i,e} - x_{j,e} \right)^2}$
e ranges over the individual measurements for $x_i$ and $x_j$
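The two metrics and the two conversions can be written directly in Python. This is a minimal sketch: the function names, the default value `a=1.0`, and the example vectors are assumptions made for illustration.

```python
import math

def manhattan(x, y):
    """Sum over measurements e of |x_e - y_e|."""
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """Square root of the sum over measurements e of (x_e - y_e)^2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def dist_to_sim(d):
    """sim = 1 / (1 + dist): distance 0 maps to similarity 1."""
    return 1.0 / (1.0 + d)

def sim_to_dist(s, a=1.0):
    """dist = exp(-a * sim); the scale a = 1.0 is an assumed default."""
    return math.exp(-a * s)

# Hypothetical profiles with three measurements each.
x_i = [1.0, 2.0, 3.0]
x_j = [4.0, 6.0, 3.0]
x_k = [0.0, 0.0, 0.0]

d_manhattan = manhattan(x_i, x_j)   # |1-4| + |2-6| + |3-3|
d_euclidean = euclidean(x_i, x_j)   # sqrt(9 + 16 + 0)
similarity = dist_to_sim(d_euclidean)
```

Note that the two conversions are not inverses of each other: applying `sim_to_dist` to `dist_to_sim(d)` does not in general return `d`.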

K-means clustering
assume our profiles are represented by vectors of real values
put k cluster centers in same space as profiles
each cluster is represented by a vector
consider an example in which our vectors have 2 dimensions
[figure: 2D plot with a legend distinguishing profiles from cluster centers]
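The setup above (k center vectors living in the same space as the profiles) is the starting point for the standard k-means iteration, often called Lloyd's algorithm: assign each profile to its nearest center, move each center to the mean of its assigned profiles, and repeat until assignments stop changing. The sketch below assumes that standard procedure with Euclidean distance and caller-supplied initial centers; it may differ in detail from the version presented in the course, and all names and data are hypothetical.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def assign(profiles, centers):
    """For each profile, the index of its nearest cluster center."""
    return [
        min(range(len(centers)), key=lambda c: euclidean(p, centers[c]))
        for p in profiles
    ]

def update(profiles, labels, centers):
    """Move each center to the mean of its profiles; empty clusters stay put."""
    new_centers = []
    for c, old in enumerate(centers):
        members = [p for p, label in zip(profiles, labels) if label == c]
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(old))
    return new_centers

def k_means(profiles, centers, max_iter=100):
    """Alternate assignment and update steps until assignments are stable."""
    centers = [list(c) for c in centers]
    labels = assign(profiles, centers)
    for _ in range(max_iter):
        centers = update(profiles, labels, centers)
        new_labels = assign(profiles, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers

# Hypothetical 2-dimensional profiles and k = 2 assumed initial centers.
profiles = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
labels, centers = k_means(profiles, centers=[[0.0, 0.0], [10.0, 10.0]])
```

Passing the initial centers explicitly keeps this sketch deterministic; in practice the centers are often initialized randomly, which makes the method stochastic in the sense used in the clustering landscape slide.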

This note was uploaded on 12/15/2011 for the course BMI 576 taught by Professor Staff during the Fall '11 term at Wisc Green Bay.
