23 Gene Set Testing 4_22_08

Gene Set Testing (Continued)
Peng Liu
4/22/2008

Procedures of SAFE

1. Calculate a "local" statistic for each gene to measure its significance.
   - Such a statistic can be an ordinary t-statistic for comparing two treatments.
2. Calculate a "global" statistic for the gene set to be tested.
   - The global statistic measures the difference between the local statistics in the gene set of interest and the local statistics in the complement of the gene set.
3. Use permutation to assess the significance of the global statistic for the gene set.
   - Permute the treatment labels and calculate the global statistic for the permuted data. Compare the observed global statistic with the global statistics from the permuted data to calculate the p-value.
4. Estimate the multiple testing error (FWER or FDR) using the p-values for the observed data and the permuted data.

Global statistics in SAFE

- The global statistic assesses how the distribution of local statistics within a category differs from the distribution of local statistics outside the category.
- The SAFE procedure uses rank-invariant global statistics, such as the Wilcoxon rank sum.

Wilcoxon rank sum

- To calculate the Wilcoxon rank sum statistic, first rank all genes according to their local statistics.
- The standardized Wilcoxon rank sum is then calculated as

  $$Z_W = \frac{W - g(N+1)/2}{\sqrt{g(N-g)(N+1)/12}}$$

  where W is the sum of ranks for genes in the gene set of interest, N is the total number of genes, and g is the number of genes in the gene set of interest.

GSEA

- GSEA is similar to SAFE in that it follows the same procedure:
  1. calculate a statistic for each gene and rank the genes accordingly;
  2. calculate an enrichment score for the gene set;
  3. use permutation to estimate the significance;
  4. estimate the multiple testing error.
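The first three SAFE steps (local statistics, a Wilcoxon rank-sum global statistic, and label permutation) can be sketched in a few lines of Python. This is an illustrative sketch, not the SAFE software itself. The function names, the pooled-variance t-statistic, the one-sided comparison, and the (hits + 1) / (n_perm + 1) p-value convention are all assumptions made for the example. It also assumes the local statistics have no ties.

```python
import math
import random
from statistics import mean, variance


def t_statistic(x, y):
    """Pooled-variance two-sample t-statistic, used here as the local statistic."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))


def local_statistics(expr, labels):
    """One t-statistic per gene; expr[i] holds gene i's values across samples."""
    stats = []
    for row in expr:
        a = [v for v, lab in zip(row, labels) if lab == 1]
        b = [v for v, lab in zip(row, labels) if lab == 0]
        stats.append(t_statistic(a, b))
    return stats


def wilcoxon_z(local, gene_set):
    """Z_W = (W - g(N+1)/2) / sqrt(g(N-g)(N+1)/12); assumes no tied statistics."""
    n, g = len(local), len(gene_set)
    order = sorted(range(n), key=lambda i: local[i])
    rank = {gene: r + 1 for r, gene in enumerate(order)}  # ranks 1..N
    w = sum(rank[i] for i in gene_set)
    return (w - g * (n + 1) / 2) / math.sqrt(g * (n - g) * (n + 1) / 12)


def safe_p_value(expr, labels, gene_set, n_perm=1000, seed=0):
    """Permutation p-value for the global statistic (one-sided; an assumption)."""
    rng = random.Random(seed)
    observed = wilcoxon_z(local_statistics(expr, labels), gene_set)
    hits = 0
    for _ in range(n_perm):
        perm = labels[:]
        rng.shuffle(perm)  # permute the treatment labels, not the genes
        if wilcoxon_z(local_statistics(expr, perm), gene_set) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `safe_p_value(expr, labels, gene_set)` with `expr` as a list of per-gene rows, `labels` as a list of 0/1 treatment codes, and `gene_set` as a list of row indices returns the observed Z_W and its permutation p-value. Step 4 (FWER or FDR estimation across many gene sets) is not covered by this sketch.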
Main Difference: global statistic

- GSEA defines an Enrichment Score, which is conceptually equivalent to the global statistic in SAFE.
- Suppose we calculate the local statistics r_i for all genes, i = 1, …, N.
- Let N_H be the number of genes in gene set S.
- Let p be a user-specified value (p = 1 is recommended in their paper).

GSEA: enrichment score details

Enrichment score for gene set S

Walking down the ranked list, a running-sum statistic is increased when we encounter a gene in set S and decreased when we encounter a gene not in S.

[Figure: running-sum curves for two gene sets S_1 and S_2, with the maximum M(S) and minimum m(S) marked. ES(S_1) = m(S_1) and ES(S_2) = M(S_2).]

Comments

- The enrichment score ES reflects the degree to which a gene set is overrepresented at the extremes of the ranked list.
- By permuting the treatment labels instead of …
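The running-sum idea can be sketched as follows. The exact increments are not spelled out in the text above, so this example assumes the weighting commonly given for GSEA: a gene in S adds |r_i|^p divided by the sum of |r_j|^p over S, and a gene outside S subtracts 1 / (N - N_H). The function name and return layout are hypothetical. ES is taken as whichever of the maximum M(S) and the minimum m(S) lies further from zero, matching the figure.

```python
def enrichment_score(r, gene_set, p=1.0):
    """Return (ES, M, m) for gene set S, given local statistics r (one per gene).

    Assumed increments (not taken from the slides): hits add
    |r_i|**p / sum_{j in S} |r_j|**p, misses subtract 1 / (N - N_H).
    """
    in_set = set(gene_set)
    # Rank genes from the largest local statistic to the smallest.
    order = sorted(range(len(r)), key=lambda i: r[i], reverse=True)
    hit_norm = sum(abs(r[i]) ** p for i in in_set)
    miss_step = 1.0 / (len(r) - len(in_set))  # N - N_H genes lie outside S

    running, biggest, smallest = 0.0, 0.0, 0.0
    for i in order:
        if i in in_set:
            running += abs(r[i]) ** p / hit_norm  # gene in S: step up
        else:
            running -= miss_step                  # gene not in S: step down
        biggest = max(biggest, running)
        smallest = min(smallest, running)

    # ES is the extreme (M or m) with the larger absolute value.
    es = biggest if biggest >= -smallest else smallest
    return es, biggest, smallest
```

For example, with `r = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]`, the set `[0, 1]` sits at the top of the ranked list and gives a positive ES equal to M(S), while the set `[4, 5]` sits at the bottom and gives a negative ES equal to m(S). The sketch does not guard against a gene set whose local statistics are all zero.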

This note was uploaded on 08/27/2009 for the course STAT 447 taught by Professor Staff during the Spring '08 term at Iowa State.
