Gene Set Testing (Continued)
Peng Liu
4/22/2008

Procedures of SAFE

1. Calculate a "local" statistic for each gene to measure its significance. This statistic can be, for example, an ordinary t-statistic for comparing two treatments.
2. Calculate a "global" statistic for the gene set to be tested. The global statistic measures the difference between the local statistics in the gene set of interest and the local statistics in the complement of the gene set.
3. Use permutation to assess the significance of the global statistic for the gene set. Permute the treatment labels and calculate the global statistic for the permuted data. Compare the observed global statistic with the global statistics from the permuted data to calculate the p-value.
4. Estimate the multiple testing error (FWER or FDR) using the p-values for the observed data and the permuted data.

Global statistics in SAFE

The global statistic assesses how the distribution of local statistics within a category differs from the distribution of local statistics outside the category.
The SAFE procedure uses rank-based choices for the global statistic, such as the Wilcoxon rank sum.

Wilcoxon rank sum

To calculate the Wilcoxon rank sum statistic, we first rank all genes according to their local statistics.
The standardized Wilcoxon rank sum is then calculated as:

$$Z_W = \frac{W - g(N+1)/2}{\sqrt{g(N-g)(N+1)/12}}$$

where $W$ is the sum of ranks for genes in the gene set of interest, $N$ is the total number of genes, and $g$ is the number of genes in the gene set of interest.

GSEA

GSEA is similar to SAFE in that it follows the same procedure:

1. Calculate a statistic for each individual gene and rank the genes accordingly.
2. Calculate an enrichment score for the gene set.
3. Use permutation to estimate the significance.
4. Estimate the multiple testing error.
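
Steps 1 to 3 of the SAFE procedure, with the standardized Wilcoxon rank sum as the global statistic, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SAFE software itself: the function names (`local_t_statistics`, `wilcoxon_z`, `permutation_p_value`), the two-sided comparison, the add-one p-value correction, and the arbitrary handling of tied ranks are all choices made for this example and are not specified in the slides.

```python
import numpy as np

def local_t_statistics(expr, labels):
    """Ordinary (pooled-variance) two-sample t-statistic for each gene (row).

    expr   : array of shape (genes, samples)
    labels : array of 0/1 treatment labels, one per sample
    """
    a = expr[:, labels == 0]
    b = expr[:, labels == 1]
    na, nb = a.shape[1], b.shape[1]
    pooled_var = ((na - 1) * a.var(axis=1, ddof=1)
                  + (nb - 1) * b.var(axis=1, ddof=1)) / (na + nb - 2)
    return (b.mean(axis=1) - a.mean(axis=1)) / np.sqrt(pooled_var * (1 / na + 1 / nb))

def wilcoxon_z(local_stats, in_set):
    """Standardized Wilcoxon rank sum Z_W for the genes flagged by in_set."""
    N = local_stats.size            # total number of genes
    g = int(in_set.sum())           # number of genes in the gene set
    ranks = np.empty(N)
    # Rank 1 = smallest local statistic. Assumption: ties are broken
    # arbitrarily by argsort rather than given mid-ranks.
    ranks[np.argsort(local_stats)] = np.arange(1, N + 1)
    W = ranks[in_set].sum()         # sum of ranks inside the gene set
    mean_W = g * (N + 1) / 2
    var_W = g * (N - g) * (N + 1) / 12
    return (W - mean_W) / np.sqrt(var_W)

def permutation_p_value(expr, labels, in_set, n_perm=1000, seed=0):
    """Permute treatment labels and compare observed Z_W with permuted Z_W."""
    rng = np.random.default_rng(seed)
    observed = wilcoxon_z(local_t_statistics(expr, labels), in_set)
    permuted = np.array([
        wilcoxon_z(local_t_statistics(expr, rng.permutation(labels)), in_set)
        for _ in range(n_perm)
    ])
    # Assumption: two-sided test with an add-one correction so p is never 0.
    p = (1 + np.sum(np.abs(permuted) >= np.abs(observed))) / (1 + n_perm)
    return observed, p
```

Calling `permutation_p_value(expr, labels, in_set)` on a genes-by-samples matrix returns the observed $Z_W$ and its permutation p-value for a single gene set. Step 4 (FWER or FDR estimation across many gene sets) is not covered by this sketch.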