# Lecture14 - Hypergeometric Distribution Example Lecture 14...

## Lecture 14: Statistics II

- Multiple hypothesis testing
- Wilcoxon rank sum test
- Permutation tests
- Introduction to DNA microarray technology

## Hypergeometric Distribution: Example

- You used DNA microarray technology to identify genes whose expression is increased in two different cancer cell lines (prostate cancer, leukemia).
- The microarray you use contains probes for 6064 human genes.
- You find that 105 genes are up-regulated in the prostate cancer cell line, and 180 genes are up-regulated in the leukemia cell line.
- The two data sets share 64 genes.
- Is the observed overlap (64 genes) significantly greater than the overlap expected due to random chance?
- Null hypothesis: observed overlap ≤ expected (random) overlap
- Alternative hypothesis: observed overlap > expected (random) overlap
- Decision rule: if the p-value < α = 0.001, reject the null hypothesis

## Hypergeometric Example: Comparing Microarray Data Sets

[Venn diagram: Human Microarray (6064 genes); the Leukemia (180) and Prostate (105) sets overlap in 64 genes, leaving 116 leukemia-only and 41 prostate-only genes.]

N = 6064 (genes on the microarray), k = 180 (leukemia), n = 105 (prostate), m = 64 (shared)

$$P = 1 - \sum_{x=0}^{m-1} \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}} = 1 - \sum_{x=0}^{63} \frac{\binom{180}{x}\binom{5884}{105-x}}{\binom{6064}{105}} = 7.6 \times 10^{-75}$$

Since p < 0.001, the null hypothesis is rejected.

## Multiple Hypothesis Testing

- For hypothesis testing, we compare the p-value calculated from the statistical test to a cut-off α (the type I error rate). α is typically 0.05 or 0.01.
- Decision rule:
  - If the p-value < α, reject the null hypothesis.
  - If the p-value ≥ α, do not reject the null hypothesis.
- If we test one hypothesis using α = 0.05, what is the probability of mistakenly rejecting the null hypothesis?
- If we test five hypotheses using α = 0.05, what is the probability of mistakenly rejecting at least one null hypothesis?

$$\text{Chance of false positive} = 1 - (0.95)^N = 1 - (0.95)^5 = 1 - 0.774 = 0.226$$

where N = number of tests.
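
The two calculations on these slides, the hypergeometric overlap p-value and the chance of at least one false positive across several tests, can be checked with a short Python sketch. The function names `hypergeom_upper_tail` and `familywise_error_rate` are illustrative and not part of the lecture. The false-positive formula assumes the tests are independent, as the slide's 1 - (0.95)^N expression implies.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(N, k, n, m):
    """P(X >= m), where X is the number of marked items among n draws
    made without replacement from N items, k of which are marked."""
    upper = min(k, n)  # the overlap cannot exceed either set size
    favourable = sum(comb(k, x) * comb(N - k, n - x) for x in range(m, upper + 1))
    # Exact integer arithmetic; the caller converts to float at the end.
    return Fraction(favourable, comb(N, n))


def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false positive in num_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Values from the slides: N = 6064, k = 180, n = 105, m = 64.
p_overlap = float(hypergeom_upper_tail(6064, 180, 105, 64))
fwer_five = familywise_error_rate(0.05, 5)

print(f"overlap p-value: {p_overlap:.2e}")
print(f"false-positive chance over 5 tests: {fwer_five:.3f}")
```

The sketch sums the upper tail (x = m up to min(k, n)) directly. This equals one minus the sum from 0 to m - 1 on the slide, but avoids subtracting a number very close to 1 in floating point, which would lose a p-value this small.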

## Bonferroni Correction for Multiple Hypothesis Testing

- To correct for the error due to multiple hypothesis testing, we use the Bonferroni correction:

$$\alpha' = \frac{\alpha}{k}$$

  - where k = number of independent significance tests
- For example, if we test whether the set of genes up-regulated in prostate cancer has a significant overlap with 10 other cancer data sets, then k = 10.
  - If
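
As a minimal sketch of the Bonferroni correction, the Python below divides the cut-off by the number of tests and compares a p-value to the adjusted threshold. The helper names `bonferroni_alpha` and `is_significant` are hypothetical. The choice of α = 0.05 is an assumption for illustration, since the source text is cut off before it states the cut-off used in its k = 10 example.

```python
def bonferroni_alpha(alpha, k):
    """Per-test cut-off alpha' = alpha / k for k significance tests."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return alpha / k


def is_significant(p_value, alpha, k):
    """Reject the null hypothesis only if p is below the corrected cut-off."""
    return p_value < bonferroni_alpha(alpha, k)


# Assumed values: alpha = 0.05 (not given in the source), k = 10 data sets.
adjusted = bonferroni_alpha(0.05, 10)
print(f"adjusted cut-off: {adjusted:.4f}")
print(is_significant(0.01, 0.05, 10))   # 0.01 is not below 0.005
print(is_significant(0.001, 0.05, 10))  # 0.001 is below 0.005
```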