CSCI2820: Expectation-Maximization Haplotype Phasing
October 23, 2013
Informally, expectation-maximization can be described in 4 steps:
1. Initialize haplotype frequencies. The most common initialization is to give each haplotype equal probability. That i
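The preview above shows only the initialization step. As a minimal sketch of how such an expectation-maximization loop could look, assuming the usual formulation for haplotype frequency estimation (genotypes coded 0/1/2 per SNP, Hardy-Weinberg proportions in the E-step, a fixed number of iterations), the Python below may help. The function names `compatible_pairs` and `em_haplotype_frequencies` and the genotype coding are illustrative assumptions, not taken from the course handout.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """Return all unordered haplotype pairs consistent with a genotype.

    genotype: tuple of 0/1/2 giving the alternate-allele count at each SNP
    (an assumed coding, chosen for illustration).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites are fixed: 0 -> 0, 2 -> 1
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, b in zip(het_sites, bits):
            h1[site], h2[site] = b, 1 - b  # the two haplotypes differ at het sites
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies from unphased genotypes with EM."""
    pairs_per_genotype = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted(
        {h for pairs in pairs_per_genotype for pair in pairs for h in pair}
    )
    # Initialization: every candidate haplotype gets equal probability.
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pairs_per_genotype:
            # E-step: posterior of each compatible pair under Hardy-Weinberg.
            weights = [(1 if a == b else 2) * freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: new frequencies are the expected haplotype counts, normalized.
        freq = {h: counts[h] / n_chrom for h in haplotypes}
    return freq


# Two 00/00 homozygotes, one 11/11 homozygote, and one double heterozygote.
example = [(0, 0), (0, 0), (2, 2), (1, 1)]
estimated = em_haplotype_frequencies(example)
```

In this toy data set the double heterozygote is the only ambiguous individual. Because haplotypes 00 and 11 are already supported by the homozygotes, the estimate shifts almost all of its weight onto the 00/11 phasing. A real implementation would typically stop on a convergence criterion rather than a fixed iteration count.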
CSCI2820: Medical Bioinformatics Homework 3 Solution
Problem Research Question Linkage Disequilibrium vs. Informativeness (35%)
We present here a sample of answers supplemented with key insights from students' solutions.
Many different answers are consider
CSCI2820: Medical Bioinformatics Homework 2 Solution
Problem Tag SNPs in Different Populations (Required - 20%)
Analyze the differences between the sets of tagging SNPs selected for each population.
Why does one population require more tag SNPs? Do allele f
CSCI2820: Medical Bioinformatics Homework 1
Due: 11:59PM October 1, 2013
This homework involves the use of Mathematica. You can hand in either the Mathematica notebook with the questions answered or a separate document. When going through the Mathematic
CSCI2820: Medical Bioinformatics Homework 1 Solution
Problem 1: A Bit of Mathematica Programming
1. Program the equation to compute r^2.
rsquared = (probs[1] - pi1plus*piplus1)^2 / (pi1plus*pi2plus*piplus1*piplus2);
2. Compute pairwise r^2 for the SNP Ma
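For readers without Mathematica, the r^2 expression in item 1 can be sketched in Python. This assumes that `pi1plus`, `pi2plus`, `piplus1`, and `piplus2` are the marginal allele frequencies of a 2x2 table of haplotype frequencies and that `probs[1]` is the frequency of the first haplotype. Those definitions are not visible in the preview, so the argument names below are illustrative.

```python
def r_squared(p11, p12, p21, p22):
    """r^2 between two biallelic SNPs from the four haplotype frequencies.

    p11..p22 are assumed to be the entries of the 2x2 haplotype frequency
    table (rows: alleles at SNP 1, columns: alleles at SNP 2), summing to 1.
    """
    p1_plus = p11 + p12  # frequency of allele 1 at the first SNP
    p2_plus = p21 + p22  # frequency of allele 2 at the first SNP
    p_plus1 = p11 + p21  # frequency of allele 1 at the second SNP
    p_plus2 = p12 + p22  # frequency of allele 2 at the second SNP
    d = p11 - p1_plus * p_plus1  # linkage disequilibrium coefficient D
    return d * d / (p1_plus * p2_plus * p_plus1 * p_plus2)


perfect_ld = r_squared(0.5, 0.0, 0.0, 0.5)        # only two haplotypes occur
no_ld = r_squared(0.25, 0.25, 0.25, 0.25)         # alleles are independent
```

The denominator is zero when either SNP is monomorphic, so a caller would need to filter such SNPs before computing pairwise values.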
CSCI2820: Medical Bioinformatics Homework 2
Due: 11:59PM October 22, 2013
Please complete all problems marked required; also, complete one of the following
problems: Problem Expectation Maximization Phasing Algorithm or Problem LD and
tagging SNPs Revie
CSCI2820: Medical Bioinformatics Midterm
Due: 11:59PM November 14, 2013
Complete either Problem 2 (BIO) or Problem 3 (COMP), but not both. The
other questions are required. Please hand in your submission by emailing it to Derek [email protected]
Medical Bioinformatics (CSCI295-L)
Genome-Wide Association Studies, Protein Folding and
Immunogenomics
Sorin Istrail
November 2010
SYLLABUS
Contents
1 Short Introductions
2 The Hardy-Weinberg Model
3 Linkage Disequilibrium and GWAS
4 Tagging SNPs an
CSCI2820: Medical Bioinformatics Homework 4
Problems 1-2 are due on November 27, 2013.
Problem 3 is due on December 4, 2013.
Please hand in your submission of Problems 1 and 2 by emailing it to Derek [email protected]
with subject csci2820 hw4 handin or l
CSCI2820: Medical Bioinformatics Homework 4 Solution
Problem 1: Minichiello-Durbin Algorithm and Ancestral Recombination Graph Reconstruction
Three individuals, changing objectives (20%)
Because there is a 1 in each position and only mutation can conver
CSCI2820: Medical Bioinformatics Homework 3
Due: 11:59PM November 5, 2013
Hand in your submission by emailing it to Derek [email protected] with subject
csci2820 hw3 handin, or in class if handwritten.
Problem 0: Research Question Linkage Disequilibrium v