The Lander-Green Algorithm
in Practice
Biostatistics 666
Lecture 21
Last Lecture:
Lander-Green Algorithm
$$L = \sum_{I_1} \cdots \sum_{I_m} P(I_1) \prod_{i=2}^{m} P(I_i \mid I_{i-1}) \prod_{i=1}^{m} P(X_i \mid I_i)$$
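The likelihood above sums over every path of inheritance states, which a forward recursion can evaluate without enumerating the paths. The following is a minimal sketch of that recursion, not the lecture's implementation: the function name `chain_likelihood`, the three-state example, and all numeric values are made up for illustration.

```python
def chain_likelihood(prior, transitions, emissions):
    """Evaluate L = sum over paths of P(I_1) * prod P(I_i | I_{i-1}) * prod P(X_i | I_i).

    prior[s]             : P(I_1 = s)
    transitions[k][a][b] : P(next state = b | current state = a) between markers k+1 and k+2
    emissions[k][s]      : P(X_{k+1} | I_{k+1} = s)
    """
    n_states = len(prior)
    # alpha[s] = P(X_1, I_1 = s)
    alpha = [prior[s] * emissions[0][s] for s in range(n_states)]
    for trans, emit in zip(transitions, emissions[1:]):
        # Move one marker to the right, then weight by the data at that marker.
        alpha = [
            sum(alpha[a] * trans[a][b] for a in range(n_states)) * emit[b]
            for b in range(n_states)
        ]
    return sum(alpha)


# Illustrative numbers only (assumed, not taken from the lecture).
prior = [0.25, 0.50, 0.25]
trans = [
    [0.80, 0.15, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
emissions = [
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]
likelihood = chain_likelihood(prior, [trans, trans], emissions)
```

The work grows linearly with the number of markers m and quadratically with the number of states, whereas direct enumeration grows exponentially in m.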
Similar multipoint sib-pair analysis, but with:
More general definition for I, t
Biostatistics 666
Problem Set 8
Due April 6, 2006
Inheritance Vector Based Pedigree Analysis
1. Consider a pair of affected siblings genotyped for a marker with 2 alleles, with
frequencies p1 = 0.6 and p2 = 0.4. The two siblings and their mother were geno
Biostatistics 666
Problem Set 6
Due March 30, 2006
Multipoint Analysis
1. Two siblings were genotyped at three consecutive SNP markers. For the first
sibling the genotypes A/A, C/C, G/G were observed at the three markers. For the
second sibling, genotypes
Biostatistics 666
Problem Set 6
Due Thursday March 23
Linkage Analysis of Affected Relative Pairs
1. Consider a polymorphism with three alleles, with frequencies p1 = 0.50, p2 = 0.30
and p3 = 0.20.
a) If a sample of affected sibling pairs were collected,
Biostatistics 666
Problem Set 3
Due February 8, 2006
Modeling Events in the Coalescent
1. Consider the genealogy of a set of three sequences from a population of N = 2500
diploid individuals. Assume that the recombination rate per sequence per
generation
Biostatistics 666
Problem Set 1
Due Tuesday, January 24, 2005
1. Consider a population where allele frequencies differ between the sexes. Assume that
there are equal numbers of males and females and that genotypes occur in Hardy-Weinberg proportions within
Name: _
Major: _
1. Are you taking this course for credit?
If not, are you planning to attempt the homework sets?
2. Why are you interested in statistical modeling in genetic studies?
3. What are you hoping to learn from this course?
4. Have you ever take
Parametric Linkage
Analysis
Biostatistics 666
Lecture 25
Last Lecture
Elston-Stewart Algorithm
Can handle large pedigrees
Proceeds one nuclear family at a time
Limited to small numbers of markers
Calculates conditional probabilities for
sections of the
The Elston-Stewart
Algorithm
Biostatistics 666
Lecture 24
Scheduling Important Dates
Remaining Lectures, April 5, 7, 14
Polio Symposium, April 12
Rackham Auditorium, starts at 9:30
Review Session, April 19
Final Exam, April 27
Last Lecture
The Lander Gre
The Lander-Green
Algorithm
Biostatistics 666
Lecture 22
Last Lecture
Relationship Inference
Likelihood of genotype data
Adapt calculation to different relationships
Siblings
Half-Siblings
Unrelated individuals
Importance of modeling error
Today
The L
Checking Pairwise
Relationships
Lecture 19
Biostatistics 666
Last Lecture:
Markov Model for Multipoint Analysis
[Figure: Markov chain of hidden states I_1, I_2, I_3, ..., I_M linked by transition probabilities P(I_2 | I_1), P(I_3 | I_2), ...; each state I_i emits the observed data X_i with probability P(X_i | I_i)]
IBD states along the ch
Multipoint Analysis for
Sibling Pairs
Biostatistics 666
Lecture 18
Previously
Linkage analysis with pairs of individuals
Non-parametric IBS Methods
Maximum Likelihood IBD Based Method
Possible Triangle Constraint
ASP Methods Covered So Far
Increasing
Modeling IBD for
Pairs of Relatives
Biostatistics 666
Lecture 17
Previously
Linkage Analysis of Relative Pairs
IBS Methods
Compare observed and expected sharing
IBD Methods
Account for frequency of shared alleles
Provide estimates of IBD sharing at ea
Replacing IBS with IBD:
The MLS Method
Biostatistics 666
Lecture 15
Previous Lecture
Analysis of Affected Relative Pairs
Test for Increased Sharing at Marker
Expected Amount of IBS Sharing
Previous Lecture:
Expected IBS Sharing
Calculated probability of I
IBS Methods for
Affected Pairs Linkage
Biostatistics 666
Lecture 14
Genetic Mapping
Compares the inheritance pattern
of a trait with the inheritance pattern
of chromosomal regions
Positional Cloning
Allows one to find where a gene is,
without knowing what
E-M for Haplotyping
Revisited
Biostatistics 666
Lecture 11
Example
We'll estimate haplotype frequencies for the
example below
      AA   Aa   aa
BB    25   20    4
Bb    30   12    0
bb     9    0    0
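As a hedged illustration of how E-M could be applied to two-locus genotype counts like these, the sketch below treats the double heterozygotes (Aa/Bb) as the only phase-ambiguous genotypes. It assumes the counts are read with the first locus (AA, Aa, aa) across and the second locus (BB, Bb, bb) down; the function name `em_two_locus`, the starting values, and the iteration count are illustrative choices, not taken from the lecture.

```python
def em_two_locus(counts, n_iter=200):
    """E-M estimate of haplotype frequencies for two biallelic loci.

    counts[(g1, g2)] = number of individuals carrying g1 copies of allele A
    at the first locus and g2 copies of allele B at the second locus.
    """
    n_haps = 2 * sum(counts.values())
    # Haplotype counts from genotypes whose phase is known.
    fixed = {"AB": 0.0, "Ab": 0.0, "aB": 0.0, "ab": 0.0}
    for (g1, g2), n in counts.items():
        if (g1, g2) == (1, 1):
            continue  # double heterozygote: phase is ambiguous
        if g1 != 1:
            # First locus homozygous: both haplotypes share its allele.
            first = "A" if g1 == 2 else "a"
            fixed[first + "B"] += n * g2
            fixed[first + "b"] += n * (2 - g2)
        else:
            # First locus heterozygous, second locus homozygous.
            second = "B" if g2 == 2 else "b"
            fixed["A" + second] += n
            fixed["a" + second] += n

    n_dh = counts.get((1, 1), 0)
    freq = {h: 0.25 for h in fixed}  # assumed starting point
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is AB/ab rather than Ab/aB.
        cis = freq["AB"] * freq["ab"]
        trans = freq["Ab"] * freq["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: add the expected haplotype counts and renormalize.
        freq = {
            "AB": (fixed["AB"] + n_dh * w) / n_haps,
            "Ab": (fixed["Ab"] + n_dh * (1 - w)) / n_haps,
            "aB": (fixed["aB"] + n_dh * (1 - w)) / n_haps,
            "ab": (fixed["ab"] + n_dh * w) / n_haps,
        }
    return freq


# Counts keyed by (copies of A, copies of B), under the table reading assumed above.
example_counts = {
    (2, 2): 25, (1, 2): 20, (0, 2): 4,
    (2, 1): 30, (1, 1): 12, (0, 1): 0,
    (2, 0): 9,  (1, 0): 0,  (0, 0): 0,
}
haplotype_freq = em_two_locus(example_counts)
```

Whatever phase the E-step assigns, the marginal allele frequencies implied by the estimates stay equal to those obtained by direct allele counting, which gives a quick sanity check on the result.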
Stratification vs Disequilibrium
Ancestor
Present-day
Population A
Population B
Stratific
Haplotype Based
Association Tests
Biostatistics 666
Lecture 10
Last Lecture
Statistical Haplotyping Methods
Clark's greedy algorithm
The E-M algorithm
Stephens et al. coalescent-based algorithm
Hypothesis Testing
Often, haplotype frequencies are not
fin
Haplotyping
Biostatistics 666
Lecture 9
Last Lecture
Introduction to the E-M algorithm
Approach for likelihood optimization
Examples related to gene counting
Allele frequencies estimation
Haplotype frequency estimation
Today:
Other approaches for haplot
Maximum Likelihood Estimation
for Allele Frequencies
Biostatistics 666
Lecture 7
Last Three Lectures:
Introduction to Coalescent Models
Computationally efficient framework
Alternative to forward simulations
Predictions about sequence variation
Number of
Coalescent Models
With Recombination
Biostatistics 666
Lecture 6
So far
Basic Properties of the Coalescent
MRCA
Coalescence times
Number of mutations
Frequency spectrum of polymorphisms
Predicting number of variants in a sample
Today
Further refining
Distribution of Mutations
Biostatistics 666
Lecture 5
Last Lecture:
Introduction to the Coalescent
Coalescent approach
Proceed backwards through time.
Genealogy of a sample of sequences.
Infinite sites model
All mutations distinguishable.
No reverse m
Introduction to
Coalescent Models
Biostatistics 666
Lecture 4
Last Lecture
Linkage Equilibrium
Expected state for distant markers
Linkage Disequilibrium
Association between neighboring alleles
Expected to decrease with distance
Measures of linkage dise
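To make the idea of association between alleles concrete, here is a small sketch of three widely used disequilibrium summaries (D, D' and r²) for a pair of biallelic loci. Whether the lecture covers exactly these measures is an assumption; the function name `ld_measures` and its arguments are hypothetical, and both loci are assumed to be polymorphic.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    Assumes 0 < p_a < 1 and 0 < p_b < 1.
    """
    # D is the excess of the AB haplotype over its equilibrium expectation.
    d = p_ab - p_a * p_b
    # Largest magnitude D can take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

Under linkage equilibrium the haplotype frequency equals the product of the allele frequencies, so all three values are zero.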
Linkage Disequilibrium
Biostatistics 666
Last Lecture
Basic properties of a locus
Allele Frequencies
Genotype Frequencies
Hardy-Weinberg Equilibrium
Relationship between allele and genotype frequencies
that holds for most genetic markers
Exact Tests for H
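The Hardy-Weinberg relationship between allele and genotype frequencies mentioned above can be sketched in a few lines. This assumes a single locus with two alleles; the function name `hardy_weinberg` and the allele labels are illustrative.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for a biallelic locus with allele A at frequency p."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


expected = hardy_weinberg(0.6)
```

The three expected frequencies always sum to one, since they are the terms of (p + q)².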
Genes in Populations:
Hardy-Weinberg Equilibrium
Biostatistics 666
Previous Lecture:
Primer In Genetics
How information is stored in DNA
How DNA is inherited
Types of DNA variation
Common designs for Genetic studies
Recommended Reading
Lander and Schork (
Biostatistics 666
Statistical Models in
Human Genetics
Instructor
Gonçalo Abecasis
Course Logistics
Grading
Office Hours
Class Notes
Course Objective
Provide an understanding of statistical
models used in gene mapping studies
Survey commonly used algorithm