


Systems Biology
The variable genome
Prof. M. Zabeau
Department of Plant Systems Biology
University of Gent
Academic year 2009-2010

  Sequence variations in the human genome
  The structure of sequence variations in the genome
    linkage disequilibrium
    haplotype blocks

Sequence Variations in the Human Genome
  Human sequence variation results from
    SNPs (single nucleotide polymorphisms)
      SNPs are the result of very rare replication errors in which a wrong base remains incorporated in the newly synthesized strand
      SNPs occur on average every 1,000 bases between individuals
      Major source of genetic variation in the human population
    Indels (insertions & deletions)
      Majority are small (1 to 5 bp) insertions or deletions
        Arise as a result of replication slippage
      Remainder comprise repeat length polymorphisms and larger indels
      Far fewer than SNPs, but recent evidence shows that indels represent the most variable part of the human genome

Sequence Variations in the Human Genome
  Human sequence variation is responsible for
    Phenotypic variation between individuals
      From hair & skin color to behavior
    Disease susceptibility
      Heritable rare (Mendelian) and common human diseases
  Mapping and characterization of the sequence variation in the human population
    Prerequisite for understanding the role of genetic variation in human biology, physiology and behavior
    Initially focused on the structural organization of SNPs
      Structural variation was underestimated at that time

Structure of sequence variation in the human genome
  Landmark studies in 2001 demonstrated the basic structure of sequence variation in the human genome
    1) Linkage disequilibrium in the human genome is much more extensive than was predicted
    2) Meiotic recombination is punctate and occurs primarily in short defined regions (recombination hotspots)
         Extended regions devoid of recombination
    3) SNPs are structured in haplotype blocks
         Haplotype blocks represent the structure of genetic variation in the human genome
    4) Haplotype blocks across chromosome 21
         Foundation for the HapMap project

Linkage Disequilibrium & Haplotypes
  [Figure: ancestral chromosomes carrying SNP alleles 1 and 2, and current chromosomes after N generations; without recombination the SNPs stay in linkage disequilibrium (LD) as haplotype 1 and haplotype 2, with recombination events they reach linkage equilibrium]
  Haplotype: a particular combination of alleles along a chromosome

Block-like Haplotype Diversity at 5q31
  [Table: eleven haplotype blocks at 5q31 with block sizes of 84, 3, 14, 30, 25, 11, 52, 21, 27, 55 and 19 kb; in each block two to four common haplotypes together account for 84% to 97% of the chromosomes]
  Reprinted from: Daly et al., Nature Genet. 29, 229 (2001)

Block-like Haplotype Diversity at 5q31
  The haplotype blocks are separated by hotspots of recombination
    short intervals in which all independent historical recombination events have occurred
  Most historical recombination events are clustered between the haplotype blocks
    little or no recombination in the blocks
  [Figure: historical recombination events]
  Reprinted from: Daly et al., Nature Genet.
29, 229 (2001)

Haplotype patterns
  Haplotype patterns in regions devoid of historic recombination
    are simple because genetic diversity arises only by mutation
    exhibit simple genealogical relationships
      SNPs arise on individual chromosomes in the population that already carry older haplotypes
      Haplotypes represent branches of a genealogical tree
  SNPs on the same branch are perfectly correlated
    Always co-occur in all samples
    Exhibit complete linkage disequilibrium
  SNPs on different branches can have
    imperfect correlation with SNPs on other branches
    no correlation with SNPs on other branches
  The International HapMap Consortium et al., Nature 437, 1299 (2005)

Haplotypes are branches of genealogical trees
  [Figure: 36 SNPs within region ENr131; haplotypes 1 to 7 shown as branches of a genealogical tree]
  The International HapMap Consortium et al., Nature 437, 1299 (2005)

Haplotype tag SNPs
  36 SNPs give only 7 haplotypes in 120 European chromosomes
    2^36 different haplotypes are possible, 7 haplotypes are observed
  Perfectly correlated SNPs
    Are in full linkage disequilibrium
    Exhibit r2 = 1 or D' = 1
    Can be assayed by one tag SNP
      proxy for all correlated SNPs
  Imperfectly correlated SNPs
    Are in partial linkage disequilibrium
    Exhibit r2 < 1 (0.5 to 0.8)
    Can be assayed by one tag SNP
  [Figure: perfectly correlated SNPs]
  The International HapMap Consortium et
al., Nature 437, 1299 (2005)

Implications of haplotype blocks
  Haplotype blocks can be treated as alleles in genome-wide association studies
    provide a crisp approach for testing the association of genomic segments with disease
  Haplotype tag SNPs are
    subsets of SNPs that can be used to uniquely distinguish the different common haplotypes in each block
  The consequence of the haplotype block structure is
    A subset of all the SNPs in the genome is sufficient for whole-genome association analysis
    The primary objective of the HapMap project
  Reprinted from: Daly et al., Nature Genet. 29, 229 (2001)

The HapMap project
  Discovery of SNPs in the human genome
  The haplotype map of the human genome
    First version
    Second version

Discovery of human SNPs
  The SNPs in the world population are estimated at
    ~9-10 million common SNPs (1 SNP per 300 bases)
      With a minor allele frequency (MAF) > 5%
      constitute 90% of the variation in the world population
    ~30 million rare SNPs
      With a minor allele frequency (MAF) < 5%
      constitute 10% of the variation in the world population
  First inventory of 2001 comprised 1.4 million SNPs
  SNPs were discovered by
    The public Human Genome Sequence Project (HGP)
    The SNP Consortium (TSC), a public/private consortium
    International SNP discovery effort
  Newly discovered SNPs are collected in a public database
    dbSNP: http://www.ncbi.nlm.nih.gov/SNP/

Public database dbSNP
  Human SNPs in the database
    11.8 million non-redundant SNPs
    5.7 million validated SNPs by SNP assays
  [Figure: non-redundant SNPs, validated SNPs and double-hit SNPs]
  The International HapMap Consortium et
al., Nature 437, 1299 (2005)

The International HapMap Project
  Objective was to genotype common SNPs in 270 individuals from four geographically diverse populations
    30 trios (father, mother and child) of European descent (CEU)
    30 trios of African descent (Yoruba tribe from Nigeria - YRI)
    90 unrelated samples of Asian descent
      44 Chinese individuals (CHB)
      45 Japanese individuals (JPT)
  HapMap v 21a comprises a total of 3.1 million SNPs
    Phase I map comprised a first 1.0 million SNPs
      High resolution SNP maps from 10 random 500 kb regions (ENCODE)
    Phase II map comprised an additional 2.1 million SNPs
      Represents 25-35% of the common SNPs (MAF > 5%)
      stringent QC of the SNP genotypes: accuracy > 99.5%
  Reprinted from: The International HapMap Consortium, Nature 449, 851-861 (2007)

Conclusions from the HapMap Project
  The picture of HapMap is fairly complete
    Regarding LD around common variants in the sampled populations
    Is unlikely to change much with additional data
    Updated HapMap information is available at: http://www.hapmap.org/
  One important concern is
    Utility of the HapMap SNPs across all world populations
    Addressed in the next paper: Conrad et al., Nat. Genet. 38, 1251-1260 (2006)
  Genome-wide association studies using HapMap SNP genotyping data
    Delivered an explosion of genes implicated in common diseases
  The International HapMap Consortium et
al., Nature 437, 1299 (2005)

Selection of haplotype tag SNPs
  Phase II HapMap captures most of the common SNPs
    Any common SNP is linked at high LD (r2) to a typed SNP
    r2 ranges from 0.90 in YRI to 0.95 in CEU and CHB+JPT
    Phase II HapMap is a complete resource for selecting tag SNPs
  Number of tag SNPs required to capture common SNPs
    Depends on the population and the coverage determined by r2
    African populations require twice as many tag SNPs

    Threshold    CEU          CHB+JPT      YRI
    r2 >= 0.5    290,969      277,831      627,458
    r2 >= 0.8    552,853      520,111      1,093,422
    r2 = 1.0     1,024,665    1,078,959    1,616,739

  Reprinted from: The International HapMap Consortium, Nature 449, 851-861 (2007)

The future of the HapMap Project
  Extend the HapMap by
    Extensively sequencing and genotyping
      additional samples from the HapMap populations
      samples from 7 additional populations
        Luhya in Webuye, Kenya; Maasai in Kinyawa, Kenya; Tuscans in Italy; Gujarati Indians in Houston, Texas, USA; Denver (Colorado) metropolitan Chinese community; US people of Mexican origin; US people of African ancestry
    Providing information on rarer variants
    Enabling genome-wide association in additional populations
  Whole-genome sequencing will provide a natural convergence of technologies to type both SNP and structural variation
  Reprinted from: The International HapMap Consortium, Nature 449, 851-861 (2007)

The human genome diversity project
  Project collected ~1,000 samples from 52 worldwide populations
    Comprising 1064 cultured lymphoblastoid cell lines from individuals in 52 different world populations
    Human genome diversity cell line panel - HGDP-CEPH panel
  Reprinted from: Cann et al., Science 296, 261-262 (2002)

Haplotype structure in diverse populations
  Haplotype structures in nearby populations are similar
    Especially for populations from the same continent
  Reprinted from: Conrad et al., Nat. Genet.
38, 1251-1260 (2006)

World map of haplotype diversity
  Progressive decline of haplotype diversity correlated with the distance from Africa
    shows a serial dilution of genetic variation through populations
    Africa exhibits a complex and variable mosaic of haplotypes
    Oceania and the Americas exhibit fewer and simpler haplotypes
  Decline along the path of human migrations
    from Africa into the Middle East
    from the Middle East to Europe and Central and South Asia
    from Central/South Asia to East Asia
    from East Asia to Oceania and the Americas
  Global patterns are consistent with
    Phylogenetic tree of the human population
    "Out of Africa" model of the migration of modern Homo sapiens
  Reprinted from: Conrad et al., Nat. Genet. 38, 1251-1260 (2006)

World map of haplotype diversity
  [Figure: world map of haplotype diversity]
  Reprinted from: Conrad et al., Nat. Genet. 38, 1251-1260 (2006)

The migration of modern Homo sapiens
  Genetic and archaeological evidence supporting the "Out of Africa" spread of humans with a
    First expansion within Africa from East Africa at 100 kya
    Second expansion from Africa into Eurasia at 40-60 kya
      occupation of Asia, Europe and Oceania at 60-40 kya
      occupation of America at 15-35 kya
  Reprinted from: Cavalli-Sforza L. & Feldman M., Nat Genet.
33, 266-275 (2003)

The structural variation mapping project
  Mapping structural variation in the human genome
    Overview of the recent discoveries
    Copy number variations in humans are ubiquitous
    Towards a complete map of structural variation in the human genome

Structural variation in the human genome
  Recently it has become increasingly recognized that
    Major proportion of genetic difference in humans is due to common structural variation of the genome, including
      copy-number variants (CNVs)
        Insertions, deletions and duplications
      balanced chromosomal rearrangements
        inversions and translocations
  New insight results from
    Improvements in microarray technologies
    New methods for structural variant detection
      sequencing-based or SNP-based
  Reprinted from: J. Sebat, Nat. Genet. 39, S3-S5 (2007)

Structural polymorphisms and disease
  Structural changes in genes are well known
    Rare Mendelian diseases
      color blindness, haemophilia and thalassaemia
    Genomic disorders caused by large chromosome rearrangements
      Prader-Willi syndrome and velocardiofacial syndrome
  Reprinted from: Eichler et al., Nature 447, 161-165 (2007)

Methods for detecting structural variation
  Microscopic methods
    Cytogenetic detection of structural variation
      fluorescence in situ hybridization (FISH)
  Genome-wide array-based methods
    Array-based comparative genome hybridization (array-CGH)
      Robust for genome-wide scans of copy-number variants (CNVs)
        Insertions, deletions and duplications
      Misses balanced rearrangements (inversions and translocations)
    Two types of microarrays
      Genome-wide arrays of BAC clones or long oligonucleotides
      High density SNP genotyping arrays
  Sequencing-based methods
    Genome-wide paired-end sequencing
      Massive parallel sequencing
  Reprinted from: Feuk et al., Nature Rev. Genet.
7, 85-97 (2006)

Array-based detection of copy-number variants
  Array-based comparative genome hybridization (CGH)
    Detects copy-number differences from fluorescence ratios (Cy3:Cy5) between the two DNA samples
  Reprinted from: Feuk et al., Nature Rev. Genet. 7, 85-97 (2006)

Two CNV genome hybridization platforms
  Comparative genome hybridization (BAC) array
    Better at detecting larger deletions
  500,000 SNP microarray (Affy)
    Better at detecting smaller deletions
  Reprinted from: Redon et al., Nature 444, 444-454 (2006)

The two CNV detection platforms
  CNV detection using log2 ratio of copy number
  Example: large duplication in chr 8 identified on both platforms
  [Figure: log2 ratio profiles on the WGTP array and the 500k SNP array]
  Reprinted from: Redon et al., Nature 444, 444-454 (2006)

Global analysis of the CNVs discovered
  Discovered a total of 1,447 CNVRs
    Size range varies from a few kb to several Mb
    66% detected on the two platforms
  Reprinted from: Redon et al., Nature 444, 444-454 (2006)

Genomic distribution of CNVs
  CNVs are heterogeneously distributed in the genome
  CNVs are often associated with segmental duplications
    Half of the gaps in the genome sequence are flanked by CNVs
    CNVs arise by recombination between duplicated segments
  Reprinted from: Redon et
al., Nature 444, 444-454 (2006)

Genomic impact of CNVs
  Overlap of CNVs and functional elements
    CNVs are preferentially located outside genes and ultraconserved elements
    But functional sequences are often located within and flanking CNVs
    Duplications overlap genes more often than deletions
      deletions are under stronger purifying selection than duplications
  Genes overlapping with CNVs
    Enriched for certain GO categories
      cell adhesion, sensory perception of smell and of chemical stimulus
    Under-represented for certain GO categories
      Dosage sensitive genes
      Cell signaling, cell proliferation and kinase- and phosphorylation-related categories
  Reprinted from: Redon et al., Nature 444, 444-454 (2006)

Database of Genomic Variants
  Database for structural variants in the human genome
    Currently ~4,000 entries
    Far fewer than SNPs in dbSNP
    DGV: http://projects.tcag.ca/variation/

Human structural variation map
  Objective of the new project
    Map common structural variation in the HapMap individuals
    Using sequencing methods that are well suited to discover and characterize all forms of structural variation
      Not only copy-number variants (CNVs)
        Insertions, deletions and duplications
      But also balanced chromosomal rearrangements
        inversions and translocations
  Goals of the project
    Discover the common structural variation
    Characterize the structural variations at the sequence level
    Complete the map of human genetic variation
      Map the structural variants onto the reference HapMap
  Reprinted from: Eichler et al.,
Nature 447, 161-165 (2007)

Paired-end sequence approach
  Sequencing of two types of clone libraries from # individuals
    fosmid libraries (40 kb inserts): inserts with SD of 1.5 kb
      allows detection of structural variation as small as 5 kb (> 3 SD)
    BAC libraries (~150 kb inserts): wide insert size distribution
      allows detection of structural variation > 50 kb
  Reprinted from: Eichler et al., Nature 447, 161-165 (2007)

Medical applications
  Genome-wide association scanning for risk factors in common diseases
  The human cancer genome project

Heritable human diseases
  Heritable 'Mendelian' disorders
    Rare disorders with very low frequencies (< 1/1000)
    High heritability
      Caused by variation in single genes
      Typical familial inheritance
      E.g. CF, syndromes...
    Great progress: > 1000 genes have been identified in which
  Common human diseases
    Frequent disorders that are common in the population (> 10%)
      E.g. cancer, diabetes, autoimmunity, cardiovascular and psychiatric diseases
    Caused by the combined effect of many different genes interacting with environmental factors
    Slow progress: only a few genes identified so far
      due to the inherent limitations of the methods for genetic analysis
  The International HapMap Consortium et
al., Nature 437, 1299 (2005)

Genetic analysis of common human diseases
  Past studies of common diseases used two approaches
    Family-based linkage studies across the entire genome
      Limitation: linkage analysis has low power except when a single locus explains a substantial fraction of disease
    Population-based association studies of candidate genes
      Limitation: candidate genes imply that one has knowledge on the molecular processes involved in the disease
    Only very few studies were successful
  Validated genetic variants involved in common diseases include
    APOE4 in Alzheimer's disease
    PTPN22 and CTLA4 in type 1 diabetes
    PPARG and KCNJ11 in type 2 diabetes
    NOD2 in inflammatory bowel disease
    HLA: autoimmunity and infection
  The International HapMap Consortium et al., Nature 437, 1299 (2005)

Genome-wide association scans
  Finding the risk factors in common diseases requires
    Large scale systematic scanning of common genetic variants in large populations (100s to 1000s) of
      affected individuals
      healthy individuals
  Low cost and high density SNP arrays made
    Large scale genome-wide scanning of common SNPs feasible
      2006: arrays for genotyping of ~500,000 SNPs
      2007: arrays for genotyping of ~1 million SNPs and CNV probes
  Genome-wide association scans
    The first large scale scans were conducted in 2006
  The International HapMap Consortium et
al., Nature 437, 1299 (2005)

Genome-wide association scans
  Measure the allele frequencies of 500,000 SNPs in cohorts of
    patients
    controls
  Statistical data analysis shows
    Most SNPs show no allele frequency difference between the two cohorts
    A few SNPs show a clear difference in allele frequency
      The SNP allele that is more frequent in patients represents the genetic risk factor
  Reprinted from: Bowcock, Nature 447, 645-646 (2007)

Genome-wide association scans
  Candidate SNP alleles exhibit statistically significant association
  [Figure: association study for type 2 diabetes with 386,731 SNPs; candidate SNP alleles highlighted]
  Reprinted from: Couzin and Kaiser, Science 316, 820-822 (2007)

Genome-wide association scans
  Association scans define genetic risk factors as
    SNP alleles showing statistically significant associations with the disease
      The SNP defines the genomic position of the genetic factor
      The SNP allele increases the risk of the disease and can be
        causal: causing the disease
        linked to the causal SNP allele: linkage disequilibrium
  Genetic factors identified in association scans
    Must be validated in independent replication studies
      other patient-control cohorts
    Often represent only a small fraction of the overall disease risk
      Clinical relevance needs to be demonstrated further

Selected recent genome-wide scan results

  Publication date   Disease                      Sample size   Genes or variants found             Approximate increased risk for homozygote
  2005               Macular degeneration         1,700         1 new gene                          400% to 600%
  2006               Inflammatory bowel disease   4,500         1 new gene                          120%
  2007               Prostate cancer              17,500        2 variants in same region (1 new)   123%
  2007               Obesity                      38,700        1 new gene                          67%
  2007               Type 2 diabetes              32,500        9 variants (3 new)                  80%
  2007               Heart disease                41,600        1 new variant                       25% to 40%

  Reprinted from: Couzin and Kaiser,
Science 316, 820-822 (2007)

Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls
  The Wellcome Trust Case Control Consortium, Nature 447, 661-678 (2007)
  Paper presents the results of a large collaborative genome-wide association study for 7 common diseases
    rheumatoid arthritis
    hypertension
    Crohn's disease - inflammatory bowel disease
    coronary artery disease
    bipolar disorder - manic depression
    type 1 diabetes
    type 2 diabetes

The Wellcome Trust Case Control Consortium study
  The study used a powerful study design using
    ~2,000 patients per disease
    ~3,000 shared controls
  The study is groundbreaking
    Confirms the involvement of previously reported disease genes
    Identifies the involvement of several novel genes that affect susceptibility to common diseases
    Models a successful and instructive approach to large-scale genomic scans
      showing that a set of common controls can be used for a variety of diseases with relatively little loss of analytical power
  Reprinted from: The Wellcome Trust Case Control Consortium, Nature 447, 661-678 (2007)

The human cancer genome
  Cancer is - in essence - a genetic disease
    Affecting one in three people in the Western world
  What we know
    Many important genes responsible for the genesis of various cancers
    the pathways through which they act
  The challenges for the next decade
    The discovery of new genes that have a causal role in neoplasia, particularly those that initiate and conclude the process
    The delineation of the pathways through which these genes act and the basis for the varying actions in specific cell types
    The development of new ways to exploit this knowledge for the benefit of patients and the treatment of cancer
  An outstanding review: "Cancer genes and the pathways they control", Vogelstein and Kinzler,
Nat. Med. 10, 789 (2004)

The Human Cancer Genome Project
  The proposed project comprises
    a 10-year, one billion $ effort to identify all major mutations in the most common human cancers
    Proposal: systematic search for the common mutations (5% frequency) in 250 tumor samples from 50 major cancer types
    Rationale: novel genes will support the development of new drugs
  Progress in the human cancer genome project
    Large scale sequencing of protein-coding genes in DNA from tumors
    First two papers on the genomic landscapes of human breast and colorectal cancers appeared recently

Strategy for identifying mutations
  Studied two common tumor types
    breast and colorectal cancers
    Major clinical importance worldwide
      2.2 million cancer diagnoses (20% of the total)
      940,000 cancer deaths each year (14% of the total)
    set of 13,000 consensus coding sequences (CCDS) that represent the most highly curated protein-coding gene set currently available
  The goals of the study were to
    Develop a methodological strategy for conducting genome-wide analyses of cancer genes in human tumors
    Determine the spectrum and extent of somatic mutations in human tumors
    Identify new cancer genes and molecular pathways that could lead to improvements in diagnosis or therapy
  Reprinted from: Sjöblom et al., Science 314, 268-274 (2006)

Mutation discovery screen, first study
  13,023 genes
  ~135,000 amplicons
  22 tumor samples
  ~816,000 changes
  ~20,000 candidates
  1,307 tumor mutations in 1,149 genes
  Reprinted from: Sjöblom et al., Science 314, 268-274 (2006)

Mutation validation screen, first study
  1,149 genes
  24 tumor samples
  ~134,000 changes
  ~2,500 candidates
  365 tumor mutations in 236 genes
  Reprinted from: Sjöblom et
al., Science 314, 268-274 (2006)

Cancer genome landscapes
  Screens yielded 280 candidate cancer (CAN) genes
    CAN genes ranked by cancer mutation prevalence (CaMP) score
      number and nature of the mutations
  Mutational landscapes of genes in two tumours
    mutated genes are represented by dots on a 2-dimensional map, placed according to their chromosomal position
    Peaks represent CAN genes with the 60 highest CaMP scores
    Gene mountains represent well known oncogenes
  Reprinted from: Wood et al., Science 318, x-x (2007)

Candidate cancer genes
  Number of mutant CAN genes per tumor
    Breast cancers: an average of 14 mutant CAN genes
    Colorectal cancers: an average of 15 mutant CAN genes
  CAN genes fall in three classes
    Genes previously known to be mutated in human cancers
      Validates the screen
      Identified all the known genes and in addition identified also
        Sporadically mutated genes
        Genes found in other tumors
    Genes linked to cancer through functional studies
      but in which no previous mutations in cancers were known
    Genes that were not suspected to be involved in cancer
      large number of the CAN genes
  Reprinted from: Sjöblom et
al., Science 314, 268-274 (2006)

New view of cancer
  Metastatic tumors carry a larger number of mutations
    Cancers harbor an average of 14 to 15 mutations in CAN genes
      explains their biochemical, biological, and clinical heterogeneity
    Each mutation is associated with a small fitness advantage in driving tumour progression
  Cancers harbor fewer mutated pathways
    Since many genes belong to pathways, the number of mutated pathways is estimated at not more than 20
  Many genes have unknown functions in cancer
    Large-scale mutational analysis proves useful for identifying novel genes involved in human cancer
      Several of these were recently validated
  Reprinted from: Wood et al., Science 318, x-x (2007)

Future implications
  Cancer genome sequencing
    Methodological strategy must be further improved
      Improving the efficiency of detection of genuine mutants
      Screen a larger number of tumor samples
    Understanding the precise role of the genetic alterations in tumorigenesis will be more challenging
    Ultimately the screens should also cover non-coding sequences
      Need new tools to identify "functional" mutations in such sequences
  The cancer genome landscape shows that
    Personal cancer genomics is becoming reality
    Personal genomic data on cancers can be exploited
      Personalized immunotherapy
      Biomarkers for monitoring tumor progression
  Reprinted from: Sjöblom et al., Science 314, 268-274 (2006)

Recommended reading
  Human Haplotype Map
    The Structure of Haplotype Blocks in the Human Genome
      Daly et al., Nature Genet. 29, 229 (2001)
    Haplotype map of the human genome
      The International HapMap Consortium, Nature 449, 851 (2007)
  Human Cancer Genome
    Sequencing cancer genomes
      Sjöblom et
al., Science 314, 268-274 (2006)

Further reading
  Sequence variations in the human genome
    A map of human genome sequence variation
      The International SNP Map Working Group, Nature 409, 928 (2001)
    Linkage disequilibrium in the human genome
      Reich et al., Nature 411, 199 (2001)
    Recombination hotspots in the human genome
      Jeffreys et al., Nature Genet. 29, 217 (2001)
    The structure of haplotype blocks in the human genome
      Patil et al., Science 294, 1719 (2001)
      Hinds et al., Science 307, 1072-1079 (2005)
  Human evolution
    Application of molecular techniques to study human evolution
      Cavalli-Sforza L. & Feldman M., Nat Genet. 33, 266-275 (2003)
  The HapMap project
    Objectives of the HapMap project
      The International HapMap Consortium, Nature 426, 789-796 (2003)
    First generation haplotype map of the human genome
      The International HapMap Consortium et al., Nature 437, 1299 (2005)
    Second generation haplotype map of over 3.1 million SNPs
      The International HapMap Consortium, Nature 449, 851 (2007)
    Private mapping effort by Perlegen
      Hinds et al., Science 307, 1072-1079 (2005)
    Detection of positive selection in human populations
      Sabeti, P. C. et al., Nature 449, 913-918 (2007)
    Ethics and science in the International HapMap Project
      The International HapMap Consortium, Nature Reviews Genetics 5, 467-475 (2004)
    Worldwide survey of haplotype variation
      Conrad et al., Nat. Genet. 38, 1251-1260 (2006)
  Structural variation in the human genome
    Overview of the recent discoveries; copy number variations in humans are ubiquitous; complete map of structural variation in the human genome
      J. Sebat, Nat. Genet. 39, S3-S5 (2007)
      Feuk et al., Nature Rev. Genet. 7, 85-97 (2006)
      Redon et al., Nature 444, 444-454 (2006)
      Korbel et al., Science 318, 420-426 (2007)
      Eichler et al., Nature 447, 161-165 (2007)
  High-throughput SNP genotyping
      Sobrino et al., Forensic Sci. Int. 154, 181-194 (2005)
      Fan et al., Nature Rev. Genet. 7, 632-644 (2006)
      Gunderson et al., Genome Res. 14, 870-877 (2004)
  Genome-wide association studies
    Overview
      Bowcock, Nature 447, 645-646 (2007)
    Novel risk loci for type 2 diabetes
      Sladek et al., Nature 445, 881-885 (2007)
    Genome-wide association study of seven common diseases
      The Wellcome Trust Case Control Consortium, Nature 447, 661-678 (2007)
  Human Cancer Genome
    Cancer genes and the pathways they control
      Vogelstein and Kinzler, Nat. Med. 10, 789 (2004)
    Sequencing cancer genomes
      Sjöblom et al., Science 314, 268-274 (2006)
      Wood et al., Science 318, x-x (2007)
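
Worked example: linkage disequilibrium measures

The tag SNP slides describe perfectly correlated SNPs (r2 = 1) and imperfectly correlated SNPs (r2 < 1) without showing how the measures are obtained. The following is a minimal sketch, assuming phased haplotype counts for two biallelic SNPs are available. The function name `ld_measures`, the allele labels A/a and B/b, and the example counts are hypothetical and do not come from the slides.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r2) for two biallelic SNPs from haplotype counts.

    n_AB ... n_ab are counts of the four possible two-SNP haplotypes
    (hypothetical labels: alleles A/a at SNP 1, alleles B/b at SNP 2).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n                 # frequency of the AB haplotype
    p_A = (n_AB + n_Ab) / n         # frequency of allele A
    p_B = (n_AB + n_aB) / n         # frequency of allele B

    # D: deviation of the observed haplotype frequency from the value
    # expected if the two SNPs were independent (linkage equilibrium).
    d = p_AB - p_A * p_B

    # D' rescales |D| by its maximum given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r2 is the squared correlation between the two SNPs.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical counts in 120 chromosomes.
same_branch = ld_measures(60, 0, 0, 60)         # only two haplotypes present
different_branch = ld_measures(60, 0, 30, 30)   # three haplotypes present
```

With these made-up counts the first pair gives D' = 1 and r2 = 1, so either SNP is a full proxy for the other. The second pair still gives D' = 1, because one of the four haplotypes is absent, but r2 is only about 0.33. This is why tag SNP selection thresholds such as those in the tag SNP table are usually stated in terms of r2.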
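
Worked example: allele frequency comparison in an association scan

The association scan slides compare SNP allele frequencies between a patient cohort and a control cohort. One common way to do this for a single SNP is a chi-square test on the 2 x 2 table of allele counts; this is an illustrative choice, not necessarily the statistic used in the cited studies. The function name `allelic_chi_square` and all allele counts below are hypothetical; only the cohort sizes (about 2,000 patients and 3,000 controls, so 4,000 and 6,000 alleles) echo the Wellcome Trust study design.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2 x 2 allele count table.

    Returns (chi2, p_value). Counts are alleles, i.e. two per individual.
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    row_cases, row_ctrls = a + b, c + d
    col_alt, col_ref = a + c, b + d
    denom = row_cases * row_ctrls * col_alt * col_ref
    if denom == 0:
        # A degenerate table carries no information about association.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical SNP: allele frequency 0.35 in patients, 0.30 in controls.
chi2, p = allelic_chi_square(1400, 2600, 1800, 4200)

# With ~500,000 SNPs tested, a simple Bonferroni correction of a 0.05
# significance level gives a per-SNP threshold of 0.05 / 500000 = 1e-7.
bonferroni_threshold = 0.05 / 500_000
```

Because hundreds of thousands of SNPs are tested in one scan, a per-SNP p-value has to be far below the usual 0.05 before a SNP counts as a candidate, which is one reason why the slides stress large cohorts and independent replication.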

This note was uploaded on 05/28/2010 for the course WE BIBI010000 taught by Professor Marnik Vuylsteke during the Spring '10 term at Ghent University.
