•This method will be completed using 10x redundancy to eliminate errors and reduce the possibility of missing any targeted regions.
•The Celera Assembler is one of the core competencies and makes this Herculean task possible.
•In the first pass through the data, the shotgunned fragments are compared against each other, and matching sequences longer than 40 base pairs are identified.
•A 40-base-pair match is extremely unlikely to occur by chance. Each match is then classified as true or repeat-induced. True matches are overlapping sections and are the desired fragments; repeat-induced matches come from sequence that occurs in multiple locations of the genome, so those fragments do not belong together.
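The first-pass comparison and the redundancy argument can be illustrated with a minimal Python sketch. It is not the Celera Assembler's code: the function names are hypothetical, only exact matches on one strand are considered (real assemblers also handle reverse complements and sequencing errors), and the uncovered-fraction estimate assumes the idealized Lander-Waterman model of uniformly random reads.

```python
import math
from collections import defaultdict
from itertools import combinations

def shared_kmer_pairs(fragments, k=40):
    """Return index pairs of fragments that share at least one exact k-mer."""
    index = defaultdict(set)  # k-mer -> indices of fragments containing it
    for i, seq in enumerate(fragments):
        for start in range(len(seq) - k + 1):
            index[seq[start:start + k]].add(i)
    pairs = set()
    for owners in index.values():
        for a, b in combinations(sorted(owners), 2):
            pairs.add((a, b))
    return pairs

def chance_match_probability(k=40):
    """Probability that two random k-mers are identical (uniform, independent bases)."""
    return 0.25 ** k

def expected_uncovered_fraction(coverage):
    """Idealized fraction of the genome hit by no read at the given coverage."""
    return math.exp(-coverage)

# Toy data with k=5 so the example stays readable; the slides use 40.
toy = ["ACGTACGGTT", "CGGTTAGCAA", "TTTTTTTTTT"]
candidate_pairs = shared_kmer_pairs(toy, k=5)
```

Under these assumptions a specific 40-base match between two random sequences has probability 4^-40, and at 10x coverage the expected unsequenced fraction is e^-10. Deciding whether a candidate pair is a true overlap or repeat-induced needs further evidence and is not attempted here.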
Whole Genome Shotgun Sequencing
•The assembler then searches for overlapping fragments that have a common sequence and are not contested elsewhere in the dataset.
•The uncontested data is assembled into unitigs containing approximately 30 fragments.
•These assembled unitigs are 99% accurate, and repeats are filtered out using the Discriminator algorithm.
•Unitigs passing this filter are identified and renamed U-unitigs, which are ready for ordering.
•The scaffolding stage then begins: the order is found by looking at the mate pairs and organizing the U-unitigs into contigs. By repeatedly examining these contigs and their orientation, the scaffold becomes complete except for some sequencing gaps.
•This strategy is repeated until the gaps are filled using the Discriminator algorithm and a method using sequence “rocks” and “pebbles”.
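The idea of assembling only uncontested overlaps into unitigs can be sketched as a walk over an overlap graph. This is a simplified illustration, not the Celera implementation: `build_unitigs` and its inputs are hypothetical, an overlap `(a, b)` is assumed to mean that the end of fragment `a` overlaps the start of fragment `b`, and an overlap counts as uncontested only when `a` has no other successor and `b` has no other predecessor. The Discriminator filter and mate-pair scaffolding are not modeled.

```python
def build_unitigs(fragment_ids, overlaps):
    """Chain fragments joined by uncontested overlaps into unitigs."""
    succ = {f: set() for f in fragment_ids}
    pred = {f: set() for f in fragment_ids}
    for a, b in overlaps:
        succ[a].add(b)
        pred[b].add(a)

    def uncontested_next(f):
        # The only successor of f, provided f is also its only predecessor.
        if len(succ[f]) == 1:
            (n,) = succ[f]
            if len(pred[n]) == 1:
                return n
        return None

    def uncontested_prev(f):
        # The only predecessor of f, provided f is also its only successor.
        if len(pred[f]) == 1:
            (p,) = pred[f]
            if len(succ[p]) == 1:
                return p
        return None

    seen = set()

    def walk(start):
        # Follow uncontested overlaps forward until one is contested or missing.
        chain = [start]
        seen.add(start)
        nxt = uncontested_next(start)
        while nxt is not None and nxt not in seen:
            chain.append(nxt)
            seen.add(nxt)
            nxt = uncontested_next(nxt)
        return chain

    # Chains start at fragments with no uncontested predecessor.
    unitigs = [walk(f) for f in fragment_ids if uncontested_prev(f) is None]
    # Anything left over sits on a cycle of uncontested overlaps.
    unitigs += [walk(f) for f in fragment_ids if f not in seen]
    return unitigs

# Fragment B overlaps both C and D, so the chain breaks at B.
example = build_unitigs(
    ["A", "B", "C", "D", "E"],
    [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")],
)
```

In this toy graph the contested overlaps out of `B` split the fragments into three unitigs. That mirrors how a repeat boundary stops a unitig from growing.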
Whole Genome Shotgun Sequencing
•As the HGP made its incremental sequence public, the shotgun approach used this data to help eliminate errors and speed the scaffolding process.
Sequence Gaps
Brown. Genomes 2
Advances
•Combined with the microbiological advances, the following advances in robotics and automation reduced labor by 80%:
–Development of the Perkin-Elmer (ABI PRISM 3700) gene sequencer
–1,000 samples per day
–15 minutes instead of the 8 hours needed by the first automated sequencers
–A parallel system of 300 sequencers ($300,000 each)
–Use of supercomputers to assemble fragments
•Development of process support instrumentation to process 100 K template preps and 200 K sequence reactions per day.
•24-hour-per-day unattended operation of sequencers
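The scale implied by these figures can be checked with simple arithmetic. The sketch below assumes that the 1,000 samples per day is a per-instrument rate, which the slide does not state explicitly, so the fleet throughput is a nominal ceiling and not a measured value. The variable names are illustrative.

```python
SEQUENCERS = 300
COST_PER_SEQUENCER_USD = 300_000
SAMPLES_PER_SEQUENCER_PER_DAY = 1_000  # assumption: per-instrument rate

# Capital cost of the parallel sequencing system.
fleet_cost_usd = SEQUENCERS * COST_PER_SEQUENCER_USD

# Nominal ceiling on daily samples if every instrument ran at the quoted rate.
nominal_samples_per_day = SEQUENCERS * SAMPLES_PER_SEQUENCER_PER_DAY

# Ratio of the quoted 8 hours to the quoted 15 minutes.
time_ratio = (8 * 60) / 15
```

On these assumptions the 300 instruments represent $90 million in equipment and a nominal capacity of 300,000 samples per day, and the quoted times differ by a factor of 32.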
Map of Chromosome 16
Advances
•In addition to the above advances, the field of computational biology (bioinformatics) became increasingly important, because the software and processors required to assemble a puzzle of this size still needed to be developed.