genome barcode



Genome Barcodes and Applications
Ying Xu (徐鹰)

A DNA Sequence
ccgtacgtacgtagagtgctagtctagtcgtagcgccgtagtcgatcgtgtgg
gtagtagctgatatgatgcgaggtaggggataggatagcaacagatgagcg
gatgctgagtgcagtggcatgcgatgtcgatgatagcggtaggtagacttcg
cgcataaagctgcgcgagatgattgcaaagragttagatgagctgatgctag
aggtcagtgactgatgatcgatgcatgcatggatgatgcagctgatcgatgta
gatgcaataagtcgatgatcgatgatgatgctagatgatagctagatgtgatc
gatggtaggtaggatggtaggtaaattgatagatgctagatcgtaggtagta
gctagatgcagggataaacacacggaggcgagtgatcggtaccgggctga
ggtgttagctaatgatgagtacgtatgaggcaggatgagtgacccgatgagg
ctagatgcgatggatggatcgatgatcgatgcatggtgatgcgatgctagatg
atgtgtgtcagtaagtaagcgatgcggctgctgagagcgtaggcccgagag
gagagatgtaggaggaaggtttgatggtagttgtagatgattgtgtagttgta
gctgatagtgatgatcgtag
…………………………………

Questions
• Which organism is this DNA sequence from?
• Is it from a bacterium, a fly or any other organism?
• Is it really from one organism?
• Is it a real DNA sequence?
• …

Meta-genome Binning
• The result of community sequencing is a collection of sequence fragments from multiple genomes
  – fragment size ranges from ~600 to ~1500 bps
• We need to bin the fragments before assembling them
• The meta-genome binning problem: separate the sequence fragments into individual genomes

Meta-genome Assembly
• But how to bin them?
[Figure: fragments sorted into bins labeled genome 1, genome 2, genome 3, …, genome N]

Is any part of the DNA transferred from another organism?
[The DNA sequence from the opening slide is shown again]

K-mer Frequencies
• Combined K-mer frequency: the frequency of each K-mer together with its reverse complement
  – 4-mers: GGTA/TACC, CGAA/TTCG, GGTC/GACC, …
[Figure: combined K-mer frequency plotted along the genome sequence]

K-mer Frequencies
• Genomes have highly stable combined K-mer frequencies, measured using a small window size M
  – e.g., M = 1000 bps; K = 4
• This is true for all genomes: eukaryotic, prokaryotic, chromosomal and organelle

Genome Visualization
• When the frequencies are mapped to grey levels, each genome can be visualized as a grey-level image
  – x-axis: combined K-mers (e.g., 4-mers), and
  – y-axis: genome axis
[Figure: grey-level image; x-axis: the 136 combined 4-mers (AAAA/TTTT, ACAG/CTGT, CGAT/ATCG, …); grey level: frequency; y-axis: genome sequence]

Genome Barcodes
• Barcodes of various genomes
[Figure: barcodes of P. furiosus, B. pseudomallei, E. coli O157, E. coli K-12]

Genome Barcodes
• How about the barcode of a random sequence of {A, C, G, T}?
• No, you cannot fake a genome
[Figure: barcode of a random sequence]

Properties of Genome Barcodes
• The majority of a prokaryotic genome's short fragments have highly similar barcodes
[Figure: barcodes of P. furiosus, B. pseudomallei, E. coli O157, E. coli K-12]

Abnormal Barcodes
• On average, 12-13% of genomic fragments in bacterial genomes have substantially different barcodes

Abnormal Barcodes
• This distance distribution suggests that we may be able to figure out how long the transferred genes have been in the host, rather than just which genes were transferred

Abnormal Barcodes
[Figure: barcode distance from E. coli K-12 at the levels Species, Genus, Family, Class, Phylum, Domain]
We hope to establish a strong correlation between barcode distance and how long a horizontally transferred gene (HTG) has been in the host genome.

Barcode Properties of Genomes
• Different types of genomic regions tend to have their common and unique characteristics
[Figure panels: coding regions, intergenic regions, interoperonic regions]

Barcode Properties of Genomes
• Different classes of genomes, i.e., eukaryotic, prokaryotic, mitochondrial, plasmid, plastid, have their unique and identifiable characteristics
[Figure legend: Red: prokaryotes; Blue: eukaryotes; Green: plastids; Orange: plasmids; Black: mitochondria]

Why Barcode Properties
[Figure panels: 0th order Markov chain, 1st order Markov chain, 3rd order Markov chain, 5th order Markov chain]
We believe that it is the Markov chain properties of the prokaryotic DNA that give rise to the barcode property.

Barcode Properties
But … why do eukaryotic genomes also have (seemingly more complex) barcode properties?
Do different (major) regions of a eukaryotic genome also follow Markov chain models or more complex stochastic models?
  – protein-coding genes (hidden Markov model)
  – RNA-coding genes
  – regulatory regions
  – repetitive elements
  – …
This is something that we hope to answer someday!

What Questions Can We Answer
• Which organism is this DNA sequence from?
• Is it from a bacterium, a fly or any other organism?
• Is it really from one organism?
• Is it a real DNA sequence?
• …
YES to all these and many other questions!
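
The barcode construction described in the K-mer Frequencies and Genome Visualization slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`reverse_complement`, `combined_kmers`, `barcode`) are hypothetical, windows are taken as non-overlapping (the slides give only the window size M), each K-mer pair is represented by its lexicographically smaller member, and K-mers containing ambiguous bases are skipped.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of an upper-case DNA string."""
    return kmer.translate(COMPLEMENT)[::-1]


def combined_kmers(k=4):
    """List one representative per K-mer/reverse-complement pair.

    Assumption: the representative is the lexicographically smaller
    of the two strings; the slides do not specify an ordering.
    """
    all_kmers = map("".join, product("ACGT", repeat=k))
    return sorted({min(km, reverse_complement(km)) for km in all_kmers})


def barcode(seq, k=4, window=1000):
    """Compute combined K-mer frequencies for each window of the sequence.

    Returns (columns, rows): columns are the combined K-mers (x-axis),
    rows hold one frequency vector per window (y-axis, genome order).
    Assumption: windows do not overlap; a trailing partial window is dropped.
    """
    seq = seq.upper()
    columns = combined_kmers(k)
    index = {km: i for i, km in enumerate(columns)}
    rows = []
    for start in range(0, len(seq) - window + 1, window):
        fragment = seq[start:start + window]
        counts = [0] * len(columns)
        for i in range(window - k + 1):
            km = fragment[i:i + k]
            canon = min(km, reverse_complement(km))
            if canon in index:  # skips K-mers with non-ACGT symbols
                counts[index[canon]] += 1
        total = sum(counts) or 1  # avoid division by zero
        rows.append([c / total for c in counts])
    return columns, rows
```

With K = 4 the column list has 136 entries, matching the count on the visualization slide: 16 of the 256 possible 4-mers are their own reverse complement, and the remaining 240 pair up into 120. Mapping each row to grey levels, one image row per window, would give the barcode image.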
Genome Barcode Server
• We have developed a computer server for barcode generation for a given genome
• Create a barcode from a DNA sequence
• Detect subsequences with abnormal barcodes
• Group DNA fragments into bins with similar barcodes
• Calculate various statistics of computed barcodes
• Send a subsequence to BLAST for sequence search
• Keep records of previous runs for later analysis

Take-Home Message
• Genome visualization could be a key to discoveries of new genomic elements
• Barcodes represent only an initial effort along this direction
...
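
As a rough illustration of the server function "detect subsequences with abnormal barcodes", the sketch below flags windows whose frequency vector lies unusually far from the genome-wide average. The slides do not state which distance measure or cutoff is used, so everything here is an assumption: the function name `abnormal_windows` is hypothetical, the distance is plain Euclidean distance to the mean vector, and the cutoff is the mean distance plus `n_sigma` standard deviations.

```python
import math
from statistics import mean, pstdev


def abnormal_windows(rows, n_sigma=2.0):
    """Flag windows whose barcode row is far from the average row.

    rows: list of equal-length frequency vectors, one per window.
    Returns (flagged_indices, distances).
    Assumptions: Euclidean distance and a mean + n_sigma * std cutoff;
    neither choice comes from the slides.
    """
    n_cols = len(rows[0])
    # Average frequency of each combined K-mer over all windows.
    centroid = [mean(row[j] for row in rows) for j in range(n_cols)]
    distances = [math.dist(row, centroid) for row in rows]
    cutoff = mean(distances) + n_sigma * pstdev(distances)
    flagged = [i for i, d in enumerate(distances) if d > cutoff]
    return flagged, distances


# Toy input: nine similar windows and one window with a different profile.
toy_rows = [[0.5, 0.5, 0.0]] * 9 + [[0.0, 0.0, 1.0]]
flagged, distances = abnormal_windows(toy_rows)
```

In this toy input only the last window is flagged. On real data the rows would be the per-window combined K-mer frequencies of a genome, and the flagged windows would be candidates for closer inspection, for example by sending them to BLAST as the server description suggests.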