Lect8 Gene finding

An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)

CSE182-L8 Gene Finding

Project: EST clustering and assembly
- Given a collection of EST (3’/5’) sequences, your goal is to cluster all ESTs from the same gene and produce a consensus.
- Note that all the 3’ ESTs should line up at the 3’ end.
- 5’ and 3’ ESTs from the same clone should have the same clone ID, which should allow us to recruit them.
- (Noah, Tim, Jamal, Jesse)
- [Figure: Input and Output]

Project: Extra credit
- Some genes may be alternatively spliced and may have multiple transcripts.
- Can you deconvolute the information back from ESTs?
- [Figure: diagram labeled ATG]
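The clone-ID recruitment step of the EST clustering project could be sketched roughly as follows. The record layout `(est_id, clone_id, end)` and the function name `group_by_clone` are assumptions made for illustration; the slides do not specify an input format, and this covers only the pairing by clone ID, not sequence-based clustering or consensus building.

```python
from collections import defaultdict

def group_by_clone(ests):
    """Group EST identifiers by clone ID, keeping 5' and 3' reads apart.

    `ests` is an iterable of (est_id, clone_id, end) tuples, where `end`
    is the string "5" or "3". This layout is hypothetical.
    """
    clones = defaultdict(lambda: {"5": [], "3": []})
    for est_id, clone_id, end in ests:
        clones[clone_id][end].append(est_id)
    return dict(clones)

# Made-up example records.
example_ests = [
    ("est1", "cloneA", "5"),
    ("est2", "cloneA", "3"),
    ("est3", "cloneB", "3"),
]
paired = group_by_clone(example_ests)
```

Here `paired["cloneA"]` holds one 5’ and one 3’ EST, so the two reads can be recruited into the same cluster even if their sequences do not overlap.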

Project: Functional annotation of ESTs
- Given a collection of ESTs (and assembled transcripts), what is their function?
- Use existing databases to annotate the Hirudo ESTs.
- Are any protein families under- or over-represented?
- Specific families (Netrins/Innexins/Phosphatases)

Project on Indexing
- Even if you annotate all ESTs, what is the quick way for someone to search the database to get all Innexins (for example)?
- The keyword-based index should be able to answer that question.
- Sergey and Dan
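One common way to build a keyword-based index is an inverted index that maps each keyword to the set of ESTs whose annotation mentions it. The sketch below assumes annotations are available as a dictionary from EST ID to a free-text description; that format, the sample entries, and the name `build_keyword_index` are all hypothetical.

```python
from collections import defaultdict

def build_keyword_index(annotations):
    """Map each lowercase keyword to the set of EST IDs whose annotation contains it."""
    index = defaultdict(set)
    for est_id, description in annotations.items():
        for word in description.lower().split():
            keyword = word.strip(".,;()")  # drop simple punctuation
            if keyword:
                index[keyword].add(est_id)
    return index

# Made-up annotations for illustration only.
annotations = {
    "est1": "Innexin gap junction protein",
    "est2": "Netrin receptor",
    "est3": "innexin family member",
}
index = build_keyword_index(annotations)
hits = sorted(index["innexin"])
```

A lookup is then a single dictionary access rather than a scan over every annotation. This sketch does no stemming, so a query for "innexins" would need to be normalized to "innexin" first.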

HW4 Optional/Required?

Computational Gene Finding
- Given genomic DNA, identify all the coordinates of the gene.
- TRIVIA QUIZ! What is the name of the FIRST gene finding program? (google testcode)
- [Figure: gene structure, labeled with transcription start, 5’ UTR, translation start (ATG), exon, intron, donor and acceptor splice sites, and 3’ UTR]

Gene Finding: The 1st generation
- Given genomic DNA, does it contain a gene (or not)?
- Key idea: The distribution of nucleotides is different in coding (translated exons) and non-coding regions.
- Therefore, a statistical test can be used to discriminate between coding and non-coding regions.

Coding versus non-coding
- You are given a collection of exons and a collection of intergenic sequences.
- Count the number of occurrences of ATGATG in the intergenic sequences and in the exons.
- Suppose 1% of the hexamers in exons are ATGATG.
- Only 0.01% of the hexamers in intergenic sequence are ATGATG.
- How can you use this idea to find genes?
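One possible answer to the question above is a log-odds score: with the quoted frequencies, each ATGATG is 0.01 / 0.0001 = 100 times more likely under the exon model than under the intergenic model. The sketch below is a simplification that counts overlapping occurrences, treats them as independent, and ignores every other hexamer; the function name is hypothetical.

```python
import math

# Frequencies quoted on the slide.
P_EXON = 0.01          # 1% of exon hexamers are ATGATG
P_INTERGENIC = 0.0001  # 0.01% of intergenic hexamers are ATGATG

def atgatg_log_odds(sequence):
    """Add log2(P_EXON / P_INTERGENIC) for every ATGATG found by a sliding window."""
    sequence = sequence.upper()
    hits = sum(
        1 for i in range(len(sequence) - 5) if sequence[i:i + 6] == "ATGATG"
    )
    return hits * math.log2(P_EXON / P_INTERGENIC)
```

A larger score is stronger evidence that the window is exonic. A fuller test would also penalize windows in which ATGATG is absent, which is where using all hexamers comes in.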

Generalizing
- Compute a frequency count for all hexamers.
- Exons, Intergenic and the sequence X are all vectors in a multi-dimensional space.
- Use this to decide whether a sequence X is exonic/intergenic.
- [Table: hexamer frequencies (AAAAAA, AAAAAC, AAAAAG, AAAAAT, ...) with columns I (intergenic) and E (exons)]
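The vector view can be sketched as follows: each collection becomes a 4096-dimensional vector of hexamer frequencies, and X is assigned to whichever reference vector it is closer to. The preview does not show which comparison the lecture uses, so cosine similarity here is an assumption, as are the function names, the overlapping sliding window, and the toy training sequences.

```python
from collections import Counter
from itertools import product
import math

# All 4**6 = 4096 hexamers in a fixed order, one per vector dimension.
HEXAMERS = ["".join(p) for p in product("ACGT", repeat=6)]

def hexamer_vector(seqs):
    """Return hexamer frequencies over all sequences, ordered as HEXAMERS."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - 5):
            counts[s[i:i + 6]] += 1
    # Hexamers with non-ACGT characters are excluded from the total.
    total = sum(counts[h] for h in HEXAMERS)
    return [counts[h] / total if total else 0.0 for h in HEXAMERS]

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(x, exon_vec, intergenic_vec):
    """Label X by whichever reference vector it is more similar to."""
    xv = hexamer_vector([x])
    if cosine(xv, exon_vec) > cosine(xv, intergenic_vec):
        return "exon"
    return "intergenic"

# Toy training data, made up for illustration only.
exon_vec = hexamer_vector(["ATGATGATGATG"])
intergenic_vec = hexamer_vector(["AAAAAAAAAAAA"])
label = classify("ATGATGATG", exon_vec, intergenic_vec)
```

With real data, the two reference vectors would be computed once from the known exons and intergenic sequences, and `classify` would be applied to each candidate window of genomic DNA.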