ComputerEx3_2010_NO KEY_Intro to Bioinformatics

C. Baer, PCB 4674 Fall 2010
Computer Exercise 3: Introduction to Bioinformatics

THIS ASSIGNMENT IS DUE IN CLASS TUESDAY 10/19. PLEASE LABEL YOUR PAPER ONLY BY YOUR UFID# AND DAY OF LAB.

In this exercise you will retrieve several sequences from the NCBI database ("GenBank") and perform basic manipulations that are prerequisites to much of evolutionary biology (and all biology). The gene in question is the Alcohol Dehydrogenase B1 subunit from three species of primates (human, chimp, gorilla) and one rodent (mouse). Note: I tried to find the same gene from the recently sequenced Neanderthal genome but I couldn't find it. Too bad, that would have been fun.

There are 19 questions, labeled Q1-Q19 in bold text. You must answer all of them. The assignment is worth 25 points: each question is worth 1.25 points, and you get one free correct answer for following the directions.

I. Retrieving Sequence Data from GenBank

1. Go to the NCBI web page ( ) and take a few minutes to read the introductory material on the web page. NCBI (home of "GenBank") is one of the major international repositories of genomic information. To speed up this exercise, we will begin with the assumption that we know the exact identifying information for the sequences in which we are interested.

Species                      Accession number
Homo sapiens (human)         NM_000668
Pan troglodytes (chimp)      AB188285
Gorilla gorilla (gorilla)    AF354624
Mus musculus (mouse)         NM_007409

2. Go to NCBI Home, set search = nucleotide, put in the first accession number (human), and click Go.

3. When the results appear, click on the file accession number and wait for the screen to open. The results should look like:

LOCUS       NM_000668   2666 bp   mRNA   linear   PRI
DEFINITION  Homo sapiens alcohol dehydrogenase IB (class I), beta polypeptide (ADH1B), mRNA.
ACCESSION   NM_000668
VERSION     NM_000668.4  GI: 160298141
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
Etc.

Scroll through the results and spend a few minutes digesting the information.
Pay particular attention to the information following "Features", including "gene" and "CDS"; "CDS" stands for Coding DNA Sequence. In fact, record the start and end position of the CDS for later reference. At the bottom of the page is raw DNA sequence, as follows:

ORIGIN
        1 aatatatctg ctttatgcac tcaagcagag aagaaatcca caaagactca cagtctgctg
       61 gtgggcagag aagacagaaa cgacatgagc…

Note that the sequence is from an mRNA, i.e., it is cDNA sequence. Pay particular attention to the location of the start codon, in this case at position 85.

4. Go to the top of the page and click on “Send” in the right corner. Click on “Complete Record”; under “Choose destination” click on “File”; in the Format drop-down menu choose “FASTA”; then click on “Create File”. This will download a file called “sequences.fasta” to your computer. Rename the downloaded file to “SPECIES.cDNA.adh.fasta”, where SPECIES is whichever species you
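The start-codon position mentioned in step 3 can be checked by hand, or with a short script. The Python sketch below is only an illustration and is not part of the assignment: the helper names origin_to_sequence and atg_positions are made up for this example, and the input is just the two ORIGIN lines quoted above, not the full 2666 bp record. It strips the position numbers and spaces from the ORIGIN lines and lists every place "atg" occurs, using the same 1-based numbering as GenBank.

```python
import re

# The two ORIGIN lines quoted in step 3 (an excerpt, not the full record).
ORIGIN_EXCERPT = """
        1 aatatatctg ctttatgcac tcaagcagag aagaaatcca caaagactca cagtctgctg
       61 gtgggcagag aagacagaaa cgacatgagc
"""


def origin_to_sequence(origin_text: str) -> str:
    """Drop position numbers and whitespace, leaving only the bases."""
    return re.sub(r"[^a-z]", "", origin_text.lower())


def atg_positions(seq: str) -> list[int]:
    """Return the 1-based position of every 'atg' in the sequence."""
    return [match.start() + 1 for match in re.finditer("atg", seq)]


seq = origin_to_sequence(ORIGIN_EXCERPT)
print(len(seq))              # 90 bases in the excerpt
print(atg_positions(seq))    # [15, 85]
print(seq[85 - 1:85 + 2])    # atg (the codon starting at position 85)
```

In this excerpt "atg" also occurs at position 15, so the first ATG in an mRNA is not necessarily the start codon. The CDS entry under "Features" is what identifies position 85 as the start of the coding sequence.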

This note was uploaded on 06/08/2011 for the course PCB 4674 taught by Professor Baer during the Fall '08 term at University of Florida.
