CIS 4930 / CIS 6930 (Recent Advances in Bioinformatics)
Spring 2009, Homework 1
Due date: 02/16/2009. Turn in hard copy in class.
February 4, 2009

The purpose of this homework is to learn the edit operations in the frequency domain and to gain familiarity with the DNA data type in FASTA format. You will implement a program that computationally analyzes data files that contain DNA sequences in FASTA format, as described below.

Data Source. You will download data from GenBank (ftp://ftp.ncbi.nih.gov/genomes/). Pick three different organisms, one of which must be human (H. sapiens). The H. sapiens chromosome will be your query sequence, and the chromosomes of the other two organisms will be your database sequences. Download one chromosome from each of these three organisms in FASTA format. The extension of the FASTA files in GenBank is ".fa". For example, you can get human chromosome 22 at ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/CHR_22/hs_ref_chr22.fa.gz ....
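As a rough illustration of reading the data described above, the following is a minimal Python sketch of a FASTA reader for gzip-compressed files. The function names (`parse_fasta`, `read_fasta_gz`, `base_counts`) are hypothetical, and the base-count step is only an assumed example of a simple analysis; the assignment text shown here does not specify the required computation.

```python
import gzip
from collections import Counter
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    A record starts with a '>' header line; the lines that follow,
    up to the next header, are joined into one sequence string.
    """
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Normalize case: some FASTA files use lowercase for masked regions.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def read_fasta_gz(path: str) -> Iterator[Tuple[str, str]]:
    """Read records from a gzip-compressed FASTA file (e.g. a '.fa.gz' file)."""
    with gzip.open(path, "rt") as handle:
        yield from parse_fasta(handle)


def base_counts(sequence: str) -> Counter:
    """Count occurrences of each symbol (A, C, G, T, N, ...) in a sequence.

    Assumption: shown only as an example analysis, not the assignment's method.
    """
    return Counter(sequence)
```

Once a file such as hs_ref_chr22.fa.gz has been downloaded, `read_fasta_gz("hs_ref_chr22.fa.gz")` would iterate over its records. This sketch holds each full record in memory, which may need rethinking for very large chromosomes.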
This note was uploaded on 01/15/2012 for the course CIS 4930 taught by Professor Staff during the Spring '08 term at University of Florida.