Spring 2005
Robert F. Murphy

Problem A2
BLAST
Due: February 8, 2005

In this assignment, you are given a sequence of a mouse cDNA in the file ProbA2.txt on the homeworks web page. Your goal is to perform initial sequence analysis and comparison to find similar genes and proteins.

Hand in the requested printouts and written answers to the questions. Include justification for your answers (i.e., show the basis of your calculations). Label your answers clearly.

Questions (Total of 40 points)

1. (a) Which dinucleotide(s) is (are) the least frequent in the cDNA?
   (b) What is the predicted frequency of the tetranucleotide GGAT using the observed mononucleotide frequencies (show your calculations)?
   (c) What is the predicted frequency of GGAT using the observed dinucleotide frequencies (show your calculations)?

2. (a) What locations in the sequence match the consensus sequence "ARGCYT"?
   (b) Using the observed mononucleotide frequencies, what is the expected number of occurrences of this
