pset3 - MIT OpenCourseWare http/ocw.mit.edu 6.00...

MIT OpenCourseWare http://ocw.mit.edu 6.00 Introduction to Computer Science and Programming Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms .
Introduction to Computer Science and Programming
Problem Set 3
Handed out: Tuesday, September 16, 2008.
Due: 11:59pm, Tuesday, September 23, 2008.

Introduction

This problem set will introduce you to using functions and recursion, as well as string operations in Python.

Collaboration

You may work with other students. However, each student should write up and hand in his or her assignment separately. Be sure to indicate with whom you have worked. For further detail, please review the collaboration policy as stated in the syllabus.

Submission

This problem set, and future ones, will be graded by a test harness. The test harness program will expect your files to include just function definitions, with no executable code outside the function definitions (besides what's already in the template). So remember to comment out your testing code. (And *do* test your code thoroughly!)

Strings and string searching

As we have seen in lecture, strings are a common data type in many programming languages, and are used to represent textual information. You have already seen common examples of string searching. For example, finding words or phrases in documents involves searching one sequence of characters (i.e., the document) to find instances of another sequence of characters (the word or phrase to be found). Similarly, for Web searches such as Google, one needs to count instances of key words in documents in order to rank pages.

Matching strings: a biological perspective

String matching is also very valuable in less obvious settings, such as biology. A common problem in modern biology is to understand the structure of DNA molecules, and the role of specific structures in determining the function of the molecule.
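As a rough illustration of the string searching described above (finding instances of a word or phrase inside a document), here is a minimal Python sketch. The function name count_occurrences and the sample strings are hypothetical and are not taken from the problem set template; the sketch assumes that overlapping instances should each be counted and that an empty phrase matches nothing.

```python
def count_occurrences(document, phrase):
    """Count instances of phrase in document, including overlapping ones.

    Hypothetical helper for illustration only; not part of the pset template.
    """
    if not phrase:
        return 0  # assumption: an empty phrase is treated as matching nothing
    count = 0
    # str.find returns the index of the first match, or -1 if there is none.
    position = document.find(phrase)
    while position != -1:
        count += 1
        # Resume one character past the previous match so overlaps are found.
        position = document.find(phrase, position + 1)
    return count


if __name__ == "__main__":
    print(count_occurrences("the cat sat on the mat", "at"))
```

Restarting the search one character past each match, rather than past the end of the match, is a design choice: it lets a string such as "aaaa" report three instances of "aa". Skipping ahead by the length of the phrase would count only non-overlapping instances.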
A DNA sequence is commonly represented as a sequence of nucleotides, each one of four kinds: adenine (A), cytosine (C), guanine (G), or thymine (T). Hence a DNA molecule or strand is represented by a string composed of elements from an alphabet of only four symbols; for example, the string AAACAACTTCGTAAGTATA represents a particular strand of DNA. One way to understand the function of a particular strand of DNA (or even a sub-strand of DNA) is to match that strand against a library of known DNA sequences (that is, sequences whose function and structure are known), with the idea that similar structure tends to imply similar function. Simple organisms such as bacteria may have millions of nucleotides in their DNA sequence, and the human chromosome is believed to have on the order of 246 million bases, so any matching scheme must be very efficient in order to be useful. In this problem set, we won't ask you to build a practically useful tool, but hope to give you a sense of some of

This note was uploaded on 06/12/2010 for the course EECS 6.00 taught by Professor Grimson during the Spring '08 term at MIT.

