Computer Science 549 Computational Biology
Prof. Steven Skiena
Fall 2013
Homework 1
Due Thursday, October 3, 2013
September 17, 2013
Each of the problems should be solved on a separate sheet of paper to facilitate grading. Limit
the solution of each probl
Lecture 19: Introduction to NP-Completeness Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Reporting to the Boss
Suppose you fail to find a fast algorithm. What can you tell
Lecture 22: The NP-Completeness Challenge Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Show that the Hitting Set problem is NP-complete: Input: A coll
Lecture 21: Other Reductions Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Show that the Dense subgraph problem is NP-complete: Input: A graph G, and i
Lecture 20: Satisfiability Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Suppose we are given a subroutine which can solve the traveling salesman decisio
Lectures 12, 13, and 14:
Gene Prediction
Steven Skiena
Department of Computer Science
State University of New York
Stony Brook, NY 11794-4400
http://www.cs.sunysb.edu/~skiena
Sequence Annotation
As new DNA sequence data becomes available, we seek to
identify
Lectures 8, 9, 10, and 11:
Homology Searching
Steven Skiena
Department of Computer Science
State University of New York
Stony Brook, NY 11794-4400
http://www.cs.sunysb.edu/~skiena
String Comparison
The two killer apps of modern string processing algorithms
h
Lectures 21, 22, and 23:
Phylogenic Trees and Evolution
Steven Skiena
Department of Computer Science
State University of New York
Stony Brook, NY 11794-4400
http://www.cs.sunysb.edu/~skiena
Phylogenic Trees
In any evolutionary process, speciation events caus
Lectures 15, 16, 17, and 18:
Microarrays
Steven Skiena
Department of Computer Science
State University of New York
Stony Brook, NY 11794-4400
http://www.cs.sunysb.edu/~skiena
Faster, Better, Cheaper
Biology used to be a hypothesis driven science.
But one of
Lectures 19, 20, and 21:
RNA and Protein Folding
Steven Skiena
Department of Computer Science
State University of New York
Stony Brook, NY 11794-4400
http://www.cs.sunysb.edu/~skiena
Shape and Structure of Molecules
The primary molecules of biological intere
Lectures 1, 2, and 3:
Preliminaries
Steven Skiena
Department of Computer Science
State University of New York
Stony Brook, NY 11794-4400
http://www.cs.sunysb.edu/~skiena
Administrivia
Make sure I get your name and email address written clearly,
as well as wh
Lectures 4, 5, 6, and 7:
Sequence Assembly
Steven Skiena
Department of Computer Science
State University of New York
Stony Brook, NY 11794-4400
http://www.cs.sunysb.edu/~skiena
Sequencing the Human Genome
Sequencing the human genome was a tremendous scie
CSE 549: Suffix Tries &
Suffix Trees
All slides in this lecture not marked with * courtesy of Ben Langmead.
KMP is great, but
|T| = m
|P| = n (note: m,n are opposite from previous lecture)
Without preprocessing
(KMP)
Given preprocessing
(KMP)
Without preprocessing
CSE 549: BWT & FM-Index
All slides in this lecture not marked with * courtesy of Ben Langmead.
Burrows-Wheeler Transform
Reversible permutation of the characters of a string, used originally for compression
$abaaba
a$abaab
aaba$ab
aba$aba
abaaba$
ba$abaa
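The rows listed above are the lexicographically sorted rotations of the string abaaba$. As a minimal sketch, assuming the standard sorted-rotation construction and a unique terminator $ that sorts before every other character, the transform and a naive inverse can be written as follows. The function names rotations, bwt, and inverse_bwt are illustrative and do not come from the slides.

```python
def rotations(t):
    """Return all cyclic rotations of t, starting with t itself."""
    return [t[i:] + t[:i] for i in range(len(t))]


def bwt(t):
    """Burrows-Wheeler transform: the last column of the sorted rotation matrix.

    Assumes t ends with a unique terminator '$' that sorts before all other
    characters (true for '$' versus ASCII letters).
    """
    return "".join(row[-1] for row in sorted(rotations(t)))


def inverse_bwt(b):
    """Naive inversion: repeatedly prepend the BWT column and re-sort.

    After len(b) rounds the table holds every sorted rotation; the original
    string is the rotation that ends with the terminator '$'.
    """
    table = [""] * len(b)
    for _ in range(len(b)):
        table = sorted(b[i] + table[i] for i in range(len(b)))
    return next(row for row in table if row.endswith("$"))


if __name__ == "__main__":
    text = "abaaba$"
    for row in sorted(rotations(text)):
        print(row)
    transformed = bwt(text)
    print(transformed)
    print(inverse_bwt(transformed))
```

This quadratic-space version is only meant to make the "reversible permutation" property concrete; practical implementations build the transform from a suffix array instead of materializing every rotation.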
CSE 549: Efficiently Dealing with
k-mers and De Bruijn Graphs
Scalability at the forefront
I've spoken a lot in this class about the need for scalable
solutions, but how big of a problem is it?
Take (one of) the simplest problems you might imagine:
Given:
Lecture 15: Backtracking Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
The single-destination shortest path problem for a directed graph is to find the s
Lecture 17: Edit Distance Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Suppose you are given three strings of characters: X, Y, and Z, where |X| = n,
Lecture 2: Asymptotic Notation Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
The knapsack problem is as follows: given a set of integers S = {s1, s2
Lecture 1: Introduction to Algorithms Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
What Is An Algorithm?
Algorithms are the ideas behind computer programs. An algorithm
Lecture 5: Dictionaries Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Dictionary / Dynamic Set Operations
Perhaps the most important class of data structures maintain a s
Lecture 6: Hashing Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Dictionary / Dynamic Set Operations
Perhaps the most important class of data structures maintain a set of
Lecture 4: Elementary Data Structures Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
True or False? 1. 2n^2 + 1 = O(n^2) 2. √n = O(log n) 3. log n = O(√n)
Lecture 3: Program Analysis Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Find two functions f(n) and g(n) that satisfy the following relationship. If
Lecture 7: Heapsort / Priority Queues Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Take as input a sequence of 2n real numbers. Design an O(n log n) a
Lecture 11: Breadth-First Search Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Present correct and efficient algorithms to convert between the following
Lecture 8: Mergesort / Quicksort Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Given an array-based heap on n elements and a real number x, efficiently d
Lecture 12: Depth-First Search Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Prove that in a breadth-first search on an undirected graph G, every edge in
Lecture 9: Linear Sorting Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
The nuts and bolts problem is defined as follows. You are given a collection of n
Lecture 10: Graph Data Structures Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Sort Yourselves
Sort yourselves in alphabetical order so I can return the midterms efficient
Lecture 16: Introduction to Dynamic Programming Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794-4400 http://www.cs.sunysb.edu/~skiena
Problem of the Day
Multisets are allowed to have repeated elements. A multis