BCB 567/CprE 548 Fall 2007 Homework 3 Solutions

1.

2. To solve this problem, we will show that the number of distinct substrings of S equals the sum of the lengths of all edge labels in the suffix tree of S, excluding the character $. For all of the reasoning below, ignore the $ symbol.

Let α be the path label from the root to some internal node u (possibly the root itself, in which case α is empty), and let β be the edge label from u to v, where v is a child of u. Denote the prefix of β of length i by β_i, for 1 ≤ i ≤ |β|. The edge from u to v then corresponds to |β| strings: αβ_1, αβ_2, ..., αβ_|β|. Each of these strings is a substring of S starting at some position j, where suff_j is in the subtree rooted at v. Each αβ_i is spelled by a unique path from the root, so no substring is counted twice.

Conversely, given some substring γ of S starting at position t, there exist nodes u and v such that γ = αβ_i for some i. This is because γ is a prefix of suff_t, so the path labeled γ must exist in the tree, as this is a property of suffix trees.

Therefore, our algorithm for counting the number of distinct substrings of S simply adds together the lengths of all edge labels in the suffix tree, discounting the $. This can be done using a tree traversal, which takes O(E) time, where E is the number of edges, provided the length of each edge label is available in constant time (for example, when labels are stored as index pairs). In the case of the suffix tree, O(E) = O(n).

3. (5 points) (a) Because we can find the child of interest in constant time, the total time needed to search for the pattern P is O(|P|). However, since we require O(|Σ|) space per node, the total space requirement is O(|Σ| n)....
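The counting argument in problem 2 can be checked on small inputs with a short Python sketch. This is an illustration under stated assumptions, not the course's reference solution. The names `Node`, `build_suffix_tree`, `count_distinct_substrings`, and `brute_force` are hypothetical. The tree is built naively by inserting every suffix, which takes O(n^2) time and stores edge labels as explicit strings, so only the final traversal reflects the O(E) step described above. The input is assumed not to contain the character $.

```python
class Node:
    """Suffix tree node (hypothetical helper for this sketch)."""
    def __init__(self):
        # Maps the first character of an edge label to (label, child node).
        self.children = {}


def build_suffix_tree(s):
    """Naive O(n^2) construction: insert every suffix of s + '$'.

    Assumes '$' does not occur in s.
    """
    text = s + "$"
    root = Node()
    for i in range(len(text)):
        node, suffix = root, text[i:]
        while suffix:
            first = suffix[0]
            if first not in node.children:
                # No edge starts with this character: add a new leaf edge.
                node.children[first] = (suffix, Node())
                break
            label, child = node.children[first]
            # Length of the common prefix of the edge label and the suffix.
            k = 0
            while k < len(label) and k < len(suffix) and label[k] == suffix[k]:
                k += 1
            if k == len(label):
                # The whole edge label matched: descend to the child.
                node, suffix = child, suffix[k:]
            else:
                # Mismatch inside the edge: split it at offset k.
                mid = Node()
                mid.children[label[k]] = (label[k:], child)
                node.children[first] = (label[:k], mid)
                node, suffix = mid, suffix[k:]
    return root


def count_distinct_substrings(root):
    """Sum the edge label lengths over the tree, not counting '$'."""
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        for label, child in node.children.values():
            # '$' can only appear as the last character of a leaf edge label.
            total += len(label) - (1 if label.endswith("$") else 0)
            stack.append(child)
    return total


def brute_force(s):
    """Reference count: collect all non-empty substrings in a set."""
    n = len(s)
    return len({s[i:j] for i in range(n) for j in range(i + 1, n + 1)})


if __name__ == "__main__":
    for word in ["banana", "aaaa", "abcab"]:
        tree = build_suffix_tree(word)
        print(word, count_distinct_substrings(tree), brute_force(word))
```

The two counts printed for each word should agree, since each edge of label length |β| contributes exactly |β| distinct substrings. A linear-time construction such as Ukkonen's algorithm would be needed to obtain the overall O(n) bound; the naive builder is used here only to keep the sketch short.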
This note was uploaded on 10/01/2009 for the course CS BCB/Co taught by Professor Oliver Eulenstein during the Fall '06 term at Iowa State.
