BCB 567/CprE 548 Fall 2007 Homework 3 Solutions

1.

2. To solve this problem, we show that the number of distinct substrings of S equals the sum of the lengths of all edge labels in the suffix tree of S, excluding the character $. For all of the reasoning below, ignore the $ symbol.

Let α be the path label from the root to some internal node u (if u is the root, α is empty), and let β be the edge label from u to v, where v is a child of u. Denote the prefix of β of length i by β_i, for 1 ≤ i ≤ |β|. The edge from u to v then corresponds to |β| strings: αβ_1, αβ_2, ..., αβ_|β|. Each of these strings is a substring of S starting at some position j, where suff_j is in the subtree rooted at v. Each αβ_i corresponds to a unique path from the root, so no substring is counted twice. Conversely, given any substring γ of S starting at position t, there exist nodes u and v such that γ = αβ_i for some i. This is because γ is a prefix of suff_t, so the path labeled γ must exist in the tree, which is a property of suffix trees.

Therefore, our algorithm for counting the number of distinct substrings of S simply adds together the lengths of all edge labels in the suffix tree, discounting the $. This can be done with a tree traversal, which takes O(E) time, where E is the number of edges. In a suffix tree, E = O(n), so the traversal takes O(n) time.

3. (5 points) (a) Because we can find the child of interest in constant time, the total time needed to search for the pattern P is O(|P|). However, since we require O(|Σ|) space per node, the total space requirement is O(|Σ| n)....
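
The counting argument in problem 2 can be checked on small inputs with the sketch below. It is only an illustration under stated assumptions: the tree is built by naive O(n^2) suffix insertion rather than a linear-time construction, S is assumed not to contain the character $, and the names Node, build_suffix_tree, and count_distinct_substrings are hypothetical helpers that do not come from the original solutions.

```python
class Node:
    """Suffix tree node; the edge label into this node is text[start:end]."""

    def __init__(self, start=0, end=0):
        self.start = start
        self.end = end
        self.children = {}  # first character of edge label -> child Node


def build_suffix_tree(s):
    """Naive O(n^2) construction: insert every suffix of s + '$' in turn.

    Assumes '$' does not occur in s (illustrative simplification).
    """
    text = s + "$"
    root = Node()
    for i in range(len(text)):
        node, pos = root, i
        while True:
            child = node.children.get(text[pos])
            if child is None:
                # No edge starts with this character: add a new leaf.
                node.children[text[pos]] = Node(pos, len(text))
                break
            # Walk along the edge while characters agree.
            k = child.start
            while k < child.end and text[k] == text[pos]:
                k += 1
                pos += 1
            if k == child.end:
                node = child  # whole edge matched, continue below it
                continue
            # Mismatch inside the edge: split it and hang a new leaf.
            mid = Node(child.start, k)
            node.children[text[child.start]] = mid
            child.start = k
            mid.children[text[k]] = child
            mid.children[text[pos]] = Node(pos, len(text))
            break
    return root, text


def count_distinct_substrings(s):
    """Sum the edge label lengths of the suffix tree, ignoring '$'."""
    root, text = build_suffix_tree(s)
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children.values():
            length = child.end - child.start
            if child.end == len(text):
                length -= 1  # leaf edge: its label ends with '$'
            total += length
            stack.append(child)
    return total


def brute_force(s):
    """Reference answer: enumerate all substrings into a set."""
    return len({s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)})
```

For example, count_distinct_substrings("banana") and brute_force("banana") both give 15. The traversal itself visits each edge once, which matches the O(E) bound above; only the naive construction in this sketch is quadratic.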
 Fall '06
 OLIVEREULENSTEIN
 Graph Theory, Suffix tree
