BCB 567/CprE 548 Fall 2007 Homework 3 Solutions

1.

2. To solve this problem, we show that the number of distinct substrings of S equals the sum of the lengths of all edge labels in the suffix tree of S, excluding the character $. For all of the reasoning below, ignore the $ symbol.

Let α be the path label from the root to some node u (α is empty when u is the root) and β be the edge label from u to v, where v is a child of u. Denote the prefix of β of length i as β_i, for 1 ≤ i ≤ |β|. The edge from u to v then corresponds to the |β| strings αβ_1, αβ_2, ..., αβ_|β|. Each of these strings is a substring of S, starting at every position j such that suff_j is in the subtree rooted at v. Each αβ_i corresponds to a unique path from the root, and a path from the root is determined by its label, so no two of these strings are equal, whether they come from the same edge or from different edges.

Conversely, given any substring γ of S starting at position t, there exist nodes u and v such that γ = αβ_i for some i. This is because γ is a prefix of suff_t, so the path labeled γ must exist in the tree, as this is a property of suffix trees.

Therefore, our algorithm for counting the number of distinct substrings of S simply adds together the label lengths of all edges in the suffix tree, discounting the $. This can be done using a tree traversal, which takes O(E) time, where E is the number of edges, provided each label length is available in constant time (for example, when labels are stored as index pairs). A suffix tree has O(n) edges, so the traversal takes O(n) time.

3. (5 points) (a) Because we can find the child of interest in constant time, the total time needed to search for the pattern P is O(|P|). However, since we require O(|Σ|) space per node, the total space requirement is O(|Σ|n)....
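The counting argument in problem 2 can be checked on small inputs with the sketch below. It is not part of the original solutions: the names Node, build_suffix_tree and count_distinct_substrings are hypothetical, the tree is built by naive O(n^2) suffix insertion rather than a linear-time construction, and edge labels are stored as explicit strings instead of index pairs. It assumes S does not contain the character $.

```python
class Node:
    """Suffix tree node (hypothetical helper for this sketch)."""
    def __init__(self):
        # Maps the first character of an edge label to (label, child node).
        self.children = {}


def build_suffix_tree(s):
    """Naive O(n^2) construction: insert every suffix of s + '$'."""
    text = s + "$"  # assumes '$' does not occur in s
    root = Node()
    for start in range(len(text)):
        node, suffix = root, text[start:]
        while suffix:
            first = suffix[0]
            if first not in node.children:
                # No edge starts with this character: add a new leaf edge.
                node.children[first] = (suffix, Node())
                break
            label, child = node.children[first]
            # Length of the common prefix of the edge label and the suffix.
            k = 0
            while k < len(label) and k < len(suffix) and label[k] == suffix[k]:
                k += 1
            if k == len(label):
                # Whole edge matched: continue from the child.
                node, suffix = child, suffix[k:]
            else:
                # Mismatch inside the edge: split it at offset k.
                mid = Node()
                mid.children[label[k]] = (label[k:], child)
                node.children[first] = (label[:k], mid)
                node, suffix = mid, suffix[k:]
    return root


def count_distinct_substrings(s):
    """Sum the lengths of all edge labels, not counting '$'."""
    total, stack = 0, [build_suffix_tree(s)]
    while stack:
        node = stack.pop()
        for label, child in node.children.values():
            # '$' appears at most once, as the last character of a leaf edge.
            total += len(label) - label.count("$")
            stack.append(child)
    return total


def brute_force(s):
    """Reference answer: collect every substring in a set."""
    return len({s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)})


if __name__ == "__main__":
    print(count_distinct_substrings("banana"))  # 15
```

The traversal itself visits each edge once, so with a linear-time construction and index-pair labels the whole count would run in O(n) time, as the solution states; only the naive construction here is quadratic.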
- Fall '06
- Graph Theory, Suffix tree