BCB 567/CprE 548 Fall 2007 Homework 3 Solutions

1.

2. To solve this problem, we show that the number of distinct substrings of S equals the sum of the lengths of all edge labels in the suffix tree of S, excluding the character $. For all of the reasoning below, ignore the $ symbol.

Let α be the path label from the root to some internal node u, and let β be the edge label from u to v, where v is a child of u. Denote the prefix of β of length i by β_i, for 1 ≤ i ≤ |β|. The edge from u to v then corresponds to |β| strings: αβ_1, αβ_2, ..., αβ_|β|. Each of these strings is a substring of S starting at some position j, where suff_j is in the subtree rooted at v. Each αβ_i corresponds to a unique path from the root, so no substring is counted twice. Moreover, given any substring γ of S starting at position t, there exist nodes u and v such that γ = αβ_i for some i. This is because γ is a prefix of suff_t, so the path labeled γ must exist in the tree, which is a property of suffix trees.

Therefore, our algorithm for counting the number of distinct substrings of S simply adds together the lengths of all edge labels in the suffix tree, discounting the $. This can be done using a tree traversal, which takes O(E) time, where E is the number of edges. In a suffix tree, O(E) = O(n).

3. (5 points) (a) Because we can find the child of interest in constant time, the total time needed to search for the pattern P is O(|P|). However, since we require O(|Σ|) space per node, the total space requirement is O(|Σ|n)....
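The counting argument in the solution to problem 2 can be checked on small inputs with a short sketch. The code below is only an illustration under stated assumptions: it builds the suffix tree by naive repeated insertion, which takes quadratic time rather than the linear time a proper construction would give, and it assumes S does not itself contain $. The names Node, build_suffix_tree, and count_distinct_substrings are hypothetical and do not come from the course material. Only the final traversal corresponds to the O(E) step described in the solution.

```python
class Node:
    """Suffix tree node; children maps a first character to (edge_label, child)."""
    def __init__(self):
        self.children = {}


def build_suffix_tree(s):
    """Naive O(n^2) construction: insert every suffix of s + '$' one by one.

    Assumes '$' does not occur in s.
    """
    text = s + "$"
    root = Node()
    for start in range(len(text)):
        node, suffix = root, text[start:]
        while suffix:
            first = suffix[0]
            if first not in node.children:
                # No edge starts with this character: hang a new leaf here.
                node.children[first] = (suffix, Node())
                break
            label, child = node.children[first]
            # Length of the common prefix of the edge label and the suffix.
            k = 0
            while k < len(label) and k < len(suffix) and label[k] == suffix[k]:
                k += 1
            if k < len(label):
                # Mismatch inside the edge: split it at offset k.
                mid = Node()
                mid.children[label[k]] = (label[k:], child)
                node.children[first] = (label[:k], mid)
                child = mid
            node, suffix = child, suffix[k:]
    return root


def count_distinct_substrings(s):
    """Sum the edge-label lengths of the suffix tree, not counting '$'."""
    total = 0
    stack = [build_suffix_tree(s)]
    while stack:
        node = stack.pop()
        for label, child in node.children.values():
            # '$' can only appear as the last character of a leaf edge.
            total += len(label) - (1 if label.endswith("$") else 0)
            stack.append(child)
    return total


if __name__ == "__main__":
    print(count_distinct_substrings("banana"))
```

For S = banana, the edge labels without $ are banana, a, na (three times), and na under the root, giving 6 + 1 + 2 + 2 + 2 + 2 = 15, which matches a direct enumeration of the distinct substrings.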
This note was uploaded on 10/01/2009 for the course CS BCB/Co taught by Professor Oliver Eulenstein during the Fall '06 term at Iowa State.