p193-lu - From Region Encoding To Extended Dewey: On...

From Region Encoding To Extended Dewey: On Efficient Processing of XML Twig Pattern Matching

Jiaheng Lu, Tok Wang Ling, Chee-Yong Chan, Ting Chen
Department of Computer Science
National University of Singapore
{lujiahen, lingtw, chancy, chent}@comp.nus.edu.sg

Abstract

Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process a twig query based on the region encoding labeling scheme. While region encoding supports efficient determination of the structural relationship between two elements, we observe that the information within a single label is very limited. In this paper, we propose a new labeling scheme, called extended Dewey. This is a powerful labeling scheme, since from the label of an element alone, we can derive all the element names along the path from the root to the element. Based on extended Dewey, we design a novel holistic twig join algorithm, called TJFast. Unlike all previous algorithms based on region encoding, to answer a twig query, TJFast only needs to access the labels of the leaf query nodes. Through this, not only do we reduce disk access, but we also support the efficient evaluation of queries with wildcards in branching nodes, which are very difficult for algorithms based on region encoding to answer. Finally, we report our experimental results to show that our algorithms are superior to previous approaches in terms of the number of elements scanned, the size of intermediate results and query performance.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005

1 Introduction

With the increasing popularity of XML for data representation, there is a lot of interest in query processing over data that conforms to a tree-structured data model. Queries on XML data are commonly expressed in the form of tree patterns (or twig patterns), which represent a very useful subset of XPath and XQuery. Efficiently finding all twig pattern matches in an XML database is a major concern of XML query processing.

In the past few years, many algorithms ([3],[6],[11],[10]) have been proposed to match such twig patterns. These approaches (i) first develop a labeling scheme to capture the structural information of XML documents, and then (ii) perform twig pattern matching based on the labels alone without traversing the original XML documents....
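
To make the two-step approach above concrete, the following is a minimal sketch of a region encoding in Python. It assumes the commonly used (start, end, level) triple, which this excerpt does not define; the names Region, label_document, is_ancestor and is_parent are hypothetical and chosen only for illustration. It shows that ancestor-descendant and parent-child relationships can be decided from two labels alone, and also that a label by itself carries no element names, which is the limitation the abstract points out.

```python
from typing import NamedTuple
import xml.etree.ElementTree as ET


class Region(NamedTuple):
    """Assumed region label: positions of the start and end tags, plus depth."""
    start: int
    end: int
    level: int


def label_document(xml_text):
    """Step (i): assign a region label to every element, in document order."""
    root = ET.fromstring(xml_text)
    labels = []
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1                 # position of the start tag
        start = counter
        slot = len(labels)
        labels.append(None)          # reserve the slot to keep document order
        for child in node:
            visit(child, level + 1)
        counter += 1                 # position of the end tag
        labels[slot] = (node.tag, Region(start, counter, level))

    visit(root, 1)
    return labels


def is_ancestor(a, d):
    """Step (ii): a is an ancestor of d iff a's interval strictly contains d's."""
    return a.start < d.start and d.end < a.end


def is_parent(a, d):
    """Parent-child additionally requires the levels to differ by exactly one."""
    return is_ancestor(a, d) and d.level == a.level + 1


# Toy document with unique tag names, so a dict keyed by tag is sufficient here.
labels = dict(label_document("<a><b><c/></b><d/></a>"))
a, b, c, d = labels["a"], labels["b"], labels["c"], labels["d"]
```

In this toy document, is_ancestor(a, c) and is_parent(b, c) hold while is_parent(a, c) does not, and none of these checks touches the XML tree once the labels exist.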