p85-bohannon - Information Preserving XML Schema Embedding...


Information Preserving XML Schema Embedding

Philip Bohannon, Wenfei Fan*, Michael Flaster, P. P. S. Narayan

Bell Laboratories, Lucent Technologies, {bohannon,mflaster,ppsn}@research.bell-labs.com
University of Edinburgh & Bell Laboratories, wenfei@research.bell-labs.com

Abstract

A fundamental concern of information integration in an XML context is the ability to embed one or more source documents in a target document so that (a) the target document conforms to a target schema and (b) the information in the source document(s) is preserved. In this paper, information preservation for XML is formally studied, and the results of this study guide the definition of a novel notion of schema embedding between two XML DTD schemas represented as graphs. Schema embedding generalizes the conventional notion of graph similarity by allowing an edge in a source DTD schema to be mapped to a path in the target DTD. Instance-level embeddings can be defined from the schema embedding in a straightforward manner, such that conformance to a target schema and information preservation are guaranteed. We show that it is NP-complete to find an embedding between two DTD schemas. We also provide efficient heuristic algorithms to find candidate embeddings, along with experimental results to evaluate and compare the algorithms. These yield the first systematic and effective approach to finding information preserving XML mappings.

1 Introduction

A central technical issue for the exchange, migration and integration of XML data is to find mappings from documents of a source XML (DTD) schema to documents of a target schema. While one can certainly define XML mappings in a query language such as XQuery or XSLT, such queries may be large and complex, and in practice it is often needed that XML mappings (1) guarantee type-safety and (2) preserve information.

* Supported in part by EPSRC GR/S63205/01, EPSRC GR/T27433/01 and NSFC 60228006.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005

It is clearly desirable that the document produced by an XML mapping conforms to a target schema, guaranteeing type safety. But this may be difficult to check for mappings defined in XQuery or XSLT [4]. Further, since in many applications one does not want to lose the original information of the source data, a mapping should also preserve information. Criteria for information preservation include: (1) invertibility [16]: can one recover the source document from the target? and (2) query preservation: for a particular XML query language, can all queries on source documents...
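
To make the edge-to-path idea and the invertibility criterion concrete, here is a minimal Python sketch. It is not the paper's algorithm or notation: the toy schemas, the names NODE_MAP, PATH_MAP, embed and invert, and the tuple encoding of documents are all assumptions made for illustration. The sketch also assumes that the paths assigned to edges leaving the same source element start with distinct target tags, so the inverse can tell them apart.

```python
# Hypothetical toy example; schemas and names are illustrative, not from the paper.
# A document node is (tag, children_list) or (tag, text) for a leaf.

# Source element type -> target element type (assumed mapping).
NODE_MAP = {"db": "school", "class": "course", "title": "name"}

# Source edge (parent, child) -> path of target tags leading from the image
# of the parent down to the image of the child (assumed mapping).
PATH_MAP = {
    ("db", "class"): ["courses", "course"],
    ("class", "title"): ["basic", "name"],
}


def embed(node):
    """Map a source document to a target document, edge by edge."""
    tag, content = node
    target_tag = NODE_MAP[tag]
    if isinstance(content, str):
        return (target_tag, content)
    children = []
    for child in content:
        path = PATH_MAP[(tag, child[0])]
        sub = embed(child)  # its tag equals path[-1]
        # Wrap the child in the intermediate elements of the target path.
        for step in reversed(path[:-1]):
            sub = (step, [sub])
        children.append(sub)
    return (target_tag, children)


def invert(node, src_tag):
    """Recover the source document from a target produced by embed."""
    _, content = node
    if isinstance(content, str):
        return (src_tag, content)
    children = []
    for wrapped in content:
        for (parent, child), path in PATH_MAP.items():
            # Assumes paths out of one parent have distinct first steps.
            if parent == src_tag and path[0] == wrapped[0]:
                sub = wrapped
                for _ in path[:-1]:
                    sub = sub[1][0]  # descend one intermediate element
                children.append(invert(sub, child))
                break
    return (src_tag, children)


SOURCE = ("db", [
    ("class", [("title", "Databases")]),
    ("class", [("title", "Logic")]),
])

TARGET = embed(SOURCE)
RECOVERED = invert(TARGET, "db")
```

In this toy setting, RECOVERED equals SOURCE, which is the round trip that invertibility asks for. Real DTDs also involve disjunction, repetition and conformance checking, none of which this sketch handles.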

This note was uploaded on 03/01/2010 for the course ICT ... taught by Professor ... during the Three '10 term at University of Sydney.
