354.XML - Database Systems I The Semistructured Data Model...

CMPT 354, Simon Fraser University, Fall 2008, Martin Ester

Database Systems I: The Semistructured Data Model

The Web Today

HTML documents
- generated by humans or by applications,
- consumed by humans only,
- easy access: across platforms, across organizations,
- only layout, no semantic information.

Limited application interoperability
- HTML is not understood by applications: at most, some heuristic rules.
- Database technology: SQL standard, but still lots of vendor specific
XML Data Exchange Format

A standard from the W3C (World Wide Web Consortium, http://www.w3.org).
The mission of the W3C: ". . . developing common protocols that promote its evolution and ensure its interoperability . . ."

Basic ideas
- XML = data
- XML generated by applications
- XML consumed by applications
- Easy access: across platforms, organizations.

Paradigm Shift on the Web

For web search engines:
- From documents (HTML) to data (XML)
- From document management to document understanding (e.g., question answering)
- From information retrieval to data management

For database systems:
- From relational (structured) model to semistructured data
- From data processing to data/query translation
- From storage to transport
The Semistructured Data Model

Developed by the DBS community to address the following emerging issues:
- Data sets with non-rigid structure
  - Biological data: sequence data, 3D data, text data . . . and their relationships
  - Web data
- Integration of heterogeneous sources: not only, but especially, for Web data and biological data.

The Semistructured Data Model

- Data is self-describing, i.e. the data description is integrated with the data itself rather than kept in a separate schema.
- The database is a collection of nodes and arcs (a directed graph).
- Leaf nodes represent data of some atomic type (atomic objects, such as numbers or strings).
- Interior nodes represent complex objects consisting of components (child nodes), which are connected to the interior node by arcs.
- Arcs are directed and connect two nodes.
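As a rough illustration of the node-and-arc view, the sketch below models leaf nodes (atomic objects) and interior nodes (complex objects) in Python. The `Node` class, the `describe` helper, and the sample values are hypothetical and chosen only for illustration; they are not part of the lecture material.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

Atomic = Union[int, float, str]


@dataclass
class Node:
    """One node of the data graph (hypothetical class, not from the slides)."""
    value: Optional[Atomic] = None  # set only on leaf nodes (atomic objects)
    arcs: List[Tuple[str, "Node"]] = field(default_factory=list)  # labeled out-arcs

    def is_leaf(self) -> bool:
        # A node without out-arcs is treated as a leaf.
        return not self.arcs

    def add(self, label: str, child: "Node") -> "Node":
        """Attach child via a directed arc carrying the given label."""
        self.arcs.append((label, child))
        return child


def describe(node: Node, label: str = "root", depth: int = 0) -> List[str]:
    """Render data together with its labels; assumes no cycles in the graph."""
    indent = "  " * depth
    if node.is_leaf():
        return [f"{indent}{label}: {node.value!r}"]
    lines = [f"{indent}{label}"]
    for arc_label, child in node.arcs:
        lines.extend(describe(child, arc_label, depth + 1))
    return lines


# Illustrative data only: the labels travel with the data, no separate schema.
db = Node()
paper = db.add("paper", Node())
paper.add("author", Node("Vianu"))
paper.add("year", Node(1997))
book = db.add("book", Node())
book.add("author", Node("Abiteboul"))  # no "year" arc: structure may vary per object

print("\n".join(describe(db)))
```

Note that the two objects under the root carry different sets of arcs, which a fixed relational schema would not allow without null values.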
The Semistructured Data Model

- Arc labels indicate the relationship between the two corresponding nodes.
- The root node is the only interior node without in-arcs; it represents the entire database.
- All database objects are children of the root node.
- Every node must be reachable from the root.
- A general graph structure is possible, i.e. the graph need not be a tree.
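The root and reachability conditions can be checked mechanically. The following is a minimal sketch, assuming the data graph is stored as a dictionary from node ids to lists of labeled out-arcs; the node ids, the labels, and the function names `in_degrees`, `find_roots`, `reachable`, and `is_tree` are all hypothetical.

```python
from collections import deque

# Hypothetical data graph: node id -> list of (arc label, target node id).
arcs = {
    "db": [("paper", "p1"), ("paper", "p2")],
    "p1": [("title", "t1"), ("references", "p2")],
    "p2": [("title", "t2")],
    "t1": [],
    "t2": [],
}


def in_degrees(arcs):
    """Count the in-arcs of every node."""
    deg = {node: 0 for node in arcs}
    for out_arcs in arcs.values():
        for _label, target in out_arcs:
            deg[target] += 1
    return deg


def find_roots(arcs):
    """Nodes without in-arcs; a valid database has exactly one."""
    return [node for node, d in in_degrees(arcs).items() if d == 0]


def reachable(arcs, start):
    """Breadth-first search along the directed arcs."""
    seen = {start}
    queue = deque([start])
    while queue:
        for _label, target in arcs[queue.popleft()]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def is_tree(arcs, root):
    """True if every node other than the root has exactly one in-arc
    and all nodes are reachable from the root."""
    deg = in_degrees(arcs)
    single_parent = all(d == 1 for node, d in deg.items() if node != root)
    return deg[root] == 0 and single_parent and reachable(arcs, root) == set(arcs)


print(find_roots(arcs))                      # candidate root nodes
print(reachable(arcs, "db") == set(arcs))    # is every node reachable?
print(is_tree(arcs, "db"))                   # is the graph a tree?
```

In this example the `references` arc gives node `p2` a second in-arc, so the graph satisfies the root and reachability conditions but is not a tree.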

Graphical Representation

[Figure: example data graph. Arc labels shown: paper, book, references, author, title, year, http. Values shown: "Serge", "Abiteboul", 1997, "Victor", "Vianu", 122, 133.]