CSE-5120-Fall-2009
Data on the Web: XML

XML (Extensible Markup Language)
• A W3C standard to complement HTML
• HTML describes presentation; XML describes content
• Useful for large-scale electronic publishing and for data exchange on the web

XML provides a way to mark up a document with meaningful tags, which introduce "semi-structure" to the document.
• Elements: also called tags; the primary building blocks
    <FIRSTNAME>Richard</FIRSTNAME>
  Case sensitive: BOOK is different from Book
  Can be nested
• Attributes: provide additional information about the elements, e.g.
    <BOOK FORMAT="Hardcover"> ... </BOOK>

Document Type Definitions (DTDs)
A set of rules that specifies which tags are allowed, the order in which they can appear, and how they can be nested.

A DTD:

    <!DOCTYPE booklist [
      <!ELEMENT booklist (book*)>
      <!ELEMENT book (booktitle, author+, published?)>
      <!ELEMENT author (name, address)>
      <!ATTLIST author id ID #REQUIRED>
      <!ELEMENT name (firstname?, lastname)>
      <!ELEMENT firstname (#PCDATA)>
      <!ELEMENT lastname (#PCDATA)>
      <!ELEMENT address ANY>
      <!ELEMENT published (#PCDATA)>
      <!ATTLIST book format (Paperback | Hardcover) "Paperback">
    ]>

For the format attribute of book, the default value is "Paperback".

An XML document conforming to the DTD:

    <booklist>
      <book format="Paperback">
        <booktitle>The Selfish Gene</booktitle>
        <author id="dawkins">
          <name>
            <firstname>Richard</firstname>
            <lastname>Dawkins</lastname>
          </name>
          <address>
            <city>Timbuktu</city>
          </address>
        </author>
        <published>1980</published>
      </book>
    </booklist>

Notation used in the DTD:
• * set with zero or more elements
• + set with one or more elements
• ? optional
• | or

Data graph:
[Figure: a tree with nodes numbered 1 to 13. The root is BOOKLIST (node 1), with two BOOK subtrees. Each BOOK has an AUTHOR (with FIRST NAME and LAST NAME children), a TITLE and a PUBLISHED child. Leaf values: J.R. Tolkien, The Lord of the Rings, 1965; J. K. Rowling, Harry Potter and the Philosopher's Stone, 1998.]

Comparison with Relational Data
[Figure: a relational table student(ID, name) with IDs 227, 339, 233 and names George, Matthew, Mary, shown beside the same data as an XML tree: a Class root with three student children, each having an ID child and a name child.]

Issues in XML Data
• Storage
• Indexing for efficient querying
• Query optimization
• Security: accessibility control

Querying XML Data: XQuery
XQuery is the W3C standard query language for XML (see http://www.w3.org/TR/xquery/).

    for $l in doc("www.ourbookstore.com/books.xml")//AUTHOR/LASTNAME
    return <RESULT>{$l}</RESULT>

doc("www.ourbookstore.com/books.xml")//AUTHOR/LASTNAME is a path expression. It specifies a path with 3 entities: the document itself, the AUTHOR elements and the LASTNAME elements.
• // says that AUTHOR can be nested anywhere within the document
• / says that LASTNAME must be nested immediately under the AUTHOR element
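The difference between // and / in the path expression above can be tried out with a small sketch in Python. This is not XQuery: it uses the limited XPath subset of the standard library module xml.etree.ElementTree, and the BOOKS_XML string and the function name author_last_names are hypothetical stand-ins (the real books.xml is not shown in these notes), filled with the values from the data graph.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for books.xml, using the values from the data graph.
BOOKS_XML = """
<BOOKLIST>
  <BOOK>
    <AUTHOR>
      <FIRSTNAME>J.R.</FIRSTNAME>
      <LASTNAME>Tolkien</LASTNAME>
    </AUTHOR>
    <TITLE>The Lord of the Rings</TITLE>
    <PUBLISHED>1965</PUBLISHED>
  </BOOK>
  <BOOK>
    <AUTHOR>
      <FIRSTNAME>J. K.</FIRSTNAME>
      <LASTNAME>Rowling</LASTNAME>
    </AUTHOR>
    <TITLE>Harry Potter and the Philosopher's Stone</TITLE>
    <PUBLISHED>1998</PUBLISHED>
  </BOOK>
</BOOKLIST>
"""


def author_last_names(xml_text):
    """Return the text of every LASTNAME that is a direct child of an AUTHOR."""
    root = ET.fromstring(xml_text)
    # ".//AUTHOR" matches AUTHOR elements at any depth below the root (the // step);
    # "/LASTNAME" keeps only LASTNAME elements directly under each AUTHOR (the / step).
    return [elem.text for elem in root.findall(".//AUTHOR/LASTNAME")]


if __name__ == "__main__":
    print(author_last_names(BOOKS_XML))
    # Without the leading ".//", AUTHOR would have to be a direct child of the
    # root BOOKLIST element, so nothing matches.
    print(ET.fromstring(BOOKS_XML).findall("AUTHOR/LASTNAME"))
```

In this sample, AUTHOR sits two levels below the root, so only the descendant step finds it. The child-only path returns an empty list.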
This note was uploaded on 04/23/2010 for the course CSC CSC5120 taught by Professor Adafu during the Fall '09 term at CUHK.