s10-structure-db-xml

Slides adapted from Rao (ASU) & Franklin

Structure

How will search and querying on these three types of data differ?
  A generic web page containing text [English]
  A movie review [XML] (semi-structured)
  An employee record [SQL]
Structure helps querying

Expressive queries
  [keyword] Give me all pages that have the keywords “Get Rich Quick”
  [SQL] Give me the social security numbers of all the employees who have stayed with the company for more than 5 years, and whose yearly salaries are three standard deviations away from the average salary
  [XML] Give me all mails from people from ASU written this year, which are relevant to “get rich quick”

Challenges in Exploiting Structure
  Languages for specifying “Semi-structured” data
  Standards for supporting/exploiting semantic tagging
  Techniques for extracting information (NLP-lite)
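
The difference between the keyword query and the employee query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`ssn`, `years`, `salary`), the function names, and the reading of "three standard deviations away" as "at least three population standard deviations from the mean" are choices made here, not something the slides specify.

```python
import statistics


def keyword_match(pages, phrase):
    """Return ids of pages whose text contains every word of the phrase.

    `pages` maps a page id to its raw text. No structure is used:
    the page is treated as a bag of lowercase words.
    """
    wanted = phrase.lower().split()
    hits = []
    for page_id, text in pages.items():
        tokens = set(text.lower().split())
        if all(word in tokens for word in wanted):
            hits.append(page_id)
    return hits


def salary_outliers(employees, min_years=5, k=3.0):
    """Return SSNs of long-tenured employees with outlying salaries.

    `employees` is a list of dicts with hypothetical keys
    'ssn', 'years' and 'salary'. An employee qualifies when tenure is
    strictly greater than `min_years` and the salary lies at least `k`
    population standard deviations from the mean salary (an assumed
    interpretation of the slide's wording).
    """
    salaries = [e["salary"] for e in employees]
    mean = statistics.mean(salaries)
    std = statistics.pstdev(salaries)
    if std == 0:
        return []  # all salaries equal, so nobody is an outlier
    return [
        e["ssn"]
        for e in employees
        if e["years"] > min_years and abs(e["salary"] - mean) >= k * std
    ]
```

The second function depends on named, typed fields and an aggregate over the whole table, which is exactly what plain text does not offer; the first can only test whether words occur.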
Topic 3: Finding, Representing & Exploiting Structure

Getting Structure:
  Allow structure specification languages
    XML? [More structured than text and less structured than databases]
  If structure is not explicitly specified (or is obfuscated), can we extract it?
    Wrapper generation/Information Extraction

Using Structure:
  For retrieval:
    Extend IR techniques to use the additional structure
  For query processing (joins, aggregations, etc.):
    Extend database techniques to use the partial structure
  For reasoning with structured knowledge:
    Semantic web ideas

Structure in the context of multiple sources:
  How to align structure
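
To make "more structured than text and less structured than databases" concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The XML vocabulary (`reviews`, `review`, `title`, `year`, `rating`, `text`), the sample reviews, and the function name are all invented for illustration. Note that the second review omits `year`: fields are tagged but optional, and the review body is still free text.

```python
import xml.etree.ElementTree as ET

# Made-up sample data: tagged fields plus a free-text body.
REVIEWS_XML = """
<reviews>
  <review>
    <title>Example Movie A</title>
    <year>1999</year>
    <rating>4</rating>
    <text>A slow start, but the ending is worth the wait.</text>
  </review>
  <review>
    <title>Example Movie B</title>
    <rating>2</rating>
    <text>Worth skipping; the plot never recovers.</text>
  </review>
</reviews>
"""


def find_reviews(xml_text, min_rating, keyword):
    """Return titles of reviews rated >= min_rating whose text mentions keyword.

    Combines a structured condition (the numeric rating field) with an
    IR-style condition (case-insensitive substring match on the free text).
    """
    root = ET.fromstring(xml_text.strip())
    titles = []
    for review in root.findall("review"):
        rating = int(review.findtext("rating", default="0"))
        body = review.findtext("text", default="")
        if rating >= min_rating and keyword.lower() in body.lower():
            titles.append(review.findtext("title"))
    return titles
```

A call such as `find_reviews(REVIEWS_XML, 3, "worth")` mixes both kinds of condition in one query, which neither pure keyword search nor a fixed relational schema handles as directly.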
Asambaadham badhyato maanavaanam yasya udvatah pravatah samam bahu
Naanaaveeryaa oshadheeryaa bibharti pruthivee nah prathataam raadhyataam nah

Earth which has many heights, and slopes and the unconfined plain that bind men together, Earth that bears plants of various healing powers, may she spread wide for us and thrive
  - Bhoomi Sooktam, Atharva Veda XII.I (4/22, 12th Century B.C.; Iron Age)
Adapting old disciplines for Web-age

Information (text) retrieval
  Scale of the web
  Hypertext/link structure
  Authority/hub computations
Databases
  Multiple databases
    Heterogeneous, access limited, partially overlapping
  Network (un)reliability
Datamining [Machine Learning/Statistics/Databases]
  Learning patterns from large scale data
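
The slide only names "authority/hub computations". One well-known way to compute such scores is a HITS-style iteration, sketched below on a toy link graph. The slides do not say which formulation the course uses, so treat the function name, the fixed iteration count, and the L2 normalization as assumptions.

```python
import math


def hits(links, iterations=50):
    """Compute hub and authority scores with a HITS-style iteration.

    `links` maps each page to the list of pages it links to.
    Returns (hub, auth), two dicts of scores, each L2-normalized.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A page's authority is the sum of hub scores of pages linking to it.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # A page's hub score is the sum of authorities of pages it links to.
        hub = {n: sum(auth[t] for t in links.get(n, [])) for n in nodes}
        # Normalize so the scores do not grow without bound.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth
```

With `links = {"a": ["c"], "b": ["c"], "c": []}`, page `c` ends up as the only authority and `a` and `b` as equal hubs.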
Why do we care about databases?

Three reasons
  Deep web is all databases…
  We can do better with structured data…
  Exposing databases on web changes their clientele…
Deep Web is databases

The crawlable web pages are just the tip of a huge iceberg that is the deep web
Many web sites have huge backend databases that generate pages dynamically in response to queries
  Airline fare databases, newspaper classifieds, etc.
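
A minimal sketch of what "generate pages dynamically in response to queries" means, using Python's standard `sqlite3` module with an in-memory database. The table layout, the fares, and the function names are made up for illustration; a real site would sit behind a web server and a query form.

```python
import sqlite3
from html import escape


def build_fare_db():
    """Create a tiny in-memory fare database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fares (origin TEXT, dest TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO fares VALUES (?, ?, ?)",
        [("PHX", "SFO", 129.0), ("PHX", "JFK", 310.0), ("LAX", "SFO", 89.0)],
    )
    return conn


def render_fares_page(conn, origin):
    """Build an HTML page listing fares from `origin`, cheapest first.

    The page exists only as the answer to this query; no static file
    holds it, so a crawler that merely follows links never sees it.
    """
    rows = conn.execute(
        "SELECT dest, price FROM fares WHERE origin = ? ORDER BY price",
        (origin,),
    ).fetchall()
    items = "".join(
        f"<li>{escape(dest)}: ${price:.2f}</li>" for dest, price in rows
    )
    return (
        f"<html><body><h1>Fares from {escape(origin)}</h1>"
        f"<ul>{items}</ul></body></html>"
    )
```

Each distinct value of `origin` yields a different page, which is why the content behind such forms is far larger than what link-following crawlers reach.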

This note was uploaded on 03/11/2012 for the course CSE 494 taught by Professor Rao during the Spring '08 term at ASU.
