VirtualDatabases

CS345 Data Mining Virtual Databases

Example
- Find marketing manager openings in Internet companies so that my commute is shorter than 10 miles.
[Diagram labels: Web; Structured queries (e.g., in SQL); Virtual Relations]
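To make the example concrete, here is a minimal sketch of what the structured query might look like once job postings are exposed as a relation. The table name `openings`, its columns, and the sample rows are all hypothetical (the slide gives no schema), and an in-memory SQLite table stands in for a virtual relation that would really be populated from the web.

```python
import sqlite3

# Hypothetical schema and rows; the slide does not specify either.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE openings "
    "(title TEXT, company TEXT, sector TEXT, commute_miles REAL)"
)
conn.executemany(
    "INSERT INTO openings VALUES (?, ?, ?, ?)",
    [
        ("Marketing Manager", "ExampleNet", "Internet", 6.5),
        ("Marketing Manager", "FarAwayWeb", "Internet", 22.0),
        ("Marketing Manager", "SteelWorks", "Manufacturing", 3.0),
        ("Software Engineer", "ExampleNet", "Internet", 6.5),
    ],
)


def find_openings(conn, title, sector, max_miles):
    """Return matching openings with a commute shorter than max_miles."""
    rows = conn.execute(
        "SELECT title, company, commute_miles FROM openings "
        "WHERE title = ? AND sector = ? AND commute_miles < ? "
        "ORDER BY commute_miles",
        (title, sector, max_miles),
    )
    return rows.fetchall()


results = find_openings(conn, "Marketing Manager", "Internet", 10)
```

In this sketch the commute distance is stored as a column; in practice it would presumably be derived from the user's address and the job location.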
Applications
- Comparison shopping
  - shopping.com, fatlens, mobissimo, …
- Job search
  - indeed.com, simplyhired, …
- Classifieds search
  - oodle
- Integrating web data with relational enterprise apps
  - purchasing, pricing, …

Wrappers
- Extract tuples from a single website
- Assume website is a static collection of pages, i.e., no forms
[Diagram labels: Website; Wrapper; Relation]
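As an illustration of the website-to-relation idea, the sketch below treats a site as a list of static HTML strings and extracts one tuple per table row. The class name `TableWrapper`, the function `wrap`, and the sample pages are hypothetical; it assumes the data sits in `<td>` cells, which real sites need not do.

```python
from html.parser import HTMLParser


class TableWrapper(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.tuples = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows with no <td> cells, e.g. header rows
                self.tuples.append(tuple(self._row))
            self._row = None


def wrap(pages):
    """Apply the wrapper to every static page and return one relation."""
    relation = []
    for html in pages:
        parser = TableWrapper()
        parser.feed(html)
        parser.close()
        relation.extend(parser.tuples)
    return relation


# Hypothetical static pages standing in for a crawled website.
PAGES = [
    "<table><tr><th>Title</th><th>Company</th></tr>"
    "<tr><td>Marketing Manager</td><td>ExampleNet</td></tr></table>",
    "<table><tr><td>Sales Lead</td><td>ExampleShop</td></tr></table>",
]
relation = wrap(PAGES)
```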
Not same as Relation Extraction
- Why can’t we use DIPRE or Snowball?
  - Can’t assume that the same tuple can be found on many different websites
  - Need to extract all the tuples from each website
  - May need to normalize data values across websites
  - Data may be behind forms
- Need to account for query capabilities of websites
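The normalization point can be sketched as follows: two sites describe the same item with different spellings and price formats, and the tuples only line up after each value is brought to a common form. The helper names, the sample data, and the assumption of US-style numbers in a single currency are all illustrative and not from the slides.

```python
import re
from decimal import Decimal


def normalize_price(raw):
    """Return a price as Decimal, dropping currency marks and thousands commas.

    Assumes US-style formatting and a single currency.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        raise ValueError(f"no numeric price in {raw!r}")
    return Decimal(match.group(0).replace(",", ""))


def normalize_tuple(product, raw_price):
    """Lower-case the name, collapse whitespace, and normalize the price."""
    return (" ".join(product.split()).lower(), normalize_price(raw_price))


# Hypothetical tuples extracted from two different websites.
site_a = [("Acme  Phone X", "$1,299.00")]
site_b = [("acme phone x", "1299 USD")]

# After normalization both sites yield the same tuple, so the set has one entry.
merged = {normalize_tuple(*t) for t in site_a + site_b}
```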

Brute force approach
- Write a custom program tailored to the website
  - e.g., in perl, python, …
- Does not scale to thousands of websites
  - Each site needs a different wrapper
- Website changes break wrappers
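A small sketch of why hand-written wrappers are fragile: the pattern below is tied to one site's exact markup, so a cosmetic redesign makes it silently return nothing. The markup, the pattern `ROW`, and the function `custom_wrapper` are invented for illustration.

```python
import re

# Pattern hard-coded to one hypothetical site's current markup.
ROW = re.compile(r'<li class="job"><b>(.*?)</b> at (.*?)</li>')


def custom_wrapper(html):
    """Extract (title, company) tuples from the site's job listing page."""
    return ROW.findall(html)


# The page as it looked when the wrapper was written.
OLD = '<ul><li class="job"><b>Marketing Manager</b> at ExampleNet</li></ul>'
# The same content after a redesign renames the class and swaps <b> for <strong>.
NEW = (
    '<ul><li class="job-card"><strong>Marketing Manager</strong>'
    " at ExampleNet</li></ul>"
)

old_tuples = custom_wrapper(OLD)  # one tuple extracted
new_tuples = custom_wrapper(NEW)  # nothing extracted, and no error raised
```

The failure is silent, which is part of the maintenance cost: someone has to notice that the relation went empty and rewrite the pattern for that site.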

This note was uploaded on 01/31/2011 for the course CS 345 taught by Professor Dunbar,a during the Fall '07 term at UC Davis.
