integration-2 - Information Integration Mediators...


Information Integration
Mediators
Warehousing
Distributed Databases

Slides are modified from Dr. Ullman’s notes.

Homework 4
• Due: April 28, 2:00 am via dropbox
• Views:
  - Define the following: Virtual views, Materialized views, Updatable views, Not updatable views
  - Give the SQL statement syntax to create and remove views

Homework 4
• Index:
  - What are the factors influencing the selection of indexes?
  - What are the advantages and disadvantages of a sparse index?

Example Applications
1. Enterprise Information Integration: making separate DB’s, all owned by one company, work together.
2. Scientific DB’s, e.g., genome DB’s.
3. Catalog integration: combining product information from all your suppliers.

Challenges
1. Legacy databases: DB’s get used for many applications.
   • You can’t change a DB’s structure for the sake of one application, because it will cause others to break.
2. Incompatibilities: Two supposedly similar databases will mismatch in many ways.

Examples: Incompatibilities
• Lexical: addr in one DB is address in another.
• Value mismatches: is a “red” car the same color in each DB? Is 20 degrees Fahrenheit or Centigrade?
• Semantic: are “employees” in each database the same? What about consultants? Retirees? Contractors?

What Do You Do About It?
• Grubby, handwritten translation at each interface.
  - Some research on automatic inference of relationships.
• Data sharing
• Wrapper (aka “adapter”) translates incoming queries and outgoing answers.

Data Sharing
• Goal:
  - Keep local databases unchanged
  - Send/receive/merge data between local databases
[Figure: an object-oriented DB (Obj. O) and a relational DB (attributes A, B of entity E) use a semistructured format to share data.]

Semistructured Data
[Figure: tree for bars.xml. Document node bars.xml; element nodes BARS, BAR, PRICE, BEER; attribute nodes name = "JoesBar", theBeer = "Bud", theBeer = "Miller", name = "Bud", SoldBy = "…"; primitive values 2.50 and 3.00. Color legend: rose = document, green = element, gold = attribute, purple = primitive value.]

Example Document
    <BARS>
      <BAR name = "JoesBar">
        <PRICE theBeer = "Bud">2.50</PRICE>
        <PRICE theBeer = "Miller">3.00</PRICE>
      </BAR>
      …
      <BEER name = "Bud" soldBy = "JoesBar SuesBar … "/>
      …
    </BARS>
Callouts on the slide: “An element node”, “An attribute node”.
Document node is all of this, plus the header (<?xml version…).

Paths in XML Documents
• XPath is a language for describing paths in XML documents.
• The result of the described path is a sequence of items.

Path Expressions
• Simple path expressions are sequences of slashes (/) and tags, starting with /.
  - Example: /BARS/BAR/PRICE
• Construct the result by starting with just the doc node and processing each tag from the left.

XQuery
• XQuery extends XPath to a query language that has power similar to SQL.
• Uses the same sequence-of-items data model.
• XQuery is an expression language.
  - Like relational algebra: any XQuery expression can be an argument of any other XQuery expression.

DTD for Running Example
    <!DOCTYPE BARS [
      <!ELEMENT BARS (BAR*, BEER*)>
      <!ELEMENT BAR (PRICE+)>
      <!ATTLIST BAR name ID #REQUIRED>
      <!ELEMENT PRICE (#PCDATA)>
      <!ATTLIST PRICE theBeer IDREF #REQUIRED>
      <!ELEMENT BEER EMPTY>
      <!ATTLIST BEER name ID #REQUIRED>
      <!ATTLIST BEER soldBy IDREFS #IMPLIED>
    ]>

XML Schema
• It is an XML document
• Namespace http://www.w3.org/2001/XMLSchema
• Elements, attributes
• Keys and foreign key constraints

DATA INTEGRATION

Integration Architectures
1. Federation: everybody talks directly to everyone else.
2. Warehouse: Sources are translated from their local schema to a global schema and copied to a central DB.
3. Mediator: Virtual warehouse; turns a user query into a sequence of source queries.
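
The wrapper and mediator ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names Wrapper and Mediator, the sample rows, and the equality-only query format are all hypothetical and do not come from the slides. Each wrapper translates an incoming query from the global schema to its source's attribute names (the lexical mismatch addr vs. address), runs it, and translates the answer back; the mediator stores nothing and simply fans the query out.

```python
from dataclasses import dataclass
from typing import Dict, List

Row = Dict[str, str]


@dataclass
class Wrapper:
    """Hypothetical wrapper: one source plus a global-to-local attribute map."""
    rows: List[Row]                   # stands in for the source database
    global_to_local: Dict[str, str]   # e.g. global "address" -> local "addr"

    def query(self, conditions: Row) -> List[Row]:
        # Translate the incoming query into the source's own attribute names.
        local_conditions = {self.global_to_local[a]: v
                            for a, v in conditions.items()}
        # Run the translated query at the source (here: a linear scan).
        hits = [r for r in self.rows
                if all(r.get(a) == v for a, v in local_conditions.items())]
        # Translate the outgoing answer back into the global schema.
        local_to_global = {loc: glob
                           for glob, loc in self.global_to_local.items()}
        return [{local_to_global[a]: v for a, v in r.items()} for r in hits]


class Mediator:
    """Hypothetical mediator: holds no data, only forwards queries."""

    def __init__(self, wrappers: List[Wrapper]) -> None:
        self.wrappers = wrappers

    def query(self, conditions: Row) -> List[Row]:
        answer: List[Row] = []
        for wrapper in self.wrappers:
            answer.extend(wrapper.query(conditions))
        return answer


# Two made-up sources whose schemas disagree on attribute names.
source_1 = Wrapper(
    rows=[{"name": "Ann", "addr": "1 Main St"},
          {"name": "Bob", "addr": "2 Oak Ave"}],
    global_to_local={"name": "name", "address": "addr"},
)
source_2 = Wrapper(
    rows=[{"full_name": "Cy", "address": "3 Elm Rd"}],
    global_to_local={"name": "full_name", "address": "address"},
)
mediator = Mediator([source_1, source_2])

print(mediator.query({"name": "Cy"}))
print(mediator.query({}))
```

A warehouse would instead run the same translation once, ahead of time, and copy the translated rows into a central store.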
Federations
[Figure: sources linked pairwise through wrappers, labeled "n(n-1) interface".]

Warehouse Diagram
[Figure: Source 1 and Source 2 each feed a wrapper, and the wrappers feed the warehouse. Label: "Updates?"]

A Mediator
[Figure: a user query goes to the mediator, which sends queries to the wrappers for Source 1 and Source 2; each wrapper queries its source, and results flow back through the wrappers and the mediator to the user.]

Two Mediation Approaches
1. Global as View: Mediator processes queries into steps executed at sources.
2. Local as View: Sources are defined in terms of global relations; mediator finds all ways to build query from views.

Data Mining
• Discover patterns (knowledge discovery)
• Statistical data analysis
• DM applications
  - Decision tree
  - Clustering

Distributed Databases
• Transaction processing involves multiple databases that are distributed
  - How do we know that a transaction is completed?
  - What is a distributed ACID transaction?
  - How can we deal with failures?
  - What is replicated, what is not?
  - Serializability?

Distributed Transactions
• Need communication between distributed components
• Coordinator: distributes transaction components and handles commit/abort statements
• Supporting distributed atomicity
  - Two-phase commit

Replicated Databases
• Increase availability
• How to do updates?
  - Optimistic replica control
    • Availability always, resolve conflicts later
  - Pessimistic replica control
    • Consistency always, may lose availability

...
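
The two-phase commit named under Distributed Transactions can be sketched roughly as follows. This is a minimal illustration, not a full protocol: the names Participant, Vote, and two_phase_commit are hypothetical, and the sketch assumes no crashes, timeouts, or logging, which a real coordinator would have to handle.

```python
from enum import Enum
from typing import List


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Participant:
    """Hypothetical transaction component at one distributed site."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "working"

    def prepare(self) -> Vote:
        # Phase 1: the site votes on whether it is able to commit.
        if self.can_commit:
            self.state = "prepared"
            return Vote.COMMIT
        self.state = "aborted"
        return Vote.ABORT

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Coordinator logic: commit only if every participant votes to commit."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: broadcast one global decision to every participant.
    if all(v is Vote.COMMIT for v in votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


sites = [Participant("site-1"), Participant("site-2", can_commit=False)]
print(two_phase_commit(sites), [p.state for p in sites])
```

Because the decision is taken only after all votes are in, either every site commits or every site aborts, which is the atomicity property the slide refers to.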

This note was uploaded on 12/13/2011 for the course CSCE 520 taught by Professor Farkas during the Spring '11 term at South Carolina.
