09-dwdm - Announcements (Thu. Sep. 29) Data Warehousing and...

Slide 1: Data Warehousing and Data Mining
CPS 116 Introduction to Database Systems

Slide 2: Announcements (Thu. Sep. 29)
- Homework #2 due next Tuesday
  - Sample solution available next Wednesday
- Midterm exam next Thursday in class
  - Open book, open notes
  - Sample midterm solution (from 2009) available today
    - Sample midterm (2009) was handed out on Tuesday
- Part of the lecture next Tuesday will be reserved for midterm review
  - Feel free to bring your questions

Slide 3: Data integration
- Data resides in many distributed, heterogeneous OLTP (On-Line Transaction Processing) sources
  - Sales, inventory, customer, …
  - NC branch, NY branch, CA branch, …
- Need to support OLAP (On-Line Analytical Processing) over an integrated view of the data
- Possible approaches to integration
  - Eager: integrate in advance and store the integrated data at a central repository called the data warehouse
  - Lazy: integrate on demand; process queries over distributed sources (mediated or federated systems)

Slide 4: OLTP versus OLAP
OLTP
- Mostly updates
- Short, simple transactions
- Clerical users
- Goal: transaction throughput
OLAP
- Mostly reads
- Long, complex queries
- Analysts, decision makers
- Goal: fast queries
Implications on database design and optimization?
- OLAP databases do not care much about redundancy
  - “Denormalize” tables
  - Many, many indexes
  - Precomputed query results

Slide 5: Eager versus lazy integration
Eager (warehousing)
- In advance: before queries
- Copy data from sources
=> Answer could be stale
=> Need to maintain consistency
=> Query processing is local to the warehouse
  - Faster
  - Can operate when sources are unavailable
Lazy
- On demand: at query time
- Leave data at sources
=> Answer is more up-to-date
=> No need to maintain consistency
=> Sources participate in query processing
  - Slower
  - Interferes with local processing

Slide 6: Maintaining a data warehouse
- The “ETL” process
  - Extraction: extract relevant data and/or changes from sources
  - Transformation: transform data to match the warehouse schema
  - Loading: integrate data/changes into the warehouse
- Approaches
  - Recomputation
    - Easy to implement; just take periodic dumps of the sources, say, every night
    - What if there is no “night,” e.g., a global organization?
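
The ETL steps and the recomputation approach can be sketched in a few lines of Python using the standard-library sqlite3 module. This is only an illustrative sketch, not the course's implementation: the two branch databases, their schemas, the table name sales_fact, and the helper functions make_source, extract_transform, and recompute are all hypothetical names made up for the example.

```python
import sqlite3

def make_source(rows, schema_sql, insert_sql):
    """Create a hypothetical in-memory OLTP source and fill it with rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn

# Two hypothetical branch databases with different (heterogeneous) schemas.
nc = make_source(
    [("widget", 3, 2.50), ("gadget", 1, 10.00)],
    "CREATE TABLE sales (item TEXT, qty INTEGER, unit_price REAL)",
    "INSERT INTO sales VALUES (?, ?, ?)",
)
ny = make_source(
    [("widget", 400), ("gizmo", 1250)],
    "CREATE TABLE orders (product TEXT, total_cents INTEGER)",
    "INSERT INTO orders VALUES (?, ?)",
)

def extract_transform():
    """Extract rows from each source; transform to (branch, product, amount)."""
    for item, qty, price in nc.execute("SELECT item, qty, unit_price FROM sales"):
        yield ("NC", item, qty * price)
    for product, cents in ny.execute("SELECT product, total_cents FROM orders"):
        yield ("NY", product, cents / 100.0)

def recompute(warehouse):
    """Load by recomputation: drop the warehouse table and rebuild it in full."""
    with warehouse:  # commits on success, rolls back on error
        warehouse.execute("DROP TABLE IF EXISTS sales_fact")
        warehouse.execute(
            "CREATE TABLE sales_fact (branch TEXT, product TEXT, amount REAL)")
        warehouse.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?)", extract_transform())

warehouse = sqlite3.connect(":memory:")
recompute(warehouse)

# An OLAP-style query is answered from the warehouse alone; sources are not touched.
totals = dict(warehouse.execute(
    "SELECT product, SUM(amount) FROM sales_fact GROUP BY product"))
print(totals)
```

Any row added to a source after recompute runs is invisible to the warehouse until the next recompute, which is the staleness trade-off of the eager approach; calling recompute on a schedule plays the role of the nightly dump.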

This document was uploaded on 01/17/2012.

