5 - Extraction Transformation Loading...

Data Warehousing (SS ZG515)
Extraction - Transformation - Loading
Prof. Navneet Goyal/Vikas Singh
Computer Science Department
BITS, Pilani
03/15/10
To discuss…
1. Requirements
2. Data Structures
3. Extraction
4.
5. Delivering Dimension Tables
6. Delivering Fact Tables
1. Requirements
ETL
“A properly designed ETL system extracts data from the source systems, enforces data quality and consistency standards, conforms data so that separate sources can be used together, and finally delivers data in a presentation-ready format so that application developers can build applications and end users can make decisions… ETL makes or breaks the data warehouse…”
Ralph Kimball
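The four activities named in the quote (extract, enforce quality, conform, deliver) can be sketched as a small pipeline. This is only an illustration: the two source systems, their field names and the single quality rule are hypothetical and do not come from the slides.

```python
# Hypothetical rows from two source systems with different field names.
CRM_ROWS = [{"cust": " Alice ", "country": "in"}, {"cust": "", "country": "US"}]
BILLING_ROWS = [{"customer_name": "Bob", "country_code": "us"}]

# Which field holds the customer name in each (assumed) source.
NAME_FIELD = {"crm": "cust", "billing": "customer_name"}
COUNTRY_FIELD = {"crm": "country", "billing": "country_code"}


def extract():
    """Pull rows from every source and tag each row with its origin."""
    return [("crm", r) for r in CRM_ROWS] + [("billing", r) for r in BILLING_ROWS]


def clean(tagged_rows):
    """Enforce one quality rule: the customer name must not be blank."""
    return [(src, r) for src, r in tagged_rows if r[NAME_FIELD[src]].strip()]


def conform(tagged_rows):
    """Map every source onto one shared schema so the rows can be combined."""
    return [
        {
            "name": r[NAME_FIELD[src]].strip(),
            "country": r[COUNTRY_FIELD[src]].upper(),
        }
        for src, r in tagged_rows
    ]


def deliver(rows):
    """Return presentation-ready rows in a stable order."""
    return sorted(rows, key=lambda r: r["name"])


result = deliver(conform(clean(extract())))
print(result)
```

The blank CRM row is rejected in the cleaning step, and the remaining rows from both sources end up in the same shape, which is what lets separate sources be used together.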
Requirements
Business Needs
- Information requirements of the end users.
- Captured through interviews with users and through the ETL team's independent investigation of the possible sources.
Compliance Requirements
- Sarbanes-Oxley Act 2002 (deals with the regulation of corporate governance; more at http://www.soxlaw.com).
- Proof of the complete transaction flow that changed any data.
Requirements (contd…)
Data Profiling
- Systematic examination of the quality, scope and context of a data source.
- Helps the ETL team determine how much data cleaning is required.
- “[Data Profiling] employs analytic methods for looking at data for the purpose of developing a thorough understanding of the content, structure and the quality of the data. A good data profiling [system] can process very large amounts of data, and with the skills of the analyst, uncover all sorts of issues that need to be addressed.” Jack Olson
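As a rough illustration of what a profiling pass reports, the sketch below counts rows, missing values and distinct values for one column. The function name `profile_column` and the sample data are hypothetical; real profiling tools examine far more (patterns, ranges, cross-column rules).

```python
from collections import Counter


def profile_column(rows, column):
    """Summarise one column: row count, missing values, distinct values."""
    values = [row.get(column) for row in rows]
    # Treat both absent keys (None) and empty strings as missing.
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical extract from a source table with inconsistent state codes.
sample = [{"state": "RJ"}, {"state": "rj"}, {"state": ""}, {"state": "RJ"}, {}]
report = profile_column(sample, "state")
print(report)
```

Here the report shows two missing values and two distinct spellings of what is probably one state code, which is the kind of finding that tells the ETL team how much cleaning to plan for.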
Requirements (contd…)
Security Requirements
- The ETL team has complete read/write access to the entire corporate data.
- ETL workstations on the company intranet are a major threat. Keep them in a separate subnet behind a packet-filtering gateway.
- Secure the backups as well.
Data Integration
- Identified as the conform step.
- Conform dimensions and conform facts.
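A minimal sketch of the conform step, assuming two sources that identify the same products with different local codes. The mapping tables, source names and surrogate keys are all hypothetical; the point is only that both sources resolve to one shared dimension key, so their facts can be added together.

```python
# Shared (conformed) product dimension: natural key -> surrogate key.
PRODUCT_DIM = {"P-100": 1, "P-200": 2}

# Assumed per-source mappings from local product codes to the natural key.
SOURCE_CODE_MAP = {
    "store": {"widget": "P-100", "gadget": "P-200"},
    "web": {"W1": "P-100", "G7": "P-200"},
}


def conform_product_key(source, local_code):
    """Translate a source-specific product code to the shared surrogate key."""
    natural_key = SOURCE_CODE_MAP[source][local_code]
    return PRODUCT_DIM[natural_key]


# Facts from both sources use the same measure (units sold).
store_sales = [("widget", 3), ("gadget", 1)]
web_sales = [("W1", 2)]

totals = {}
for source, sales in (("store", store_sales), ("web", web_sales)):
    for code, quantity in sales:
        key = conform_product_key(source, code)
        totals[key] = totals.get(key, 0) + quantity

print(totals)
```

Because "widget" and "W1" map to the same dimension key, the store and web quantities roll up into one total per product.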
Requirements (contd…)
Data Latency
- How quickly can the data be delivered to the users?
- The ETL architecture has a direct impact on it.
- Stage data after each major transformation, not just after all the four
- Each archived/staged data set should have accompanying metadata.
- Tracking this lineage is explicitly required by certain compliance
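One way to stage data after each transformation together with its metadata is sketched below. The file layout, the `stage` helper and the metadata fields (step name, parent step, row count, checksum, timestamp) are assumptions chosen for illustration, not something the slides prescribe.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def stage(rows, step, staging_dir, parent=None):
    """Write a staged data set plus a metadata file describing it."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    (Path(staging_dir) / f"{step}.json").write_bytes(payload)
    metadata = {
        "step": step,
        "parent_step": parent,  # which staged set this one was derived from
        "row_count": len(rows),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "staged_at": datetime.now(timezone.utc).isoformat(),
    }
    (Path(staging_dir) / f"{step}.meta.json").write_text(json.dumps(metadata, indent=2))
    return metadata


with tempfile.TemporaryDirectory() as tmp:
    extracted = [{"id": 1, "amount": "10"}, {"id": 2, "amount": None}]
    m1 = stage(extracted, "extract", tmp)

    # A transformation step: drop rows with a missing amount, then stage again.
    cleaned = [r for r in extracted if r["amount"] is not None]
    m2 = stage(cleaned, "clean", tmp, parent="extract")

    staged_files = sorted(p.name for p in Path(tmp).iterdir())

print(m2["parent_step"], m2["row_count"], staged_files)
```

Following the `parent_step` links from any staged set back to the extract gives a simple lineage trail, and the checksum lets an auditor confirm that an archived set has not changed since it was staged.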

This note was uploaded on 03/14/2010 for the course CSE SS ZG515 taught by Professor Navneet Goyal during the Summer '10 term at Birla Institute of Technology & Science.
