Lecture #2 - Data Warehouse Architecture - SS ZG515: Data...

Info iconThis preview shows pages 1–4. Sign up to view the full content.

View Full Document Right Arrow Icon
SS ZG515: Data Warehousing Lecture Note # 02 The data in a data warehouse comes from operational systems of the organization as well as from other external sources. These are collectively referred to as source systems . The data extracted from source systems is stored in a area called data staging area , where the data is cleaned, transformed , combined, deduplicated to prepare the data for us in the data warehouse. The data staging area is generally a collection of machines where simple activities like sorting and sequential processing takes place. The data staging area does not provide any query or presentation services. As soon as a system provides query or presentation services, it is categorized as a presentation server . A presentation server is the target machine on which the data is loaded from the data staging area organized and stored for direct querying by end users, report writers and other applications. The three different kinds of systems that are required for a data warehouse are: 1. Source Systems 2. Data Staging Area 3. Presentation servers The data travels from source systems to presentation servers via the data staging area. The entire process is popularly known as ETL (extract, transform, and load) or ETT (extract, transform, and transfer). Oracle’s ETL tool is called Oracle Warehouse Builder (OWB) and MS SQL Server’s ETL tool is called Data Transformation Services (DTS). A typical architecture of a data warehouse is shown below: Dr. Navneet Goyal, BITS, Pilani Page 1 of 6 META  QUERY MANAGER OPERATIONAL  SOURCE LOAD MANAGER WAREHOUSE MANAGER DETAILED DATA LIGHTLY SUMMERIZED  DATA HIGHLY  SUMMERIZED DATA ARCHIVE / BACK UP  END USER  ACCESS TOOLS
Background image of page 1

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon
SS ZG515: Data Warehousing Each component and the tasks performed by them are explained below: 1. OPERATIONAL DATA The sources of data for the data warehouse is supplied from: (i) The data from the mainframe  systems in the traditional network and  hierarchical format. (ii) Data can also come from the relational DBMS like Oracle, Informix. (iii) In addition to these internal data, operational data also includes external  data obtained from commercial databases and databases associated with  supplier and customers. 2. LOAD MANAGER The load manager performs all the operations associated with extraction and loading data  into the data warehouse. These operations include simple transformations of the data to  Dr. Navneet Goyal, BITS, Pilani Page 2 of 6
Background image of page 2
SS ZG515: Data Warehousing prepare the data for entry into the warehouse. The size and complexity of this component  will vary between data warehouses and may be constructed using a combination of  vendor data loading tools and custom built programs. 3.
Background image of page 3

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon
Image of page 4
This is the end of the preview. Sign up to access the rest of the document.

This note was uploaded on 04/21/2010 for the course CSIS SS G515 taught by Professor Prof.yash during the Winter '10 term at Birla Institute of Technology & Science, Pilani - Hyderabad.

Page1 / 6

Lecture #2 - Data Warehouse Architecture - SS ZG515: Data...

This preview shows document pages 1 - 4. Sign up to view the full document.

View Full Document Right Arrow Icon
Ask a homework question - tutors are online