06-Unit6 - Business Intelligence and Tools Unit 6 Unit 6...

Business Intelligence and Tools, Unit 6
Sikkim Manipal University

Unit 6 Data Extraction

Structure
6.1 Introduction
    Objectives
6.2 ETL Overview
    6.2.1 Significance of ETL Processes
    6.2.2 ETL Requirements and Steps
    Self Assessment Question(s) (SAQs)
6.3 Overview of the Data Extraction Process
    Self Assessment Question(s) (SAQs)
6.4 Types of Data in Operational Systems
    6.4.1 Current Value
    6.4.2 Periodic Status
    Self Assessment Question(s) (SAQs)
6.5 Source Identification
    Self Assessment Question(s) (SAQs)
6.6 Data Extraction Techniques
    6.6.1 Immediate Data Extraction
    6.6.2 Deferred Data Extraction
    Self Assessment Question(s) (SAQs)
6.7 Evaluation of the Techniques
    Self Assessment Question(s) (SAQs)
6.8 Summary
6.9 Terminal Questions (TQs)
6.10 Multiple Choice Questions (MCQs)
6.11 Answers to SAQs, TQs, and MCQs
    6.11.1 Answers to Self Assessment Questions (SAQs)
    6.11.2 Answers to Terminal Questions (TQs)
    6.11.3 Answers to Multiple Choice Questions (MCQs)

6.1 Introduction
In this Unit, we discuss the process of extracting data from the source systems into data warehouses or data marts. As already discussed, data extraction is the first step in executing the ETL (Extraction, Transformation, and Loading) functions that build a data warehouse. Data can be extracted from an OLTP database as well as from non-OLTP sources, such as text files, legacy systems, and spreadsheets. The extraction process is complex because of the tremendous diversity that exists among source systems in practice.

Objectives:
The objectives of the Unit are to help you understand:
    Source identification for the extraction of data.
    The various methods used for data extraction.
    Evaluation of the extraction techniques.
    Exception handling when some data has not been extracted properly.

6.2 ETL Overview
Most of the information contained in a data warehouse comes from the operational systems. However, the operational systems themselves cannot be used to provide strategic information. You therefore need to understand carefully what distinguishes the data in the source operational systems from the information in the data warehouse. It is the ETL functions that reshape the relevant data from the source systems into useful information to be stored in the data warehouse. Without these functions, there would be no strategic information in a data warehouse.

6.2.1 Significance of ETL Processes
The ETL functions act as back-end processes that cover the extraction of data from the source systems. They also include all the functions and procedures for changing the source data into the exact formats and structures appropriate for storage in the data warehouse database. After the transformation of the data, the processes include all processes that