DW Lecture III (2)

Data Warehousing and Business Intelligence using SAS (Lecture III)
By Dr. Atanu Rakshit

Course Overview
The course: what and how
0. Introduction: The Past and The Problem
I. Data Warehousing
II. Decision Support and OLAP
III. Data Mining
IV. Usage of SAS for DW and DM
V. Business Intelligence and its use
VI. Looking Ahead

Components of the Warehouse
- Data Extraction and Cleansing
- Data Transformation and Metadata
- Data Integrity
- Data Loading
- Data Refreshing
- Structuring and Modeling Issues

Loading the Warehouse
Cleaning the data before it is loaded

Source Data
- Typically host-based, legacy applications
  - Customized applications, COBOL, 3GL, 4GL
- Point-of-contact devices
  - POS, ATM, call switches
- External sources
  - Nielsen’s, Acxiom, CMIE, vendors, partners
[Figure: operational/source data stores labeled Sequential, Legacy, Relational, External]

Data Quality - The Reality
- It is tempting to think that creating a data warehouse is simply a matter of extracting operational data and entering it into the warehouse
- Nothing could be farther from the truth
- Warehouse data comes from disparate, questionable sources

Data Quality - The Reality (continued)
- Legacy systems that are no longer documented
- Outside sources with questionable quality procedures
- Production systems with no built-in integrity checks and no integration
  - Operational systems are usually designed to solve a specific business problem and are rarely developed to a corporate plan
  - “And get it done quickly, we do not have time to worry about corporate standards...”

Data Integration Across Sources
Source systems: Trust, Credit card, Savings, Loans
- Same data, different name
- Different data, same name
- Data found here, nowhere else
- Different keys, same data
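
The "different keys, same data" case can be illustrated with a key crosswalk: a lookup table that maps each source system's own key to one warehouse-wide customer key. What follows is only a minimal Python sketch under stated assumptions. The system names reuse the slide's labels, while the key values, the `CROSSWALK` table, and the `to_warehouse_key` and `integrate` helpers are hypothetical and not taken from the lecture.

```python
# Hypothetical crosswalk: (source system, source key) -> warehouse customer key.
# All key values below are invented for illustration.
CROSSWALK = {
    ("savings", "SV-1001"): "CUST-1",
    ("loans", "LN-77"): "CUST-1",       # same customer, different key
    ("credit_card", "4455"): "CUST-2",
}


def to_warehouse_key(system, source_key):
    """Return the warehouse key for a source record, or None if unmapped."""
    return CROSSWALK.get((system, source_key))


def integrate(records):
    """Group (system, source_key, payload) records by warehouse key.

    Records without a crosswalk entry are returned separately so they can
    be reviewed instead of being silently dropped.
    """
    merged, unmapped = {}, []
    for system, source_key, payload in records:
        key = to_warehouse_key(system, source_key)
        if key is None:
            unmapped.append((system, source_key, payload))
        else:
            merged.setdefault(key, []).append((system, payload))
    return merged, unmapped
```

Keeping unmapped records in their own list reflects the "data found here, nowhere else" case: such records have no counterpart to merge with and need a separate decision.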

Data Transformation Example
- encoding: appl A - m,f; appl B - 1,0; appl C - x,y; appl D - male, female
- unit: appl A - pipeline - cm; appl B - pipeline - in; appl C - pipeline - feet; appl D - pipeline - yds
- field: appl A - balance; appl B - bal; appl C - currbal; appl D - balcurr
Target: Data Warehouse
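
The three kinds of mismatch on this slide (encoding, unit, field name) can each be handled with a per-application lookup table. The Python sketch below is illustrative only. The slide does not say which convention the warehouse adopts, so the target choices here (`m`/`f` codes, centimetres, the field name `balance`) are assumptions, as is the pairing of `1`/`x` with male and `0`/`y` with female. The input field names `gender` and `pipeline` and the `transform` helper are hypothetical.

```python
# Per-application conventions taken from the slide. The warehouse-side
# choices (m/f codes, centimetres, the name "balance") are assumptions.
GENDER_MAP = {
    "A": {"m": "m", "f": "f"},
    "B": {"1": "m", "0": "f"},           # assumed: 1 = male, 0 = female
    "C": {"x": "m", "y": "f"},           # assumed: x = male, y = female
    "D": {"male": "m", "female": "f"},
}
# Centimetres per source unit: cm, in, feet, yds.
CM_PER_UNIT = {"A": 1.0, "B": 2.54, "C": 30.48, "D": 91.44}
# Name of the balance field in each application.
BALANCE_FIELD = {"A": "balance", "B": "bal", "C": "currbal", "D": "balcurr"}


def transform(app, record):
    """Convert one source record to the assumed warehouse conventions."""
    return {
        "gender": GENDER_MAP[app][str(record["gender"]).lower()],
        "pipeline_cm": round(float(record["pipeline"]) * CM_PER_UNIT[app], 2),
        "balance": record[BALANCE_FIELD[app]],
    }
```

For example, `transform("B", {"gender": 1, "pipeline": 10, "bal": 250.0})` yields a record with `gender` set to `"m"`, a length expressed in centimetres, and the amount stored under `balance`. An unknown code raises a `KeyError`, which surfaces bad input instead of loading it silently.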

Data Integrity Problems
- Same person, different spellings
  - Agarwal, Agrawal, Aggarwal, etc.
- Multiple ways to denote a company name
  - Persistent Systems, PSPL, Persistent Pvt. LTD.
- Use of different names
  - mumbai, bombay
- Different account numbers generated by different applications for the same customer
- Required fields left blank
- Invalid product codes collected at point of sale
  - Manual entry leads to mistakes
  - “In case of a problem use 9999999”
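
Several of these problems can be caught with simple rule-based checks before loading. The Python sketch below is a rough illustration, not a method from the lecture. The alias tables are built from the slide's examples, but which variant counts as canonical is an assumption, and the record schema (`customer`, `city`, `company`, `product_code`) and the `clean` helper are hypothetical. Spelling variants of personal names (Agarwal, Agrawal, Aggarwal) usually need approximate matching, which is not shown here.

```python
# Illustrative lookup tables built from the slide's examples. Which variant
# is treated as canonical is an assumption made for this sketch.
CITY_ALIASES = {"bombay": "mumbai"}
COMPANY_ALIASES = {
    "pspl": "Persistent Systems",
    "persistent pvt. ltd.": "Persistent Systems",
    "persistent systems": "Persistent Systems",
}
PLACEHOLDER_CODES = {"9999999"}   # "in case of a problem use 9999999"
REQUIRED_FIELDS = ("customer", "city", "product_code")  # hypothetical schema


def clean(record):
    """Return (cleaned_record, problems) for one point-of-sale record.

    Values are expected to be strings or None.
    """
    problems = []
    cleaned = {k: (v or "").strip() for k, v in record.items()}
    # Required fields left blank.
    for field in REQUIRED_FIELDS:
        if not cleaned.get(field):
            problems.append(f"missing {field}")
    # Placeholder codes typed in when manual entry failed.
    if cleaned.get("product_code") in PLACEHOLDER_CODES:
        problems.append("placeholder product code")
    # Different names for the same city or company.
    city = cleaned.get("city", "").lower()
    cleaned["city"] = CITY_ALIASES.get(city, city)
    company = cleaned.get("company", "")
    cleaned["company"] = COMPANY_ALIASES.get(company.lower(), company)
    return cleaned, problems
```

Returning the list of problems alongside the cleaned record lets the load step decide whether to reject the record, route it for review, or load it with a flag.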

Getting the Data In