Chapter 6 - Chapter 6. Founda0ons of Business...

Info iconThis preview shows page 1. Sign up to view the full content.

View Full Document Right Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: Chapter 6. Founda0ons of Business Intelligence: Databases and Informa0on Management Part 1. Tradi0onal Data Management: A File ­Oriented Approach Learning Objec0ves •  Describe how the problems of managing data resources in a tradi0onal file environment are solved by a database management system •  Describe the capabili0es and value of a database management system •  Apply important database design principles •  Evaluate tools and technologies for accessing informa0on from databases to improve business performance and decision making •  Assess the role of informa0on policy, data administra0on, and data quality assurance in the management of firm’s data resources Produced by Dr. Brian Janz 1 File Organiza0on Concepts •  Computer system organizes data in a hierarchy –  Field: Group of characters as word(s) or number –  Record: Group of related fields –  File: Group of records of same type –  Database: Group of related files •  Record: Describes an en0ty •  En0ty: Person, place, thing on which we store informa0on –  ATribute: Each characteris0c, or quality, describing en0ty, e.g., ATributes Date or Grade belong to en0ty COURSE The Data Hierarchy Figure 6-1 Produced by Dr. Brian Janz 2 Tradi0onal File Processing Figure 6-2 Problems with the Tradi0onal File Environment •  Data redundancy and inconsistency –  Data redundancy: Presence of duplicate data in mul0ple files –  Data inconsistency: Same aTribute has different values •  Program ­data dependence: –  When changes in program requires changes to data accessed by program •  Lack of flexibility •  Poor security •  Lack of data sharing and availability Produced by Dr. Brian Janz 3 Chapter 6. Founda0ons of Business Intelligence: Databases and Informa0on Management Part 2. The Database Approach to Data Management Databases and DBMS’s •  Database –  Collec0on of data organized to serve many applica0ons by centralizing data and controlling redundant data –  Interfaces between applica0on programs and physical data files –  Separates logical and physical views of data –  Solves problems of tradi0onal file environment •  Controls redundancy •  Eliminates inconsistency •  Uncouples programs and data •  Enables organiza0on to central manage data and data security •  Database management system Produced by Dr. Brian Janz 4 Human Resources Database with Mul0ple Views Figure 6-3 The Rela0onal DBMS •  Represent data as two ­dimensional tables called rela0ons or files •  Each table contains data on en0ty and aTributes •  Table: grid of columns and rows –  Rows (tuples): Records for different en00es –  Fields (columns): Represents aTribute for en0ty –  Key field: Field used to uniquely iden0fy each record –  Primary key: Field in table used for key fields –  Foreign key: Primary key used in second table as look ­ up field to iden0fy records from original table Produced by Dr. Brian Janz 5 Rela0onal Database Tables Figure 6-4A Rela0onal Database Tables (cont.) Figure 6-4B Produced by Dr. Brian Janz 6 Opera0ons of a Rela0onal DBMS •  Three basic opera0ons used to develop useful sets of data – SELECT: Creates subset of data of all records that meet stated criteria – JOIN: Combines rela0onal tables to provide user with more informa0on than available in individual tables – PROJECT: Creates subset of columns in table, crea0ng tables with only the informa0on specified The Three Basic Opera0ons of a Rela0onal DBMS The select, project, and join operations enable data from two different tables to be combined and only selected attributes to be displayed. Figure 6-5 Produced by Dr. Brian Janz 7 Capabili0es of Database Management Systems •  Data defini0on capability: Specifies structure of database content, used to create tables and define characteris0cs of fields •  Data dic0onary: Automated or manual file storing defini0ons of data elements and their characteris0cs •  Data manipula0on language: Used to add, change, delete, retrieve data from database –  Structured Query Language (SQL) –  Microsob Access user tools for genera0on SQL •  Many DBMS have report genera0on capabili0es for crea0ng polished reports (Crystal Reports) Example of an SQL Query Illustrated here are the SQL statements for a query to select suppliers for parts 137 or 150. They produce a list with the same results as Figure 6-5. Figure 6-7 Produced by Dr. Brian Janz 8 A Microsob Access Query Illustrated here is how the query in Figure 6-7 would be constructed using query-building tools in the Access Query Design View. It shows the tables, fields, and selection criteria used for the query. Figure 6-8 Designing Databases •  Conceptual (logical) design: abstract model from business perspec0ve •  Physical design: How database is arranged on direct ­access storage devices •  Design process iden0fies –  Rela0onships among data elements, redundant database elements –  Most efficient way to group data elements to meet business requirements, needs of applica0on programs –  Streamlining complex groupings of data to minimize redundant data elements and awkward many ­to ­many rela0onships •  Normaliza0on Produced by Dr. Brian Janz 9 An unnormalized relation contains repeating groups. For example, there can be many parts and suppliers for each order. There is only a one-to-one correspondence between Order_Number and Order_Date. Figure 6-9 Normalized Tables Created from Order After normalization, the original relation ORDER has been broken down into four smaller relations. The relation ORDER is left with only two attributes and the relation LINE_ITEM has a combined, or concatenated, key consisting of Order_Number and Part_Number. Figure 6-10 Produced by Dr. Brian Janz 10 Designing Databases (cont’d) •  En0ty ­rela0onship diagram –  Used by database designers to document the data model –  Illustrates rela0onships between en00es •  Distribu0ng databases: Storing database in more than one place –  Par00oned: Separate loca0ons store different parts of database –  Replicated: Central database duplicated in en0rety at different loca0ons –  Advantages •  Reduced vulnerability •  Increased responsiveness –  Drawbacks •  Departures from using standard defini0ons •  Security problems An En0ty ­Rela0onship Diagram This diagram shows the relationships between the entities ORDER, LINE_ITEM, PART, and SUPPLIER that might be used to model the database in Figure 6-10. Figure 6-11 Produced by Dr. Brian Janz 11 Distributed Databases Figure 6-12 There are alternative ways of distributing a database. The central database can be partitioned (a) so that each remote processor has the necessary data to serve its own local needs. The central database also can be replicated (b) at all remote locations. Chapter 6. Founda0ons of Business Intelligence: Databases and Informa0on Management Part 3. Using Databases to Improve Business Performance and Decision Making Produced by Dr. Brian Janz 12 Using Databases to Improve Business Performance and Decision Making •  Very large databases and systems require special capabili0es, tools – To analyze large quan00es of data – To access data from mul0ple systems •  Three key techniques – Data warehousing – Data mining – Tools for accessing internal databases through the Web Using Databases to Improve Business Performance and Decision Making •  Data warehouse: –  Stores current and historical data from many core opera0onal transac0on systems –  Consolidates and standardizes informa0on for use across enterprise, but data cannot be altered –  Data warehouse system will provide query, analysis, and repor0ng tools –  Subset of data warehouse –  Summarized or highly focused por0on of firm’s data for use by specific popula0on of users –  Typically focuses on single subject or line of business •  Data marts: Produced by Dr. Brian Janz 13 Components of a Data Warehouse Figure 6-13 Using Databases to Improve Business Performance and Decision Making: Business Intelligence •  Tools for consolida0ng, analyzing, and providing access to vast amounts of data to help users make beTer business decisions •  E.g., Harrah’s Entertainment analyzes customers to develop gambling profiles and iden0fy most profitable customers •  Principle tools include: –  Sobware for database query and repor0ng –  Online analy0cal processing (OLAP) –  Data mining Produced by Dr. Brian Janz 14 Online Analy0cal Processing (OLAP) •  Supports mul0dimensional data analysis – Viewing data using mul0ple dimensions – Each aspect of informa0on (product, pricing, cost, region, 0me period) is different dimension – E.g., how many washers sold in East in June compared with other regions? •  OLAP enables rapid, online answers to ad hoc queries Mul0dimensional Data Model: An OLAP “Data Cube” Figure 6-15 Produced by Dr. Brian Janz 15 Using Databases to Improve Business Performance and Decision Making: Data Mining •  More discovery driven than OLAP •  Finds hidden paTerns, rela0onships in large databases and infers rules to predict future behavior, e.g., Finding paTerns in customer data for one ­to ­one marke0ng campaigns or to iden0fy profitable customers. •  Types of informa0on obtainable from data mining –  Associa0ons –  Sequences –  Classifica0on –  Clustering –  Forecas0ng Using Databases to Improve Business Performance and Decision Making: Data Mining (cont’d) •  Predic0ve analysis –  Uses data mining techniques, historical data, and assump0ons about future condi0ons to predict outcomes of events –  E.g., Probability a customer will respond to an offer or purchase a specific product •  Text mining –  Extracts key elements from large unstructured data sets (e.g., stored e ­mails) Produced by Dr. Brian Janz 16 Using Databases to Improve Business Performance and Decision Making: Data Mining (cont’d) •  Web mining –  Discovery and analysis of useful paTerns and informa0on from WWW, e.g., to understand customer behavior, evaluate effec0veness of Web site, etc. •  Techniques: –  Web content mining: knowledge extracted from content of Web pages –  Web structure mining, e.g., links to/from Web page –  Web usage mining: user interac0on data recorded by Web server Databases and the Web •  Many companies use the Web to make some internal databases available to customers or partners •  Typical configura0on includes: –  Web server –  Applica0on server/middleware/CGI scripts –  Database server (hos0ng DBM) •  Advantages of using Web for database access: –  Ease of use of browser sobware –  Web interface requires few or no changes to database –  Inexpensive to add Web interface to system Produced by Dr. Brian Janz 17 Managing Data Resources: Establishing an Informa0on Policy •  Firm’s rules, procedures, roles for sharing, managing, standardizing data –  E.g., What employees are responsible for upda0ng sensi0ve employee informa0on •  Data administra0on: Firm func0on responsible for specific policies and procedures to manage data •  Data governance: Policies and processes for managing availability, usability, integrity, and security of enterprise data, especially as it relates to government regula0ons •  Database administra0on : Defining, organizing, implemen0ng, maintaining database; performed by database design and management group Managing Data Resources: Ensuring Data Quality •  More than 25% of cri0cal data in Fortune 1000 company databases are inaccurate or incomplete •  Most data quality problems stem from faulty input •  Before new database in place, need to: – Iden0fy and correct faulty data – Establish beTer rou0nes for edi0ng data once database in opera0on Produced by Dr. Brian Janz 18 Managing Data Resources: Ensuring Data Quality (cont’d) •  Data quality audit: –  Structured survey of the accuracy and level of completeness of the data in an informa0on system •  Survey samples from data files, or •  Survey end users for percep0ons of quality •  Data cleansing –  Sobware to detect and correct data that are incorrect, incomplete, improperly formaTed, or redundant –  Enforces consistency among different sets of data from separate informa0on systems Produced by Dr. Brian Janz 19 ...
View Full Document

This note was uploaded on 02/24/2011 for the course MIS 7650 taught by Professor Janz during the Spring '11 term at U. Memphis.

Ask a homework question - tutors are online