  o Push publishing: delivers BI to users without a request
  o Pull publishing: requires the user to request BI results

Data warehouses and data marts: what they are and how they differ

Problems with operational data
• Dirty data
• Missing values
• Inconsistent data
• Data not integrated
• Wrong granularity
  o Too fine
  o Not fine enough
• Too much data
  o Too many attributes
  o Too many data points

Reporting applications
• RFM (recently, frequently, money)
  o Analyzes and ranks customers according to their purchasing patterns
  o R: how recently the customer ordered
  o F: how frequently the customer orders
  o M: how much money the customer has spent
• OLAP
  o Online analytical processing
  o Provides the ability to sum, count, average, and perform other simple arithmetic operations on groups of data
  o Dynamic, hence the term "online"
  o Dimension: a characteristic of a measure (e.g., purchase date, customer type)
  o Measure: the data item of interest, that is, the item to be processed (e.g., total sales, average sales)

Data mining applications
• Data mining is the application of statistical techniques to find patterns and relationships among data for classification and prediction
• Became popular because of the large amount of data produced over 15 years (a result of cheap hardware)
• A convergence of many disciplines
• Two broad categories: unsupervised and supervised
• Two typical techniques: market basket analysis and decision trees

Supervised and unsupervised data mining
• Unsupervised:
  o Analyst does not start with a prior hypothesis or model
  o A hypothesized model is created afterward, based on the analytical results, to explain the patterns found
  o Example: cluster analysis
• Supervised:
  o Uses an a priori model to compute the model's outcome
  o Used for prediction, such as regression analysis
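
The OLAP idea noted above (summing, counting, and averaging a measure over groups defined by a dimension) can be sketched in a few lines of plain Python. This is only an illustration, not how any particular OLAP product works: the records, the field names (`customer_type`, `region`, `amount`), and the helper `summarize` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical sales records; field names and values are illustrative only.
SALES = [
    {"customer_type": "retail", "region": "east", "amount": 120.0},
    {"customer_type": "retail", "region": "west", "amount": 80.0},
    {"customer_type": "wholesale", "region": "east", "amount": 500.0},
    {"customer_type": "wholesale", "region": "east", "amount": 300.0},
]


def summarize(rows, dimension, measure):
    """Group rows by one dimension and compute sum, count, and average of a measure."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {
        key: {
            "sum": sum(values),
            "count": len(values),
            "average": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


# The same measure (amount) viewed along two different dimensions.
by_type = summarize(SALES, "customer_type", "amount")
by_region = summarize(SALES, "region", "amount")

for name, report in (("customer_type", by_type), ("region", by_region)):
    print(name)
    for key, stats in report.items():
        print(f"  {key}: {stats}")
```

Changing the `dimension` argument regroups the same data on request, which is the "dynamic" aspect the notes attach to the term "online".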

Big data applications
• Big data is defined as:
  o Huge volume: petabyte and larger
  o Rapid velocity: generated rapidly
  o Great variety
    ▪ Structured data, free-form text, log files, graphics, audio, and video

BI server
• Web server for publishing BI
• MS SQL Server manager is the most popular today
• Provides two major functions
  o Management (metadata of users, etc.)
  o Delivery

KM, CMS, and expert systems
• KM (knowledge management)
  o The earliest KM systems were expert systems
• CM (content management)
  o Supports management and delivery of documents and other expressions of employee knowledge
  o Challenges of content management
    ▪ Databases are huge
    ▪ Content is dynamic
    ▪ Documents do not exist in isolation
    ▪ Contents are perishable
    ▪ Content exists in many languages
  o In-house custom development
    ▪ Customer support department develops in-house database applications to track customer problems
  o Off the shelf
    ▪ Horizontal market products (SharePoint)
    ▪ Vertical market applications
  o Public search engine
    ▪ Google

CHAPTER 10: