Detailed description of how the data will actually be arranged and stored on physical devices

Non-relational Databases and Databases in the Cloud
  • Non-relational databases: NoSQL
      • More flexible data model
      • Data sets stored across distributed machines
      • Easier to scale
      • Handle large volumes of unstructured and structured data
  • Databases in the cloud
      • Appeal to start-ups, smaller businesses
      • Amazon Relational Database Service, Microsoft SQL Azure
      • Private clouds

Business Intelligence Infrastructure
  • Array of tools for obtaining information from separate systems and from big data
  • Data warehouse
      • Stores current and historical data from many core operational transaction systems
      • Consolidates and standardizes information for use across enterprise, but data cannot be altered
      • Provides analysis and reporting tools
  • Data marts
      • Subset of data warehouses
      • Summarized or highly focused portion of firm’s data for use by specific population of users
      • Typically focuses on single subject or line of business

Business Intelligence Infrastructure Contd.
  • Hadoop
      • Enables distributed parallel processing of big data across inexpensive computers
      • Key services
          • Hadoop Distributed File System (HDFS): data storage
          • MapReduce: breaks data into clusters for work
          • HBase: NoSQL database
      • Used by Yahoo, NextBio

Online Analytical Processing (OLAP)
  • Supports multidimensional data analysis
      • Viewing data using multiple dimensions
      • Each aspect of information (product, pricing, cost, region, time period) is a different dimension
  • OLAP enables rapid, online answers to ad hoc queries

Analytical Tools: Relationships, Patterns, Trends
  • Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions
  • Tools include
      • Multidimensional data analysis (OLAP)
      • Text mining
          • Extracts key elements from large unstructured data sets
      • Web mining
          • Discovery and analysis of useful patterns and information from web
          • Web content/structure/usage mining
      • Sentiment analysis
          • Mines text comments in e-mail, blog, social media conversation, or survey to detect favourable and unfavourable opinions about specific subjects

Data Mining
  • Finds hidden patterns, relationships in datasets
      • Example: customer buying patterns
  • Infers rules to predict future behavior
  • Types of information obtainable from data mining:
      • Associations
      • Sequences
      • Classification
      • Clustering
      • Forecasting

Databases and the Web
  • Many companies use the web to make some internal databases available to customers or partners
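The MapReduce service listed under Hadoop in these notes can be pictured with a small single-machine sketch. This is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample text are assumptions made up for illustration. It only shows the general idea of splitting data into chunks, processing each chunk independently, and combining the partial results.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: combine the values for each key into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map, shuffle, and reduce in sequence over a list of text chunks."""
    pairs = []
    for chunk in chunks:
        # On a real cluster each chunk could be mapped on a different
        # machine; here everything runs in one process for illustration.
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


# Hypothetical sample data: three chunks standing in for three data blocks.
chunks = ["big data big", "data warehouse", "big warehouse"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped without reference to the others, the map step is the part that could be spread across many inexpensive computers.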
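The OLAP idea of viewing the same data along several dimensions can be sketched in a few lines. The sales records, the dimension names chosen, and the roll_up helper below are all hypothetical and exist only for illustration; a real OLAP tool would work against a data warehouse or data mart rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical fact records: each row has three dimensions and one measure.
SALES = [
    {"product": "nuts", "region": "East", "period": "Q1", "units": 50},
    {"product": "nuts", "region": "West", "period": "Q1", "units": 30},
    {"product": "bolts", "region": "East", "period": "Q1", "units": 20},
    {"product": "nuts", "region": "East", "period": "Q2", "units": 40},
    {"product": "bolts", "region": "West", "period": "Q2", "units": 10},
]


def roll_up(facts, dimensions, measure="units"):
    """Total the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same data answered along different dimensions (ad hoc queries).
by_product = roll_up(SALES, ["product"])
by_product_region = roll_up(SALES, ["product", "region"])
print(by_product)
print(by_product_region)
```

Changing the list of dimensions passed to roll_up is the toy equivalent of asking a new ad hoc question of the same data, such as units by region and period instead of units by product.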
  • Advantages of using the web for database access
      • Ease of use of browser software
      • Web interface requires few or no changes to database
      • Inexpensive to add web interface to system

Week 6 Notes (Session #4 – Slides)

Types of Decisions
  • Unstructured
      • Decision maker must provide judgment, evaluation, and insight to solve problem
  • Structured

  • Fall '12
  • AlecCram
  • Data Mining, Database management system