Chapter 4 Dimension Reduction
Data Mining for Business Intelligence
Shmueli, Patel & Bruce
Galit Shmueli and Peter Bruce 2010
Exploring the data
Statistical summary of data: common metrics
Average
Median
Minimum
Maximum
Standard deviation
Counts & percen
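The common metrics listed above can be computed with Python's standard library. This is a minimal sketch, not the book's procedure: the `calories` values are made up for illustration and are not taken from the cereal data set.

```python
import statistics
from collections import Counter

# Hypothetical sample values, invented for this sketch (not from the book's data).
calories = [70, 110, 110, 120, 90, 110, 100]

summary = {
    "average": statistics.mean(calories),
    "median": statistics.median(calories),
    "minimum": min(calories),
    "maximum": max(calories),
    "std_dev": statistics.stdev(calories),  # sample standard deviation (n - 1)
}

# Counts and percentages of each distinct value.
counts = Counter(calories)
percentages = {value: 100 * n / len(calories) for value, n in counts.items()}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

`statistics.stdev` uses the sample (n - 1) formula; `statistics.pstdev` would give the population version if that is what an exercise expects.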
Sangeeta Naidu
Assignment 3- Chapter 4
October 16th 2015
Dr Tuan Tran
Data Mining and Distributed Computing
Q 1. Breakfast Cereals. Use the data for the breakfast cereal
example in Section 4.7 to explore and summarize the data as
follows: (Note that a few
Social Networking Site
Social Networking Site
Nikhil Bhatagalikar (NBHATA1304)
Sullivan University
Hardware and software architecture required for a social networking site (such as Facebook or
LinkedIn):
There are various option
Data Mining
Chapter 4/ Week 3 Homework
4.1 Breakfast Cereals. Use the data for the breakfast cereal example in Section 4.8 to explore and summarize the
data as follows. (Note that a few records contain missing values; since there are just a few, a simple
CSC550Z: Data Mining & Distributed Computing (Summer 2016)
Week 1 Assignment Solution (100 points)
2.1 Assuming that data mining techniques are to be used in the following cases, identify whether
the task required is supervised or unsupervised learning. (
1. When a data mining model assigns an observation to one class but in fact it belongs
CSC550Z Fall 2015
Data Mining and Distributed Computing
Chapter 3: Data Visualization
Instructor: Dr. Tuan Tran
Galit Shmueli and Peter Bruce 2010
Graphs for Data Exploration
Basic Plots
Line Graphs
Bar Charts
Scatterplots
Distribution Plots
Boxplots
His
Sangeeta Naidu
Assignment 2- Chapter 3
October 9th, 2015
Dr Tuan Tran
Data Mining and Distributed Computing
Q1. Shipments of Household Appliances: Line Graphs. The file
ApplianceShipments.xls contains the series of quarterly shipments
(in million $) of U.
CSC550Z: Data Mining & Distributed Computing (Summer 2016)
Week 2 Assignment Solution (100 points)
3.1 Shipments of household appliances: line graphs. The file ApplianceShipments.xls
contains the series of quarterly shipments (in million $) of US househol
Sangeeta Naidu
Assignment 4- Chapter 5
October 21st, 2015
Dr. Tuan Tran
Data Mining and Distributed Computing
Problems:
Q1. A data mining routine has been applied to a transaction data
set and has classified 88 records as fraudulent (30 correctly so) and
95
Sangeeta Naidu
Assignment 1-Chapter 2
October 6th, 2015
Dr Tuan Tran
Data Mining and Distributed Computing
Q1. Assuming that data mining techniques are to be used in the
following cases identify if the task required is supervised or
unsupervised learning
Chapter 5/ Week 4 Homework
5.1 A data mining routine has been applied to a transaction data set and has classified 88 records as fraudulent (30
correctly so) and 952 as non-fraudulent (920 correctly so). Construct the classification matrix and calculate th
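The counts given in this exercise are enough to fill in a two-by-two classification matrix. The following is a minimal sketch of that arithmetic. The variable names are illustrative, and the overall error rate is shown only as one metric commonly derived from such a matrix, since the exercise text is cut off here.

```python
# Counts stated in the exercise.
predicted_fraud = 88
predicted_fraud_correct = 30
predicted_nonfraud = 952
predicted_nonfraud_correct = 920

# Rows are actual classes, columns are predicted classes.
matrix = {
    "actual_fraud": {
        "pred_fraud": predicted_fraud_correct,
        "pred_nonfraud": predicted_nonfraud - predicted_nonfraud_correct,
    },
    "actual_nonfraud": {
        "pred_fraud": predicted_fraud - predicted_fraud_correct,
        "pred_nonfraud": predicted_nonfraud_correct,
    },
}

total = predicted_fraud + predicted_nonfraud
misclassified = (matrix["actual_fraud"]["pred_nonfraud"]
                 + matrix["actual_nonfraud"]["pred_fraud"])
error_rate = misclassified / total

print(matrix)
print(f"overall error rate: {error_rate:.4f}")
```

The off-diagonal cells follow by subtraction: 88 - 30 = 58 non-fraudulent records flagged as fraudulent, and 952 - 920 = 32 fraudulent records missed.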
1.
a. Create a well formatted time plot of the data using Excel
[Figure: time plot of Shipments by Quarter]
b. Yes.
[Figure: time plot of Shipments by Quarter]
c. The following plot shows that the shipments in Q2
Chapter 2: Do Problems 1-3, 5, 8, 10: Submit to Drop Box 1.1
2.1 Assuming the data mining techniques are to be used in the following cases, identify whether the task
required is supervised or unsupervised learning.
a. Deciding whether to issue a loan to a
Talluri Prasanth
2015JULB02046
PGDM-FINANCE
Assignment-2
Regression Model: Predicting airfares on new routes
Scenario:
Several airports have opened in major cities in the USA, opening the market for new
routes. In order to price flights on these routes a majo
Fin 40230/70230
Business Forecasting
Prof. Barry Keating
K-Nearest Neighbor Exercise #1
Purpose: To learn how to build a K-Nearest Neighbors model for prediction purposes.
We will use the validation data set to determine the optimal number of neighbors in
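The idea of using a validation set to pick the number of neighbors can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the exercise's own procedure or tool: it assumes a numeric target, Euclidean distance, and RMSE as the validation measure, and the `train` and `valid` data are made up.

```python
import math

def knn_predict(train, query, k):
    """Average the targets of the k training points closest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(target for _, target in nearest) / k

def validation_rmse(train, valid, k):
    """Root mean squared error of k-NN predictions on the validation set."""
    squared = [(knn_predict(train, x, k) - y) ** 2 for x, y in valid]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical (features, target) pairs, invented for this sketch.
train = [((1.0,), 1.0), ((2.0,), 2.0), ((3.0,), 3.0), ((4.0,), 4.0), ((5.0,), 5.0)]
valid = [((1.5,), 1.5), ((3.5,), 3.5)]

# Score each candidate k on the validation set and keep the best one.
scores = {k: validation_rmse(train, valid, k) for k in range(1, 5)}
best_k = min(scores, key=scores.get)

print(scores)
print("best k:", best_k)
```

Real predictors usually need rescaling to comparable units before distances are computed; the toy data above has a single feature, so that step is skipped.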
CSC550 Project Proposal
Predicting Airfare: Southwest Airlines
Team Members: Bhushan Barakale,
Mohammad Bhuiyan, Sri Charan
Annamraju, Richard Leister
CSC550Z: Data Mining
Sullivan University
February 12, 2017
Instructor: Dr. Tuan Tran
Airline Industry
Project Budget
A. General Information
Project Team: Aluri, Chamoli, Dehghan Khalili, Ghali
Project Name: Implementation of Reporting Solutions
Project Sponsor/department: Director of Information Technology, ABC Industries
Date: 11/13/2016
The project budg
Date | Configurati | Customer Post | Store Post | Retail Pric | Screen Size | attery Lif | RAM (GB) | Processor | Integrated | HD Size (GB) | Bundled Ap | X Cust OS | Y Cust OS | X Store OS | Y Store OS | CustomerS
2008/01/01 00:01:19 | 163 | EC4V 5BH | SE1 2BN | 455 | 15 | 5 | 1 | 2 | Yes | 80 | Yes | 532041 | 18
CSC550Z Fall 2015
Data Mining and Distributed Computing
Chapter 1-2
Overview of Data Mining
Instructor: Dr. Tuan Tran
Why Mine Data?
Massive data is collected and warehoused
Web data, e-commerce
Purchase at grocery stores
Bank/credit card transactions
Course Number:
CSC550x
Course Title:
Data Mining
Prerequisites:
CSC499 and Knowledge of MS Excel. Knowledge of
math and statistics is helpful.
Department/School:
School of Business
Description:
This course introduces the basic ideas and techniques
of dat
Running head: DECISION SUPPORT SYSTEM AND BUSINESS INTELLIGENCE
Introduction
The introduction of big data in the world of computing has revolutionized sales and
marketing in the business world. Millions of data points help businesses to tackle and custom
1. Name: Name of cereal
2. mfr: Manufacturer of cereal where A = American Home Food Products; G = General Mills; K = Kelloggs; N = Nabisco
3. type: cold or hot
4. calories: calories per serving
5. protein: grams of protein
6. fat: grams of fat
7. sodium:
1.
a.
i. Quantitative variables are those that can be measured. In this
example, we can measure calories, protein, fat, sodium, fiber, carbo, sugars,
potassium and vitamins
ii. Nominal variables are those variables that are placed in many catego