Assignment #1
Richard Forkin
Introduction:
For this report, I will be analyzing a dataset in preparation for building predictive models. This is always a
critical step in building predictive models because the analyst needs to understand the data they are

Principal Components Analysis with SAS
In this document we will outline the SAS procedures for performing principal components
analysis using the SAS procedure PROC PRINCOMP. In addition to the standard SAS
arguments, we will focus on the SAS options need

PREDICT 410: Predictive Modeling I Final Exam Study Guide
The final exam for PREDICT 410 will be administered in two parts: (1) a proctored exam
administered through Canvas and monitored by ProctorU, and (2) an unproctored take-home
exam administered thro

PREDICT 410: Predictive Modeling I Syllabus
Winter 2016
INSTRUCTOR: William T. Mickelson, Ph.D.
[email protected]
TEACHING ASSISTANT: Laurence Schneider
[email protected]
Course Description
This course develops the found

Study Questions For Predict 410
Topic: Multivariate Statistical Techniques
Our learning format requires that you complete the assigned readings
efficiently and intelligently. In order to help you focus your
attention on important concepts in the course re

Handout: Best Practice of Modeling Process in a Business Environment
PREDICT 410: Predictive Modeling I
Phase 1: Project Scoping
Understand the business objective.
What will the model be used for? How will the model be applied?
What data are available for

Assignment #8
Richard Forkin
Introduction:
In this report, we will start with a simple exploratory data analysis and then work on clustering of data.
We will group on a variable that has three options plus an Other bucket. We will
decide

Assignment #5
Richard Forkin
Introduction:
For the report below, we will be developing a model to predict SalePrice using categorical values. We
will dummy code these values and work through refitting the model to improve the model's
predictabili

Assignment #7
Richard Forkin
Introduction:
For this report, we will be working with the stock portfolio data set. We will be eliminating some of the
companies so that we are only left with Banking, Oil Field Services, Oil Refining, and Industrial
Chemica

Handout: Introduction to Multi-Dimensional Segmentation and Segmented Modeling
PREDICT 410: Predictive Modeling I
Multi-dimensional approach organizes answers to key questions across all marketing channels.
Who are my customer segments?
What do they look

Study Questions For Predict 410
Topic: Ordinary Least Squares Regression
Our learning format requires that you complete the assigned readings
efficiently and intelligently. In order to help you focus your
attention on important concepts in the course read

Study Questions For Predict 410
Topic: Logistic Regression
Our learning format requires that you complete the assigned readings
efficiently and intelligently. In order to help you focus your
attention on important concepts in the course reading, we have
c

ON NESTED MODELS
CONTEXT
Consider a modeling situation where you are trying to predict Math Achievement from a standardized
test (Y) based on a number of predictor variables, X's. If you think about the nature of the predictor
variables in a dataset, they

Handout: Introduction to Segmentation Analysis
PREDICT 410: Predictive Modeling I
Why is segmentation necessary?
To enable customer centricity
To understand differences in customer sub populations
To understand regional or value differences
To fine tune a

Cluster Analysis
1. Introduction
To this point we have been primarily concerned with the relationships that
exist among variables. The objects upon which the variables were measured
were assumed to be homogeneous in nature; that is, there was no reason to

For: Market Insights Professionals
The Facebook Factor
by Gina Sverdlov, April 9, 2012
Key Takeaways
Logistic Regression Modeling Assesses The Likelihood Of Other Events
Market insights professionals can use logistic regression modeling to quantify
the i

Assignment #3
Richard Forkin
Introduction:
For the report below, we will be examining transformed data and how we can use it in our models. From
there we will also get into outliers and see how the model performs when we take some of the outliers
out of t

Assignment #6
Richard Forkin
Introduction:
For the report below we will work through Principal Component Analysis as a method of dimension
reduction. The expectation is that we will create a model based on a larger number of variables and
create a better

Final
Andrew G. Dunn
[email protected]
Andrew G. Dunn, Northwestern University Predictive Analytics Program
Prepared for PREDICT-410: Regression & Multivariate Analysis.
Formatted using markdown, pandoc, and LaTeX. References managed usi

Analysis of Variance and Related Topics
for Ordinary Least Squares Regression
1. The ANOVA Table for OLS Regression
The Analysis of Variance or ANOVA Table is a fundamental output from a
fitted OLS regression model. The output from the ANOVA table is used fo

Statistical Assumptions for Ordinary Least Squares Regression
1. Introduction
In Ordinary Least Squares (OLS) regression we wish to model a continuous random variable $Y$ (the response variable) given a set of predictor
variables $X_1, X_2, \ldots, X_k$.
Whil

Statistical Inference Versus Predictive Modeling
in Ordinary Least Squares Regression
1. Introduction
There are two reasons to build statistical models: (1) for inference, and
(2) for prediction.
Statistical inference is focused on a set of formal hypoth

Estimation and Inference for Ordinary Least Squares Regression
1. Estimation - Simple Linear Regression
A simple linear regression is the special case of an OLS regression model
with a single predictor variable.
$Y = \beta_0 + \beta_1 X + \epsilon$ (1)
For the ith observation

Factor Analysis with SAS
In this document we will outline the SAS procedures for performing the most common types of
factor analysis using the SAS procedure PROC FACTOR. By doing so, we will focus on calling
the correct SAS options to ensure that we are f

Introduction to Principal Components Analysis
1. What is Principal Components Analysis?
Statistical Interpretation - PCA is a transformation of a set of correlated random variables to a set of uncorrelated (or orthogonal) random
variables.
Linear Algebra

Summary of the linear regression component of Predict 410:
(1) Experienced the formulation of a typical data modeling project.
Every data modeling project starts with some data and a rough
problem, if you are lucky. Some data modeling projects start with

Assignment #3: Regression Model Building Part 2
PREDICT 410
Data: The data for this assignment is the Ames, Iowa housing data set. This data will be made available
by your instructor.
Assignment Tasks
In this assignment we continue building regression mod

Assignment #5: Regression Model Building Part 3
Dummy Coding - Automated Variable Selection Methods - Validation
PREDICT 410
Data: The data for this assignment is the Ames, Iowa housing data set. This data will be made available
by your instructor.
Assignme

Assignment #7: Factor Analysis (20 points)
Data:
The data for this assignment is the stock portfolio data set. This data will be made available by
your instructor.
Currently there is no up-to-date SAS reference for multivariate analysis and the new SAS SG

Assignment #2
Richard Forkin
Introduction:
For the report below we will be investigating data variables to determine their predictability of SalePrice
in the Ames Housing dataset. We will start with linear regression on continuous variables and then
revie

Inference Statistical Assumptions When validating the in-sample fit of a linear regression model, what
are two assumptions that must be validated using the model residuals, and how does one validate each
of these assumptions? Response: By in-sample I assume

Assignment #4
Richard Forkin
Introduction:
For the report below, we work through reading the model output from SAS. We will compute the
numbers that are output from SAS to show that we understand how the numbers are calculated.
Results:
How many observati