EMBA 7100
Analytical Tools for
Decision Making
Regression Basics
Simple Linear Regression Model
• The equation that describes how y is related to x and
an error term is called the regression model.
• The simple linear regression model is:
y = β0 + β1x + ε
whe
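As a minimal sketch of how such a model can be fitted, the snippet below estimates the intercept and slope of y = b0 + b1*x + error by ordinary least squares. The function name `fit_simple_linear` and the sample data are hypothetical, chosen only for illustration; they do not come from the course material.

```python
def fit_simple_linear(x, y):
    """Return least-squares estimates (b0, b1) for y = b0 + b1*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and cross-products of x and y deviations.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_linear(x, y)
# Residuals are the estimated error terms, one per observation.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding), which is a quick sanity check on the fit.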
STATISTICS
RECAP
BY: BREANNA
WHEELER
CALCULATING THE Z-SCORE
STEP 1: FIND THE MEAN OF THE SET OF DATA.
STEP 2: SUBTRACT THE MEAN (N) FROM EACH
NUMBER IN THE SET OF DATA.
STEP 3: SQUARE ALL THE NUMBERS THAT YOU
JUST SUBTRACTED THE MEAN FROM.
STEP 4: A
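The steps listed so far can be sketched in code as follows. Steps 1 to 3 follow the text above. The list is cut off at step 4, so the remaining steps shown here (averaging the squared differences, taking the square root, and dividing each difference by the result) are the standard z-score calculation and are an assumption, not a quotation of the original. Dividing by the number of values (population standard deviation) is also an assumption. The function name `z_scores` and the data are hypothetical.

```python
import math


def z_scores(data):
    """Return the z-score of every value in data."""
    # Step 1: find the mean of the set of data.
    mean = sum(data) / len(data)
    # Step 2: subtract the mean from each number.
    deviations = [value - mean for value in data]
    # Step 3: square each of those differences.
    squared = [d ** 2 for d in deviations]
    # Assumed remaining steps: average the squares (population variance),
    # take the square root, then divide each difference by the result.
    std_dev = math.sqrt(sum(squared) / len(data))
    return [d / std_dev for d in deviations]


# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(data)
print(scores)
```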
What is multicollinearity?
Multicollinearity in regression is a condition that occurs when some predictor
variables in the model are correlated with other predictor variables. Severe
multicollinearity is problematic because it can increase the variance of
What is a variance inflation
factor (VIF)?
Variance inflation factors (VIF) measure how much the variances of the estimated
regression coefficients are inflated compared to when the predictor variables
are not linearly related.
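As a rough illustration, the VIF for predictor j is commonly computed as 1 / (1 - Rj²), where Rj² is the R-squared from regressing predictor j on the other predictors. The sketch below covers only the two-predictor case, where Rj² equals the squared correlation between the two predictors. The function names and the data are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif(r_squared):
    """VIF from the R-squared of one predictor regressed on the others."""
    return 1.0 / (1.0 - r_squared)


# Hypothetical predictors, for illustration only.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

r = pearson_r(x1, x2)
# With exactly two predictors, both share the same VIF.
vif_two_predictors = vif(r ** 2)
print(f"r = {r:.2f}, VIF = {vif_two_predictors:.2f}")
```

A VIF of 1 means the predictor is uncorrelated with the others; the value grows without bound as Rj² approaches 1.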
Use to describe how much
Lack-of-fit and lack-of-fit tests
What is lack-of-fit?
A regression model exhibits lack-of-fit when it fails to adequately describe the
functional relationship between the experimental factors and the response
variable. Lack-of-fit can occur if important
How to Choose the Best Regression
Model
Adjusted R-squared and Predicted R-squared: Generally, you choose the models
that have higher adjusted and predicted R-squared values. These statistics are
designed to avoid a key problem with regular R-squared: it in
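To make the comparison concrete, the sketch below uses the usual formula for adjusted R-squared, 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The function name and the R-squared values are hypothetical and chosen only to show the penalty for an extra predictor.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared for a model with an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical models fitted to the same 20 observations.
# Model B adds one predictor and gains only a little R-squared.
adj_a = adjusted_r_squared(0.800, n_obs=20, n_predictors=3)
adj_b = adjusted_r_squared(0.805, n_obs=20, n_predictors=4)

print(f"Model A adjusted R-squared: {adj_a:.4f}")
print(f"Model B adjusted R-squared: {adj_b:.4f}")
```

Here model B has the higher regular R-squared but the lower adjusted R-squared, so the adjusted value favors the smaller model.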
The Anderson-Darling statistic
What is the Anderson-Darling statistic?
Anderson Darling Test
First, a high-level overview of what the An
How Do I Interpret the P-Values in Linear
Regression Analysis?
The p-value for each term tests the null hypothesis that the coefficient is equal to
zero (no effect). A low p-value (< 0.05) indicates that you can reject the null
hypothesis. In other words, a
Key Limitations of R-squared
R-squared cannot determine whether the coefficient estimates and predictions
are biased, which is why you must assess the residual plots.
R-squared does not indicate whether a regression model is adequate. You can
have a low R
Coefficient of Variation (CV)
The coefficient of variation is reported as a percentage and calculated from the average and
standard deviation as follows:
CV = 100 * (Standard Deviation / Average)
For example, a CV of 3% means the standard deviation is equal to 3%
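A short sketch of the calculation, using Python's standard `statistics` module. The text does not say whether the sample or population standard deviation is intended; the sample version (`statistics.stdev`) is used here as an assumption. The function name and the measurements are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV as a percentage: 100 * standard deviation / average."""
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical measurements, for illustration only.
measurements = [98, 100, 102]
cv = coefficient_of_variation(measurements)
print(f"CV = {cv:.1f}%")
```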
The prediction sum of squares (PRESS), similar to the sum of squares of the
residual error (SSE), is the sum of squares of the prediction error. PRESS differs
from the sum of squares of the residual error in that each fitted value, ŷi, for
PRESS is o
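As an illustration, the sketch below computes PRESS alongside SSE for a simple linear regression. It relies on the standard leave-one-out definition of PRESS, in which each prediction error comes from a model fitted without that observation, and on the known shortcut that this error equals the ordinary residual divided by (1 - leverage). The function name and the data are hypothetical.

```python
def press_and_sse(x, y):
    """Return (PRESS, SSE) for the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    press = 0.0
    sse = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (b0 + b1 * xi)
        # Leverage of this observation in simple linear regression.
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        sse += residual ** 2
        # Leave-one-out prediction error via the leverage shortcut.
        press += (residual / (1.0 - leverage)) ** 2
    return press, sse


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

press, sse = press_and_sse(x, y)
print(f"PRESS = {press:.4f}, SSE = {sse:.4f}")
```

PRESS is never smaller than SSE, because each leave-one-out error is the ordinary residual scaled up by 1 / (1 - leverage).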
ITEM
ABV
Volume(ml)
Price per unit
ml of Alcohol = (ABV/100) * (volume)
Cheapest price/ml of Alcohol
Proposed Minimum Price/ml of Alcohol
Difference Between Cheapest Price and Proposed Minimum Price
as % of proposed price observed of cheapest price (Proposed