1 Six Sigma Lecture 7 Regression Analysis and Design of Experiments

2 Simple Linear Regression
- The simplest form of a regression model is the straight-line equation y = β0 + β1x. The response (y) is a function of a single factor (x), hence the term simple linear regression.
- A response modeled as a function of several x's is called multiple linear regression.
3 Simple Linear Regression Model
y = β0 + β1x + Error
where:
- y is the response variable.
- x is the single input factor or predictor variable in the model.
- β0 and β1 are the model coefficients to be estimated from the data.
- β0 is the intercept.
- β1 is the slope.
- Error is experimental error.
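To make the role of each term in the model concrete, here is a minimal Python sketch that generates responses from y = β0 + β1x + Error. The function name `simulate_responses`, the coefficient values, and the choice of normally distributed error with mean zero are illustrative assumptions, not part of the lecture.

```python
import random

def simulate_responses(xs, beta0, beta1, error_sd, seed=0):
    """Generate y = beta0 + beta1 * x + error for each x.

    Assumption for illustration: the error is drawn from a normal
    distribution with mean 0 and standard deviation error_sd.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, error_sd) for x in xs]

# Hypothetical settings: intercept 2.0, slope 0.5.
xs = [1, 2, 3, 4, 5]
noisy = simulate_responses(xs, beta0=2.0, beta1=0.5, error_sd=0.3)
noiseless = simulate_responses(xs, beta0=2.0, beta1=0.5, error_sd=0.0)

for x, y_line, y_obs in zip(xs, noiseless, noisy):
    print(f"x={x}  line={y_line:.2f}  observed={y_obs:.2f}")
```

With `error_sd=0.0` every point falls exactly on the line; the error term is what scatters the observed responses around it.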

4 Simple Linear Regression
This table has a single x factor and a corresponding y value.
5 Simple Linear Regression
We can plot this data on a scatterplot.

6 Simple Linear Regression
Vertical distance from each point to the line.
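7 Method of Least Squares
The difference yobs1 - y1 is called the residual at x = x1, where (x1, yobs1) is the observed point and (x1, y1) is the corresponding point on the fitted line y = β0 + β1x.
[Figure: Residuals. An observed point, the fitted line, and the vertical distance between them.]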
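The least-squares idea can be sketched in a few lines of Python: choose the intercept and slope that minimize the sum of squared residuals, using the standard closed-form solution. The function names and the data values below are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx           # slope estimate
    b0 = y_bar - b1 * x_bar  # intercept estimate
    return b0, b1

def residuals(xs, ys, b0, b1):
    """Observed y minus fitted y at each x (the vertical distances)."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Hypothetical data, not taken from the lecture's table.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
res = residuals(xs, ys, b0, b1)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
print("residuals:", [round(r, 3) for r in res])
```

A useful check on any least-squares fit with an intercept is that the residuals sum to zero, up to rounding.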

8 Simple Linear Regression
Analysis and Interpretation
9 Analysis and Interpretation
Hypothesis testing: if β1 is zero, the term β1x drops out of the equation and the response y is independent of x.
Null hypothesis H0: β1 = 0 (no linear correlation)
Alternative hypothesis H1: β1 ≠ 0 (linear correlation)

10 Parameter Estimates
y = β0 + β1x
11 Parameter Estimates
Hypothesis testing:
β0 = 0 vs. β0 ≠ 0
β1 = 0 vs. β1 ≠ 0
P: the observed significance probability (or p-value) calculated for each coefficient value. It is the probability of getting, by chance alone, a coefficient value greater (in absolute value) than the computed value, given that the null hypothesis is true.
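For reference, the p-value for the slope is usually obtained from a t statistic. The sketch below uses the standard textbook formulas and assumes independent, normally distributed errors with constant variance; the hat notation for the estimates and the symbols S_xx, S_xy, and SSE are introduced here and do not appear on the slides.

```latex
\[
\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
\]
\[
s^2 = \frac{SSE}{n-2}, \qquad
SSE = \sum_{i=1}^{n} \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr)^2
\]
\[
t = \frac{\hat{\beta}_1}{s / \sqrt{S_{xx}}}, \qquad
p = 2\, P\bigl(T_{n-2} \ge |t|\bigr)
\]
```

Here T_{n-2} is a t-distributed random variable with n - 2 degrees of freedom. A small p leads to rejecting H0: β1 = 0.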

12 Cautions on Using Regression
All four sets of data have similar regression equations and regression lines, but the data are very different.
[Figure: four regression plots]
- Y1 vs. X1: Y = 3.26198 + 0.453693X, R-Sq = 62.9%
- Y2 vs. X2: Y = 3.09365 + 0.454973X, R-Sq = 57.7%
- Y3 vs. X3: Y = 3.01506 + 0.505082X, R-Sq = 67.1%
- Y4 vs. X4: Y = 3.91545 + 0.451818X, R-Sq = 47.5%
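The same caution can be reproduced on a small scale. The Python sketch below uses two hypothetical data sets (constructed for this illustration, not the four sets plotted on the slide) that give the same fitted line and the same R-Sq even though one follows a curve and the other zigzags around the line. Only the residuals reveal the difference.

```python
def fit_and_r_squared(xs, ys):
    """Least-squares fit; returns (b0, b1, r_squared, residuals)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    res = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(r ** 2 for r in res)           # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)  # total variation
    return b0, b1, 1.0 - sse / sst, res

# Hypothetical data sets sharing the same x values.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
curved = [2.5, 2.25, 2.5, 3.25, 4.5]   # residuals follow a U shape
zigzag = [2.25, 2.0, 3.5, 3.0, 4.25]   # residuals alternate in sign

for name, ys in (("curved", curved), ("zigzag", zigzag)):
    b0, b1, r2, res = fit_and_r_squared(xs, ys)
    print(f"{name}: Y = {b0:.3f} + {b1:.3f}X, R-Sq = {100 * r2:.1f}%")
    print("  residuals:", res)
```

Plotting the data and the residuals before trusting the equation or R-Sq is the practical takeaway.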
13 Goodness of Fit Tests
- A low p-value indicates that the predicted probabilities deviate from the observed probabilities in a way that tells us the fit is not good.
- In MINITAB there are three goodness-of-fit tests: Pearson, Deviance, and Hosmer-Lemeshow.

