Part 3: REGRESSION

REGRESSION / CORRELATION

Object: To measure the degree of association between variables and/or to predict the value of one variable from knowledge of the values of one or more other variables.

Relationships:
(1) Functional
(2) Statistical

Functional Relationship: \(Y = f(X)\), an exact relationship with no error.
e.g., \(Y = 25 + 0.10X\) (variables: $ spent on books during the year; $ savings from joining a book club)

Statistical Relationship (true only on the average):
Linear: Y = production, X = labor hours
Nonlinear: Y = physical ability, X = age

Consider the following data, which represent the sales of a product (adjusted for trend) over the last 8 sales periods:

Y = sales (millions): 116, 109, 117, 112, 122, 113, 108, 115

\(\bar{Y} = 114\)

What would (should) one predict for the next sales period? Probably, one would be hard pressed, in this case, to justify choosing other than \(\bar{Y} = 114\). How good will this prediction be?

WE DON'T KNOW!

But we can get an idea by looking at how well we would have done, had we been using this 114 all along:

  Y     Ybar    Y - Ybar    (Y - Ybar)^2
 116    114        2             4
 109    114       -5            25
 117    114        3             9
 112    114       -2             4
 122    114        8            64
 113    114       -1             1
 108    114       -6            36
 115    114        1             1
                        Total:  144

TSS = Total Sum of Squares. So

\[ \mathrm{TSS} = \sum_{j=1}^{n} (Y_j - \bar{Y})^2 = 144 \]

Two ways to look at the TSS:
1) A measure of the misprediction (prediction error) using \(\bar{Y}\) as predictor.
2) A measure of the Total Variability in the System (the amount by which the 8 data values aren't all the same).

Consider using X, advertising, to help predict Y:

  Y     X
 116    2
 109    1
 117    3
 112    1
 122    4
 113    2
 108    1
 115    2

\(\bar{Y} = 114\), \(\bar{X} = 2\)

[Scatter Diagram: Y (axis from 105 to 125) plotted against X (axis from 1 to 4)]

Consider a Linear or Straight Line Statistical relationship between the two variables, and then consider finding the best fitting line to the data. Call this line:

\[ Y_c = a + bX \]

\(Y_c\) = Computed Y or Predicted Y
Y is called the Dependent Variable
X is called the Independent Variable

What do we mean by "best fitting"?
Answer: The Least Squares line, i.e., the line which minimizes the sum of the squares of the distances between the dots, \(Y\), and the line, \(Y_c\). Hence, the MATH problem is to minimize

\[ \sum_{j=1}^{n} (Y_j - Y_{cj})^2 \]

[Figure: at \(X_1\), the observed value is \(Y_1 = 7\) and the value on the line is \(Y_{c1} = 5\)]

To find this Least Squares line, we theoretically need calculus. However, as a practical matter, every text gives the answer, and, more importantly, we will get the result using Excel, or SPSS, or other software, NOT BY HAND. (There is an arithmetic formula for b and a in terms of the sum of the X's, the sum of the Y's, the sum of the XY's, etc., but with software available, we never use it.)

[Scatter Diagram with the least squares line \(Y_c = 106 + 4X\); Y axis from 105 to 125, X axis from 1 to 4]

So, using X in the best way, we have a prediction line of \(Y_c = 106 + 4X\). How good are the predictions we'll get using this line? Suppose we had been using it:

Table columns: \(Y\), \(X\), \(Y_c\), \(Y - Y_c\), \((Y - Y_c)^2\) (summing to SSE), \(Y - \bar{Y}\), \((Y - \bar{Y})^2\) (summing to TSS). The first \(Y_c\) entry is computed as 106 + 4(2); the \((Y - \bar{Y})^2\) column begins 4, 25...
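The slides leave the fitting to Excel or SPSS. As a rough illustration only, the same numbers can be checked in a few lines of Python. This is a minimal sketch, not the course's method: the function names (mean, total_sum_of_squares, least_squares_line, sum_of_squared_errors) are made up for this example, and the slope and intercept use the standard textbook formulas b = S_xy / S_xx and a = Ybar - b * Xbar, which the slides mention but do not write out. The data are the 8 sales and advertising values given above.

```python
# Sales (Y, millions) and advertising (X) for the 8 periods in the slides.
sales = [116, 109, 117, 112, 122, 113, 108, 115]
advertising = [2, 1, 3, 1, 4, 2, 1, 2]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def total_sum_of_squares(y):
    """TSS: squared deviations of Y from its own mean, summed."""
    y_bar = mean(y)
    return sum((yi - y_bar) ** 2 for yi in y)


def least_squares_line(x, y):
    """Return (a, b) for Yc = a + b*X using the standard formulas
    b = S_xy / S_xx and a = Ybar - b * Xbar (assumed here, not from the slides)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sum_of_squared_errors(x, y, a, b):
    """SSE: squared deviations of Y from the fitted line a + b*X, summed."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


y_bar = mean(sales)
tss = total_sum_of_squares(sales)
a, b = least_squares_line(advertising, sales)
sse = sum_of_squared_errors(advertising, sales, a, b)

print("Ybar =", y_bar)
print("TSS  =", tss)
print("Line: Yc =", a, "+", b, "* X")
print("SSE  =", sse)
```

Run as written, this reproduces the mean of 114, the TSS of 144, and the line Y_c = 106 + 4X from the slides. The SSE it prints is the misprediction that remains when the line, rather than the mean, is used as the predictor.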
 Spring '11
 KEATING
