Part 3: REGRESSION

REGRESSION / CORRELATION

Object: To measure the degree of association between variables and/or to predict the value of one variable from knowledge of the values of one or more other variables.

Relationships:
(1) Functional
(2) Statistical

Functional Relationship: Y = f(X), an exact relationship with no “error”.
e.g., Y = 25 + 0.10X
[Figure: plot of this line; axis labels “$ spent on books during the year” and “$ savings (joining a book club)”.]

Statistical Relationship: (true only “on the average”)
[Figures: Y = PRODUCTION against X = LABOR HOURS (Linear); Y = PHYSICAL ABILITY against X = AGE (Nonlinear).]

Consider the following data, which represent the sales of a product (adjusted for trend) over the last 8 sales periods:

Y = sales (millions): 116, 109, 117, 112, 122, 113, 108, 115
$\bar{Y} = 114$

What would (should) one predict for the next sales period? Probably, one would be hard pressed, in this case, to justify choosing anything other than Y = 114. How good will this prediction be?

WE DON’T KNOW!

But we can get an idea by looking at how well we would have done, had we been using this 114 all along:

  Y     Ybar   Y - Ybar   (Y - Ybar)^2
  116   114       2            4
  109   114      -5           25
  117   114       3            9
  112   114      -2            4
  122   114       8           64
  113   114      -1            1
  108   114      -6           36
  115   114       1            1
                      Total: 144

TSS = Total Sum of Squares. So

$TSS = \sum_{j=1}^{n} (Y_j - \bar{Y})^2 = 144$

Two ways to look at the “TSS”:
1) A measure of the “misprediction” (prediction error) using $\bar{Y}$ as predictor.
2) A measure of the “Total Variability in the System” (the amount by which the 8 data values aren’t all the same).

Consider using X, advertising, to “help” predict Y:

  Y     X
  116   2
  109   1
  117   3
  112   1
  122   4
  113   2
  108   1
  115   2

$\bar{Y} = 114$, $\bar{X} = 2$

[Figure: Scatter Diagram of Y against X.]

Consider a Linear or Straight Line Statistical relationship between the two variables, and then consider finding the “best fitting line” to the data. Call this line:

$Y_c = a + bX$

$Y_c$ = “Computed Y” or “Predicted Y”
Y is called the Dependent Variable
X is called the Independent Variable

What do we mean by “best fitting”?
Answer: The “Least Squares” line, i.e., the line which minimizes the sum of the squares of the vertical distances between the “dots”, Y, and the “line”, $Y_c$. Hence, the MATH problem is to minimize

$\sum_{j=1}^{n} (Y_j - Y_{cj})^2$

[Figure: at $X_1$, the observed value $Y_1 = 7$ and the value on the line $Y_{c1} = 5$.]

To find this Least Squares line, we theoretically need calculus. However, as a practical matter, every text gives the answer, and, more importantly, we will get the result using Excel, or SPSS, or other software, NOT “BY HAND.” (There is an arithmetic formula for “b” and “a” in terms of the sum of the X’s, the sum of the Y’s, the sum of the X•Y’s, etc., but with software available, we never use it.)

[Figure: scatter diagram with the least squares line $Y_c = 106 + 4X$.]

So, using X in the best way, we have a prediction line of $Y_c = 106 + 4X$. How good are the predictions we’ll get using this line? Suppose we had been using it:

[Table, truncated in this preview: columns Y, X, $Y_c$, $Y - Y_c$, $(Y - Y_c)^2$ (labeled SSE), alongside $Y - \bar{Y}$ and $(Y - \bar{Y})^2$ (labeled TSS); the first $Y_c$ is computed as 106 + 4(2). Surviving entries: 4 25...]
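The arithmetic on these slides can be checked with a few lines of code instead of by hand. The following is a minimal sketch in plain Python, using the eight (X, Y) pairs from the scatter-diagram slide. The function names are illustrative, and the closed-form expressions for b and a (sum of cross-deviations over sum of squared X deviations, then the mean of Y minus b times the mean of X) are the standard least squares formulas, assumed here because the slides mention a formula without stating it.

```python
# Sales (millions) and advertising for the 8 sales periods, as listed on the slides.
Y = [116, 109, 117, 112, 122, 113, 108, 115]
X = [2, 1, 3, 1, 4, 2, 1, 2]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def total_sum_of_squares(y):
    """TSS: squared prediction errors when the mean of Y is the predictor."""
    y_bar = mean(y)
    return sum((yj - y_bar) ** 2 for yj in y)


def least_squares_line(x, y):
    """Return (a, b) for the line Yc = a + bX that minimizes sum (Y - Yc)^2.

    Uses the standard closed-form solution (an assumption; not shown on the slides).
    """
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y))
    s_xx = sum((xj - x_bar) ** 2 for xj in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sum_of_squared_errors(x, y, a, b):
    """SSE: squared prediction errors when Yc = a + bX is the predictor."""
    return sum((yj - (a + b * xj)) ** 2 for xj, yj in zip(x, y))


a, b = least_squares_line(X, Y)
tss = total_sum_of_squares(Y)
sse = sum_of_squared_errors(X, Y, a, b)

print(f"mean of Y = {mean(Y)}, TSS = {tss}")
print(f"Yc = {a} + {b}X")
print(f"SSE = {sse}")
```

With these data the script reproduces the mean of 114, the TSS of 144, and the line 106 + 4X quoted on the slides. The SSE it prints measures the misprediction that remains once X is used, for comparison against the TSS.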
This note was uploaded on 03/02/2011 for the course PPF 501 taught by Professor Keating during the Spring '11 term at Bentley.