ch08 - STAT 2023 - Holbrook Linear Regression Chapter 8 8-1...

STAT 2023 - Holbrook
Linear Regression
Chapter 8

Models
1. Representation of Some Phenomenon
2. Mathematical Model Is a Mathematical Expression of Some Phenomenon
3. Often Describe Relationships between Variables
4. Types
   Deterministic Models
   Probabilistic Models

Deterministic Models
1. Hypothesize Exact Relationships
2. Suitable When Prediction Error Is Negligible
3. Example: Force Is Exactly Mass Times Acceleration
   \( F = m \cdot a \)

Probabilistic Models
1. Two Components
   Deterministic
   Random Error
2. Example: Sales Volume Is 10 Times Advertising Spending + Random Error
   \( Y = 10X + \varepsilon \)
   Random Error May Be Due to Factors Other Than Advertising

Regression Models

Types of Regression Models
Regression Models
   Simple (1 Explanatory Variable)
      Linear
      NonLinear
   Multiple (2+ Explanatory Variables)
      Linear
      NonLinear
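The probabilistic example \( Y = 10X + \varepsilon \) from the Probabilistic Models slide can be sketched in a few lines of Python. This is only an illustration: the function names, the normal distribution for the error, and its spread (`error_sd`) are assumptions, since the slides do not say how the random error is distributed.

```python
import random

def deterministic_sales(advertising):
    """Deterministic component of the slide's example: 10 times advertising."""
    return 10 * advertising

def probabilistic_sales(advertising, rng, error_sd=2.0):
    """Deterministic component plus a random error term.

    error_sd and the normal distribution are assumptions made for this
    sketch; the slides only say that a random error is added.
    """
    error = rng.gauss(0.0, error_sd)
    return deterministic_sales(advertising) + error, error

rng = random.Random(0)  # fixed seed so the run is repeatable
for x in [1, 2, 3]:
    y, eps = probabilistic_sales(x, rng)
    print(f"X={x}: deterministic={deterministic_sales(x)}, error={eps:+.2f}, Y={y:.2f}")
```

Each printed Y differs from 10X by exactly the error drawn for that observation, which is the role the random error plays in the slide's model.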
Model Specification

Model Specification Is Based on Theory
1. Theory of Field (e.g., Sociology)
2. Mathematical Theory
3. Previous Research
4. ‘Common Sense’

Thinking Challenge: Which Is More Logical?
[Figure: four candidate plots of Sales versus Advertising]

Linear Regression Model

Linear Equations
\[ Y = mX + b \]
m = Slope = (Change in Y) / (Change in X)
b = Y-intercept
[Figure: a straight line plotted on X and Y axes]

Linear Regression Model
1. Relationship Between Variables Is a Linear Function
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
   \( Y_i \): Dependent (Response) Variable
   \( X_i \): Independent (Predictor) Variable
   \( \beta_0 \): Population Y-Intercept
   \( \beta_1 \): Population Slope
   \( \varepsilon_i \): Random Error

Give your hands a rest!
Population & Sample Regression Models
Population (Unknown Relationship):
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
Random Sample:
\[ Y_i = b_0 + b_1 X_i + \hat{\varepsilon}_i \]

Population Linear Regression Model
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
\[ E(Y) = \beta_0 + \beta_1 X_i \]
\( \varepsilon_i \) = Random error
[Figure: the population line E(Y) on X and Y axes, with an observed value and its random error marked]

Sample Linear Regression Model
\[ Y_i = b_0 + b_1 X_i + \hat{\varepsilon}_i \]
\[ \hat{y} = b_0 + b_1 X \]
\( \hat{\varepsilon}_i \) = Random error
[Figure: the sample line on X and Y axes, with an observed value and an unsampled observation marked]

Estimating Parameters: Least Squares Method

Scatterplot
1. Plot of All \( (X_i, Y_i) \) Pairs
2. Suggests How Well Model Will Fit
[Figure: scatterplot of Y versus X, both axes running from 0 to 60]

Thinking Challenge
How would you draw a line through the points? How do you determine which line ‘fits best’?
Least Squares
1. ‘Best Fit’ Means the Differences Between Actual Y Values & Predicted Y Values Are a Minimum
\[ \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2 \]
2. Least Squares (LS) Minimizes the Sum of These Squared Differences (Also Called the Sum of Squared Errors (SSE) or Sum of Squared Residuals (SSR))

Least Squares Graphically
LS minimizes, for the four points shown,
\[ \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \hat{\varepsilon}_3^2 + \hat{\varepsilon}_4^2 \]
\[ Y_2 = b_0 + b_1 X_2 + \hat{\varepsilon}_2 \]
\[ \hat{y} = b_0 + b_1 X \]
[Figure: four points around the fitted line, with the residuals \( \hat{\varepsilon}_1 \) through \( \hat{\varepsilon}_4 \) marked]

Least Squares Coefficient Equations
(pick those pencils back up!)

Coefficient Equations
Prediction Equation
\[ \hat{y} = b_0 + b_1 X \]
Sample Slope
\[ b_1 = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} \]
Sample Y-intercept
\[ b_0 = \bar{Y} - b_1 \bar{X} \]

Coefficient Equations (alternate slope formula)
Prediction Equation
\[ \hat{y} = b_0 + b_1 X \]
Sample Slope
\[ b_1 = r \, \frac{s_y}{s_x} \]
Sample Y-intercept
\[ b_0 = \bar{Y} - b_1 \bar{X} \]

Computation Table
   Xi      Yi      Xi^2      Yi^2      XiYi
   X1      Y1      X1^2      Y1^2      X1Y1
   X2      Y2      X2^2      Y2^2      X2Y2
   :       :       :         :         :
   Xn      Yn      Xn^2      Yn^2      XnYn
   ΣXi     ΣYi     ΣXi^2     ΣYi^2     ΣXiYi
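The coefficient equations and the computation table translate almost directly into code. The sketch below is one way to write them in Python; the function name `least_squares_fit` and the four data points are hypothetical, chosen so that the points lie exactly on a known line.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) from the sum formulas for the sample slope and intercept."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Sample slope: (ΣXY - ΣX·ΣY/n) / (ΣX² - (ΣX)²/n)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Sample Y-intercept: mean of Y minus slope times mean of X
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

# Hypothetical points that lie exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
b0, b1 = least_squares_fit(xs, ys)
print(b0, b1)
```

Because the hypothetical points fall on a straight line, the fit recovers an intercept of 1 and a slope of 2, and every residual is zero.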
Interpretation of Coefficients
1. Slope (b1)
   Estimated Y Changes by b1 for Each 1 Unit Increase in X
   If b1 = 2, then for Every 1 Unit Increase in Advertising (X), Sales (Y) Is Expected to Increase by 2.
2. Y-Intercept (b0)
   Average Value of Y When X = 0
   If b0 = 4, then Average Sales (Y) Is Expected to Be 4 When Advertising (X) Is 0

Parameter Estimation Example
You’re a marketing analyst for Hasbro Toys. You gather the following data:
   Ad $    Sales (Units)
   1       1
   2       1
   3       2
   4       2
   5       4
Model the relationship between sales & advertising.

Scatterplot: Sales vs. Advertising
[Figure: scatterplot of Sales (0 to 4) versus Advertising (0 to 5)]

Parameter Estimation Solution Table
   Xi     Yi     Xi^2    Yi^2    XiYi
   1      1      1       1       1
   2      1      4       1       2
   3      2      9       4       6
   4      2      16      4       8
   5      4      25      16      20
   15     10     55      26      37    (column totals)

Parameter Estimation Solution (first formula)
\[ b_1 = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} = \frac{37 - \frac{(15)(10)}{5}}{55 - \frac{(15)^2}{5}} = 0.7 \]
\[ b_0 = \bar{Y} - b_1 \bar{X} = 2 - (0.70)(3) = -0.1 \]
\[ \hat{y} = -0.1 + 0.7x \]

Parameter Estimation Solution (alternate formula)
\[ b_1 = r \, \frac{s_y}{s_x} = .903696 \cdot \frac{1.2247}{1.5811} = 0.7 \]
\[ b_0 = \bar{Y} - b_1 \bar{X} = 2 - 0.7(3) = -0.1 \]
\[ \hat{y} = -0.1 + 0.7x \]

Coefficient Interpretation Solution
1. Slope (b1)
   For every $1 increase in advertising (X), sales volume (Y) is expected to increase by .7 units (teddy bears).
2. Y-Intercept (b0)
   Average Value of Sales Volume (Y) Is -.10 Units When Advertising (X) Is 0
   Difficult to Explain to Marketing Manager
   Expect Some Sales Without Advertising

Using the TI-83
1. After the ordered pairs have been entered into two lists do the following:
   Hit “STAT” and scroll to “CALC”
   Select “2-Var Stats”
   Hit “2nd” and “1” (for L1)
   Hit “,” (located above 7)
   Hit “2nd” and “2” (for L2)
   Hit “ENTER”
2. All sums needed for slope and y-intercept formulas are given.

An Alternate Method Using the TI-83
1. After the ordered pairs have been entered into two lists, we can find the slope and y-intercept directly by doing the following:
   Hit “STAT” and scroll to “CALC”
   Select “LinReg(a+bx)” (8 in the CALC menu)
   Hit “2nd” and “1” (for L1)
   Hit “,” (located above 7)
   Hit “2nd” and “2” (for L2)
   Hit “ENTER”
2. The y-intercept is “a” and the slope is “b”.

Parameter Estimation Computer Output (optional)
Parameter Estimates
   Variable    DF    Parameter Estimate    Standard Error    T for H0: Param=0    Prob>|T|
   INTERCEP    1     -0.1000  (b0)         0.6350            -0.157               0.8849
   ADVERT      1      0.7000  (b1)         0.1914             3.656               0.0354

Parameter Estimation Thinking Challenge
You’re an economist for the county cooperative. You gather the following data:
   Fertilizer (lb.)    Yield (lb.)
   4                   3.0
   6                   5.5
   10                  6.5
   12                  9.0
Model the relationship between fertilizer & crop yield.

Scatterplot: Crop Yield vs. Fertilizer*
[Figure: scatterplot of Yield (lb.), 0 to 10, versus Fertilizer (lb.), 0 to 15]
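For readers without a TI-83, the sums that the 2-Var Stats steps report can be tabulated with a short script. The sketch below applies it to the fertilizer data just given; the function name `two_var_sums` and the dictionary keys are hypothetical, and it covers only the sums, not the other statistics the calculator displays.

```python
def two_var_sums(xs, ys):
    """Column totals of the computation table: n, ΣX, ΣY, ΣX², ΣY², ΣXY."""
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_x2": sum(x * x for x in xs),
        "sum_y2": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }

# Fertilizer (lb.) and yield (lb.) from the thinking challenge.
fertilizer = [4, 6, 10, 12]
crop_yield = [3.0, 5.5, 6.5, 9.0]

totals = two_var_sums(fertilizer, crop_yield)
for name, value in totals.items():
    print(name, value)
```

These totals are the inputs to the sample slope and Y-intercept formulas, so the remaining work is arithmetic.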
Parameter Estimation Solution Table*
   Xi     Yi      Xi^2    Yi^2      XiYi
   4      3.0     16      9.00      12
   6      5.5     36      30.25     33
   10     6.5     100     42.25     65
   12     9.0     144     81.00     108
   32     24.0    296     162.50    218    (column totals)

Parameter Estimation Solution* (first formula)
\[ b_1 = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} = \frac{218 - \frac{(32)(24)}{4}}{296 - \frac{(32)^2}{4}} = 0.65 \]
\[ b_0 = \bar{Y} - b_1 \bar{X} = 6 - (0.65)(8) = 0.8 \]
\[ \hat{y} = 0.8 + 0.65x \]

Parameter Estimation Solution* (second formula)
\[ b_1 = r \, \frac{s_y}{s_x} = .95578 \cdot \frac{2.4833}{3.6515} = 0.65 \]
\[ b_0 = \bar{Y} - b_1 \bar{X} = 6 - 0.65(8) = 0.8 \]
\[ \hat{y} = 0.8 + 0.65x \]

Coefficient Interpretation Solution*
1. Slope (b1)
   For every 1 lb. increase in fertilizer (X), crop yield (Y) is expected to increase by .65 lb.
2. Y-Intercept (b0)
   Average Crop Yield (Y) Is Expected to Be 0.8 lb. When No Fertilizer (X) Is Used

The Coefficient of Determination

Coefficient of Determination, R^2
1. Proportion of the variability in Y that is explained by X
   \( 0 \le R^2 \le 1 \)
\[ R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 - \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} \]

Coefficient of Determination Examples
[Figure: four scatterplots of Y versus X, labeled R^2 = 1, R^2 = 1, R^2 = .8, and R^2 = 0]

Coefficient of Determination Example
You’re a marketing analyst for Hasbro Toys. You find b0 = -0.1 & b1 = 0.7.
   Ad $    Sales (Units)
   1       1
   2       1
   3       2
   4       2
   5       4
Find and interpret R^2 (the coefficient of determination).
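As a separate illustration of the explained-over-total-variation definition, the sketch below applies it to the fertilizer fit \( \hat{y} = 0.8 + 0.65x \) obtained earlier in the chapter. The function name `r_squared` is hypothetical, and the slides themselves do not report an R^2 for the fertilizer data; the value here simply follows from the formula.

```python
def r_squared(xs, ys, b0, b1):
    """R² = (total variation - unexplained variation) / total variation."""
    y_bar = sum(ys) / len(ys)
    # Total variation: squared deviations of Y from its mean.
    total = sum((y - y_bar) ** 2 for y in ys)
    # Unexplained variation: squared residuals about the fitted line.
    unexplained = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return (total - unexplained) / total

fertilizer = [4, 6, 10, 12]
crop_yield = [3.0, 5.5, 6.5, 9.0]

r2 = r_squared(fertilizer, crop_yield, b0=0.8, b1=0.65)
print(round(r2, 4))
```

The result agrees, up to rounding, with squaring the correlation r = .95578 used in the second slope formula for the fertilizer data.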
Solution Table
   Xi     Yi     Xi^2    Yi^2    XiYi
   1      1      1       1       1
   2      1      4       1       2
   3      2      9       4       6
   4      2      16      4       8
   5      4      25      16      20
   15     10     55      26      37    (column totals)

Recall (Chapter 7) that:
\[ r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{(n - 1)\, s_x s_y} \]
\[ r = \frac{37 - \frac{(15)(10)}{5}}{(5 - 1)(1.581)(1.225)} = .904 \]

Solution and Interpretation
r = 0.904, so \( R^2 = r^2 = 0.817 \)
81.7% of the variability in sales (Y) of teddy bears is explained by advertising dollars (X).

Turn on Your Diagnostics! (TI-83)
1. We can find r (correlation coefficient) and R^2 (coefficient of determination) on our TI-83 directly, but first your diagnostics must be on.
   Hit “2nd” and then “0” (accessing the CATALOG)
   Scroll down until you see “DiagnosticOn” and hit “ENTER”
   Hit “ENTER” again.
   Your TI-83 screen should say “Done”

An Alternate Method Using the TI-83
1. After Diagnostics are turned on and after the ordered pairs have been entered into two lists, we can find r (correlation coefficient) and R^2 (coefficient of determination) directly by doing the following:
   Hit “STAT” and scroll to “CALC”
   Select “LinReg(a+bx)” (8 in the CALC menu)
   Hit “2nd” and “1” (for L1)
   Hit “,” (located above 7)
   Hit “2nd” and “2” (for L2)
   Hit “ENTER”
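The advertising results can also be cross-checked away from the calculator. The sketch below is a minimal Python version that uses the standard library's `statistics` module for the sample standard deviations; the variable names are assumptions, and it is a check on the slide arithmetic rather than a substitute for the TI-83 steps.

```python
import statistics

ad_dollars = [1, 2, 3, 4, 5]
sales_units = [1, 1, 2, 2, 4]
n = len(ad_dollars)

# Numerator shared by the slope and correlation formulas: ΣXY - ΣX·ΣY/n
sum_xy = sum(x * y for x, y in zip(ad_dollars, sales_units))
numerator = sum_xy - sum(ad_dollars) * sum(sales_units) / n

s_x = statistics.stdev(ad_dollars)    # sample standard deviation (n - 1 divisor)
s_y = statistics.stdev(sales_units)

r = numerator / ((n - 1) * s_x * s_y)  # correlation coefficient
b1 = r * s_y / s_x                     # alternate slope formula
b0 = statistics.mean(sales_units) - b1 * statistics.mean(ad_dollars)

print(round(r, 3), round(r ** 2, 3), round(b0, 2), round(b1, 2))
```

Rounded to three places, r and R^2 match the .904 and 0.817 on the slides, and the slope and intercept match 0.7 and -0.1.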

This note was uploaded on 12/07/2011 for the course STA 2023 taught by Professor Staff during the Fall '11 term at Santa Fe College.
