Lect22 - Prediction Intervals Changing Variation Outline...


ISOM 2500 Lect 22: PI, Changing Variation

Outline
• Prediction Intervals
• Changing Variation
  The Problem
  Fixing the Problem

The Fishing Case
In managing commercial fishing fleets, it is necessary to understand how the level of effort (number of boat-days) influences the size of the catch. What would you predict for the crab catch in a season with 7,500 boat-days of effort? How accurate would you claim your prediction to be?
• We’ll use regression analysis, with Y equal to the catch near Vancouver Island from 1980 to 2007, measured in thousands of pounds of Dungeness crabs, and X equal to the level of effort (total number of days by boats catching Dungeness crabs).
• Our task is to find the predicted value and the prediction interval.

Prediction Interval
• A prediction interval (PI) is an interval designed to hold a fraction (usually 95%) of the values of the response for a given value \(x_{new}\) of \(x\).
• A prediction interval differs from a confidence interval because it makes a statement about the location of a new observation rather than a parameter of a population.
  Note: the proportion of each colour you will get in a future packet of M&M's is random, so an interval for it is a PI. A parameter of a population is a fixed quantity, so an interval for it is a CI.
• The 95% prediction interval for \(y_{new}\) is
  \[ \hat{y}_{new} \pm 2\, se(\hat{y}_{new}) \]
  where \(\hat{y}_{new} = b_0 + b_1 x_{new}\) and
  \[ se(\hat{y}_{new}) = s_e \sqrt{1 + \frac{1}{n} + \frac{(x_{new} - \bar{x})^2}{(n-1)\, s_x^2}} \]
  Note: when n is small, the 1/n term is not negligible, so the full formula gives a more accurate (wider) interval than the approximation below.
• So long as we’re not extrapolating far from \(\bar{x}\) and have a moderately sized sample, \(se(\hat{y}_{new}) \approx s_e\), and hence a simple approximation for a 95% prediction interval is \(\hat{y}_{new} \pm 2 s_e\).
• Prediction intervals are reliable within the range of observed data.
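The prediction-interval formula can be sketched in a few lines of plain Python. This is only an illustration, not the software used in the course: the function name `prediction_interval` and the toy data are made up for the example, and the multiplier 2 is the lecture's rounding of the t-quantile.

```python
import math

def prediction_interval(x, y, x_new, multiplier=2.0):
    """Fit y = b0 + b1*x by least squares and return the approximate
    95% prediction interval for a new response at x_new."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x; this equals (n - 1) * s_x^2.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # s_e: standard deviation of the residuals (root mean square error).
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s_e = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    y_hat = b0 + b1 * x_new
    se_new = s_e * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    return {
        "y_hat": y_hat,
        "s_e": s_e,
        "se_new": se_new,
        "lower": y_hat - multiplier * se_new,
        "upper": y_hat + multiplier * se_new,
    }

# Toy data (hypothetical, not the crab data from the lecture).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
near = prediction_interval(x, y, x_new=3)    # at the mean of x
far = prediction_interval(x, y, x_new=10)    # extrapolating
print(near["lower"], near["upper"])
print(far["lower"], far["upper"])
```

The interval at `x_new=10` is wider than the one at the mean of x, because the term \((x_{new} - \bar{x})^2\) grows as the prediction moves away from the observed data.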
• Prediction intervals are also sensitive to the assumptions of constant variance and normality.

The Fishing Case, ctd
• Linear association is evident.
• Similar variance is confirmed.
• No clear dependence structure is seen.
• The nearly normal condition is satisfied.

Note: s_e = Root Mean Square Error.
• The t-statistic (and p-value) indicate that the slope is significantly different from zero (at the 5% significance level).
• There is a statistically significant linear association between days of effort and total catch. On average, each additional day of effort (per boat) increases the harvest by about 161 pounds.
• The predicted catch in a year with x = 7500 days of effort is ≈ 1173 thousand pounds.
• The 95% prediction interval is from 927 to 1420 thousand pounds.
• There is a 95% probability that the catch will be between 927 and 1420 thousand pounds.

How Do Home Prices Depend on the Size?
• How to decide whether a house is worth the asking price?
• The price depends on the size; take a look at the scatter plot of price against size.

Try the SRM:
• The estimated intercept (50.60, in thousands of dollars) can be interpreted as the fixed cost of a home.
• The 95% CI for the fixed cost is 50.6 ± 2 × 27.4, which after rounding is -$4,000 to $105,000.
• Is the fixed cost significantly different from zero (at the 5% significance level)? No, because zero is within the interval.
• The slope (0.1594) estimates the marginal cost of an additional square foot of space.
• The 95% CI for the marginal cost is 0.1594 ± 2 × 0.012076, or $135 to $184 per square foot.
• The 95% PI for a house with 1000 square feet is about $28,000 to $392,000.
• The 95% PI for a house with 3000 square feet is about $347,000 to $711,000. (Use the formula from the Prediction Interval slide.)
• Are the results reliable?
  Note: the CI for the fixed cost includes negative values and is very wide, and the PIs are too wide.

Checking the Residuals
• Similar variances? No. The residual plot is fan-shaped: the standard deviations of the residuals increase as home size increases.

Detecting Differences in Variation
• Side-by-side boxplots confirm that variances increase as home size increases.

Prediction Intervals...
• The 95% prediction intervals are too wide for small homes and too narrow for large homes.
  Note: to display them, click on the red down arrow next to Linear Fit and pull to Confid Shaded Indiv or Confid Curves Indiv.

Consequences of Different Variation
• Prediction intervals are too narrow or too wide.
• Confidence intervals for the slope and intercept are not reliable.
• Hypothesis tests regarding the slope and intercept are not reliable.

Fixing the Problem: Revise the Model
• If F represents the fixed cost and M the marginal cost, the equation of the SRM becomes
  \[ \text{Price} = F + M \times \text{SqFt} + \varepsilon \]
• Divide both sides of the equation by the number of square feet and simplify:
  \[ \frac{\text{Price}}{\text{SqFt}} = \frac{F + M \times \text{SqFt} + \varepsilon}{\text{SqFt}} = M + F \times \frac{1}{\text{SqFt}} + \tilde{\varepsilon}, \qquad \tilde{\varepsilon} = \frac{\varepsilon}{\text{SqFt}} \]
  If the standard deviation of \(\varepsilon\) grows roughly in proportion to SqFt, then \(\tilde{\varepsilon}\) has roughly constant variation.
1. The response variable becomes price per square foot and the predictor becomes the reciprocal of the number of square feet.
2. The marginal cost M is the intercept and the slope is F, the fixed cost.

Fitting the Revised Model
• The marginal cost is estimated to be $157 per square foot. This is the new intercept. The 95% CI for the marginal cost is about $137 to $179 per square foot.
• The fixed cost is estimated to be $53,886. This is the new slope. The 95% CI for the fixed cost is about $18,000 to $89,000.
• Is the fixed cost significantly different from zero (at the 5% significance level)?
• Prices for homes in this neighborhood run about $137 to $179 per square foot, on average; average fixed costs associated with the purchase are in the range $18,000 to $89,000, with 95% confidence.
• The 95% PI for the price per square foot of a house with 1000 square feet is about $133/SqFt to $290/SqFt.
• The 95% PI for the total price of a house with 1000 square feet is about $____ to $____.
• The 95% PI for the price per square foot of a house with 3000 square feet is about $97/SqFt to $255/SqFt.
• The 95% PI for the total price of a house with 3000 square feet is about $____ to $____.

Comparing Models with Different Responses
Even though the revised model has a smaller r²,
• It provides more reliable and narrower confidence intervals for fixed and marginal costs; and
• It provides more sensible prediction intervals.
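The total-price intervals left blank on the Fitting the Revised Model slide can be worked out by scaling: if the price per square foot of a home of a given size lies between two values, its total price lies between those values multiplied by the size. A minimal sketch, assuming that scaling and using the per-square-foot intervals quoted in the lecture (the function name is illustrative):

```python
def total_price_interval(per_sqft_low, per_sqft_high, sqft):
    """Scale an interval for price per square foot to an interval for total price."""
    return per_sqft_low * sqft, per_sqft_high * sqft

# 95% PIs for price per square foot quoted in the lecture (dollars per square foot).
per_sqft_intervals = {1000: (133, 290), 3000: (97, 255)}

# 95% PIs for total price from the original SRM, as quoted in the lecture (dollars).
srm_intervals = {1000: (28_000, 392_000), 3000: (347_000, 711_000)}

for sqft, (low, high) in per_sqft_intervals.items():
    total_low, total_high = total_price_interval(low, high, sqft)
    srm_low, srm_high = srm_intervals[sqft]
    print(f"{sqft} sq ft: revised ${total_low:,} to ${total_high:,} "
          f"(width ${total_high - total_low:,}); "
          f"SRM width ${srm_high - srm_low:,}")
```

Under this scaling the revised interval is narrower than the SRM interval for the 1000-square-foot home and wider for the 3000-square-foot home, which is consistent with the earlier remark that the SRM intervals are too wide for small homes and too narrow for large homes.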
Note: the two models have different responses (price versus price per square foot), so their r² values cannot be compared directly.
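To see why r² is tied to the choice of response, it can help to fit both models to the same data. The sketch below uses made-up numbers, not the lecture's home-price data: the values of `F`, `M`, the sizes, and the `shock` pattern are all hypothetical, and the errors are built to grow in proportion to size so that the data are fan-shaped.

```python
def fit_line(x, y):
    """Least-squares intercept, slope, and r^2 for y regressed on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r2 = sxy ** 2 / (sxx * syy)  # squared correlation between x and y
    return intercept, slope, r2

# Hypothetical fixed cost (dollars) and marginal cost (dollars per square foot).
F, M = 50_000.0, 150.0
sqft = [1000, 1200, 1500, 1800, 2000, 2400, 2700, 3000]
shock = [0.10, -0.08, 0.05, -0.12, 0.09, -0.06, 0.11, -0.07]  # made-up pattern

# Error proportional to size, so variation increases with home size.
price = [F + M * s + M * s * e for s, e in zip(sqft, shock)]

# Original SRM: price regressed on square feet.
f_hat_srm, m_hat_srm, r2_price = fit_line(sqft, price)

# Revised model: price per square foot regressed on 1 / square feet.
# Here the intercept estimates M and the slope estimates F.
per_sqft = [p / s for p, s in zip(price, sqft)]
recip = [1 / s for s in sqft]
m_hat_rev, f_hat_rev, r2_per_sqft = fit_line(recip, per_sqft)

print(f"SRM:     r^2 = {r2_price:.3f}")
print(f"Revised: r^2 = {r2_per_sqft:.3f}")
```

Each r² is a share of the variation in that model's own response. Total price varies mostly because homes differ in size, while price per square foot has much less size-driven variation left to explain, so the revised fit can have the smaller r² even when it describes the data better.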