ISOM 2500 Lect 22: PI, Changing Variation

Outline
• Prediction Intervals
• Changing Variation
  • The Problem
  • Fixing the Problem

The Fishing Case
In managing commercial fishing fleets, it is necessary to
understand how the level of effort (number of boat-days) influences
the size of the catch. What would you predict for the crab catch in a
season with 7,500 boat-days of effort? How accurate would you
claim your prediction to be?
Regression analysis
• We'll use regression with Y equal to the catch near Vancouver Island from 1980 to 2007, measured in thousands of pounds of Dungeness crabs, and X equal to the level of effort (total number of days by boats catching Dungeness crabs).
• Our task is to find the predicted value and the prediction interval.

Prediction Interval
• A prediction interval (PI) is an interval designed to hold a fraction (usually 95%) of the values of the response for a given value \(x_{new}\) of \(x\).
• A prediction interval differs from a confidence interval because it makes a statement about the location of a new observation rather than a parameter of a population.
  Note: a PI concerns a future value, e.g. the proportion of each colour you will get in a packet of M&M's bought in the future, so its target is random. A CI concerns a parameter of a population, which is a fixed quantity.
• The 95% prediction interval for \(y_{new}\) is
\[ \hat{y}_{new} \pm 2\, se(\hat{y}_{new}), \]
where \(\hat{y}_{new} = b_0 + b_1 x_{new}\) and
\[ se(\hat{y}_{new}) = s_e \sqrt{1 + \frac{1}{n} + \frac{(x_{new} - \bar{x})^2}{(n-1)\, s_x^2}}. \]
• So long as we're not extrapolating far from \(\bar{x}\) and have a moderately sized sample, \(se(\hat{y}_{new}) \approx s_e\); hence a simple approximation for a 95% prediction interval is
\[ \hat{y}_{new} \pm 2 s_e. \]
  Note: the larger n is, the more accurate this approximation becomes, because the \(1/n\) and \((x_{new} - \bar{x})^2 / ((n-1)\, s_x^2)\) terms shrink.
• Prediction intervals are reliable within the range of observed data. They are also sensitive to the assumptions of constant variance and normality.
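The prediction-interval formula can be sketched in a few lines of Python. This is a minimal illustration, not the course's software output: the function name `prediction_interval`, the multiplier of 2, and the small data set are all assumptions made for the example (the numbers are hypothetical, not the crab-catch data).

```python
import math
import statistics


def prediction_interval(x, y, x_new, multiplier=2.0):
    """Approximate 95% PI for a new response at x_new in a simple regression."""
    n = len(x)
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    s_x2 = statistics.variance(x)  # sample variance of x (divisor n - 1)

    # Least-squares slope b1 and intercept b0.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / ((n - 1) * s_x2)
    b0 = y_bar - b1 * x_bar

    # s_e: standard deviation of the residuals (divisor n - 2).
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s_e = math.sqrt(sum(r * r for r in residuals) / (n - 2))

    # se(y_new) = s_e * sqrt(1 + 1/n + (x_new - x_bar)^2 / ((n - 1) s_x^2))
    se_new = s_e * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / ((n - 1) * s_x2))
    y_hat = b0 + b1 * x_new
    return {
        "y_hat": y_hat,
        "s_e": s_e,
        "se_new": se_new,
        "lower": y_hat - multiplier * se_new,
        "upper": y_hat + multiplier * se_new,
    }


# Hypothetical data for illustration only (not the crab-catch data).
x = [2.0, 4.0, 5.0, 7.0, 8.0, 10.0, 11.0, 13.0]
y = [3.1, 4.8, 6.2, 7.9, 9.1, 11.2, 11.8, 14.1]

near = prediction_interval(x, y, x_new=7.5)   # x_new equals the mean of x
far = prediction_interval(x, y, x_new=30.0)   # far outside the observed range
print(near["se_new"] / near["s_e"])  # ratio close to 1 near the mean of x
print(far["se_new"] / far["s_e"])    # ratio noticeably larger when extrapolating
```

Comparing the two ratios shows why the shortcut of using 2 times the residual standard deviation is only reasonable near the centre of the data: the extrapolation term inflates the standard error as the new x moves away from the mean.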
• Linear association is evident.
• Similar variance confirmed.
• No clear dependent structure is seen.
• Nearly normal condition is satisfied.
Note: \(s_e\) is reported as Root Mean Square Error in the regression output.
• The t-statistic (and p-value) indicate that the slope is significantly different from zero (at the 5% significance level).
• There is a statistically significant linear association between days of effort and total catch. On average, each additional day of effort (per boat) increases the harvest by about 161 pounds.
• The predicted catch in a year with x = 7500 days of effort is ≈ 1173 thousand pounds.
• The 95% prediction interval is from 927 to 1420 thousand pounds.
• There is a 95% probability that the catch will be between 927 and 1420 thousand pounds.

How do the Home Prices Depend on the Size?
• How to decide whether a house is worth the asking price?
• The price depends on the size; take a look at the scatter plot of price against size.

Try the SRM:
• The estimated intercept (50.60) can be interpreted as the fixed cost of a home.
• The 95% CI for the fixed cost (after rounding) is −$4,000 to $105,000, from 50.6 ± 2 × 27.4 (in thousands of dollars).
• Is the fixed cost significantly different from zero (at the 5% significance level)? No, because zero is within the interval.
• The slope (0.1594) estimates the marginal cost of an additional square foot of space.
• The 95% CI for the marginal cost is $135 to $184 per square foot, from 0.1594 ± 2 × 0.012076.
• The 95% PI for a house with 1000 square feet is about $28,000 to $392,000;
• The 95% PI for a house with 3000 square feet is about $347,000 to $711,000. (Use the formula on p. 3.)
Notes: the CI for the fixed cost includes negative values and is too large to be useful; the PI is too wide.
• Are the results reliable?
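The interval arithmetic for the SRM can be checked with a short sketch. The estimates and standard errors are the ones quoted in the text (prices in thousands of dollars); the helper names `approx_ci` and `predict` are hypothetical, and the residual standard deviation is not stated in the text, so the value below is only inferred from the quoted interval widths.

```python
# Estimates quoted in the text; prices are in thousands of dollars.
b0, se_b0 = 50.60, 27.4        # intercept (fixed cost) and its standard error
b1, se_b1 = 0.1594, 0.012076   # slope (marginal cost per sq ft) and its standard error


def approx_ci(estimate, std_err, multiplier=2.0):
    """Approximate 95% CI: estimate plus or minus 2 standard errors."""
    return estimate - multiplier * std_err, estimate + multiplier * std_err


def predict(sqft):
    """Predicted price (thousands of dollars) from the fitted line."""
    return b0 + b1 * sqft


ci_fixed = approx_ci(b0, se_b0)      # lower endpoint is negative
ci_marginal = approx_ci(b1, se_b1)   # thousands of dollars per square foot

pred_1000 = predict(1000)
pred_3000 = predict(3000)

# Both quoted PIs have the same half-width, as y_hat +/- 2 * s_e implies.
half_width_1000 = (392 - 28) / 2
half_width_3000 = (711 - 347) / 2
implied_s_e = half_width_1000 / 2    # inferred, not stated in the text

print(ci_fixed, ci_marginal)
print(pred_1000, pred_3000, implied_s_e)
```

The equal half-widths are the point to notice: the SRM attaches the same uncertainty to a 1000-square-foot home as to a 3000-square-foot home, which is the assumption the residual checks go on to question.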
Checking the Residuals
Similar variances? Fan-shaped; the standard deviations of the residuals increase as home size increases.

Detecting Differences in Variation
Side-by-side boxplots confirm that variances increase as home size increases.

Prediction Intervals...
• The 95% prediction intervals are too wide for small homes and too narrow for large homes.¹
¹ Click on the red down arrow next to Linear Fit and pull to Confid Shaded Indiv or Confid Curves Indiv.
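A numeric stand-in for the side-by-side boxplots is to fit the line, split the residuals by home size, and compare their standard deviations. The sketch below uses simulated data, not the housing data from the lecture; the fixed cost, marginal cost, noise level, and sample size are all hypothetical values chosen so that the noise grows with size.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simulated (not real) homes: price in thousands of dollars,
# with noise whose standard deviation is proportional to size.
F_TRUE, M_TRUE = 50.0, 0.16  # hypothetical fixed and marginal costs
sqft = [rng.uniform(500, 3500) for _ in range(200)]
price = [F_TRUE + M_TRUE * s + rng.gauss(0, 0.03 * s) for s in sqft]


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


b0, b1 = fit_line(sqft, price)
residuals = [p - (b0 + b1 * s) for s, p in zip(sqft, price)]

# Split at the median size and compare the spread of residuals in each half.
cut = statistics.median(sqft)
small = [r for s, r in zip(sqft, residuals) if s <= cut]
large = [r for s, r in zip(sqft, residuals) if s > cut]
sd_small = statistics.stdev(small)
sd_large = statistics.stdev(large)

print(f"SD of residuals, smaller homes: {sd_small:.1f}")
print(f"SD of residuals, larger homes:  {sd_large:.1f}")
```

When the two standard deviations differ substantially, a single residual standard deviation cannot describe both groups, so an interval built from it is too wide for one group and too narrow for the other.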
Consequences of Different Variation:
• Prediction intervals are too narrow or too wide.
• Confidence intervals for the slope and intercept are not reliable.
• Hypothesis tests regarding the slope and intercept are not reliable.

Fixing the Problem: Revise the Model
• If F represents fixed cost and M marginal costs, the equation of the SRM becomes
\[ \text{Price} = F + M \times \text{SqFt} + \varepsilon \]
• Divide both sides of the equation by the number of square feet and simplify:
\[ \frac{\text{Price}}{\text{SqFt}} = \frac{F + M \times \text{SqFt} + \varepsilon}{\text{SqFt}} = M + F \times \frac{1}{\text{SqFt}} + \varepsilon', \qquad \varepsilon' = \frac{\varepsilon}{\text{SqFt}} \]
1. The response variable becomes price per square foot and the predictor becomes the reciprocal of the number of square feet.
2. The marginal cost M is the intercept and the slope is F, the fixed cost.
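The swap of roles between intercept and slope can be verified numerically. The sketch below is an illustration under stated assumptions: the values of F and M and the list of sizes are hypothetical, and the prices are generated without noise so that the algebra is visible exactly.

```python
import statistics


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Hypothetical values: fixed cost F and marginal cost M, in thousands of dollars.
F, M = 50.0, 0.16
sqft = [800, 1200, 1500, 2000, 2600, 3200]
price = [F + M * s for s in sqft]  # noise-free, so the fit is exact

# Revised model: response is price per square foot, predictor is 1 / SqFt.
y = [p / s for p, s in zip(price, sqft)]
x = [1 / s for s in sqft]

intercept, slope = fit_line(x, y)
print(intercept)  # estimates the marginal cost M
print(slope)      # estimates the fixed cost F
```

With real data the error term is also divided by the size, which is what makes the variation in the revised model more even when the original errors grow with home size.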
Fitting the Revised Model
• The marginal cost is estimated to be $157 per square foot.
• This is the new intercept. The 95% CI for the marginal cost is about $137 to $179 per square foot.
• The fixed cost is estimated to be $53,886.
• This is the new slope. The 95% CI for the fixed cost is about $18,000 to $89,000.
• Is the fixed cost significantly different from zero (at the 5% significance level)?
• Prices for homes in this neighborhood run about $137 to $179 per square foot, on average; average fixed costs associated with the purchase are in the range $18,000 to $89,000, with 95% confidence.
• The 95% PI for the price per square foot of a house with 1000 square feet is about $133/SqFt to $290/SqFt;
• The 95% PI for the total price of a house with 1000 square feet is about $____ to $____;
• The 95% PI for the price per square foot of a house with 3000 square feet is about $97/SqFt to $255/SqFt;
• The 95% PI for the total price of a house with 3000 square feet is about $____ to $____.
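One way to work out the total-price intervals, assuming the size of the house is treated as known, is to multiply both endpoints of the per-square-foot interval by the size. The sketch below applies that to the per-square-foot intervals quoted in the list; the function name `total_price_interval` is hypothetical.

```python
def total_price_interval(sqft, low_per_sqft, high_per_sqft):
    """Scale a price-per-square-foot interval (dollars) to a total-price interval."""
    return sqft * low_per_sqft, sqft * high_per_sqft


# Per-square-foot prediction intervals quoted in the text (dollars per sq ft).
pi_1000 = total_price_interval(1000, 133, 290)
pi_3000 = total_price_interval(3000, 97, 255)

width_1000 = pi_1000[1] - pi_1000[0]
width_3000 = pi_3000[1] - pi_3000[0]

print(pi_1000, width_1000)
print(pi_3000, width_3000)
```

Under this conversion the interval for the larger home is wider than the one for the smaller home, in contrast to the SRM intervals, which had the same width at both sizes.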
Comparing Models with Different Responses
Even though the revised model has a smaller \(r^2\),
• It provides more reliable and narrower confidence intervals for fixed and marginal costs; and
• It provides more sensible prediction intervals.
Note: the responses are different, therefore you can't compare the \(r^2\) values.
Spring '11