To find the prediction interval you need to calculate the standard error for predicting an individual response, $SE_{\hat{y}}$:

n = sample size = 20
s = standard error of the estimate = 412.75928395...
$\bar{x}$ = mean of explanatory variable = 2,286
SSX = sum of squares of explanatory variable = $\sum (x_i - \bar{x})^2 = (n-1)s_x^2$ = 14,145,680
x = point of interest = 2,500
$SE_{\hat{y}}$ = standard error for predicting an individual response

$$SE_{\hat{y}} = s \times \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SSX}} = 412.75928395... \times \sqrt{1 + \frac{1}{20} + \frac{(2{,}500 - 2{,}286)^2}{14{,}145{,}680}} = 423.60395191...$$

Therefore, the prediction interval is given by:

t = (α = 0.05) critical value in the t distribution with 18 degrees of freedom = 2.101

$$\hat{y} \pm t \times SE_{\hat{y}} = 1{,}975.49261683... \pm 2.101 \times 423.60395191...$$

$$1{,}085.50071386... \le y \le 2{,}865.48451979...$$

$$1{,}086 \le y \le 2{,}865 \quad \text{(rounded as last step)}$$

Feedback [1 out of 2]
You are partly correct.
- SSX: this option should have been selected.
- s: you are correct.
- $\bar{x}$: this option should have been selected.
- $\hat{y}_i$: this option should have been selected.
- $x_i$: this option should have been selected.
- the sample size: you are correct.
- the level of significance used: you are correct.
- SST: this is not correct.

Discussion

Suppose the confidence interval for the mean value of y is to be constructed at the value $x = x_i$, and the size of the sample drawn is n. The 100 × (1 − α)% confidence interval for E(y) is:

$$\hat{y}_i \pm t_{\alpha/2}\, s \sqrt{h_i}$$

where:
$\hat{y}_i$ = the predicted value for y when $x = x_i$
$t_{\alpha/2}$ = the α/2 critical value of the t distribution with n − 2 degrees of freedom
s = the standard error of the estimate
$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SSX}$

So there are several factors involved in constructing the confidence interval.

$x_i$, $\bar{x}$ and SSX
One of the factors involved is how far away the level of x is from the sample mean value for x. The further away the level of x is, the wider the confidence interval. There is a specific measure of how 'far away' the level of x is from $\bar{x}$. It is the squared difference between $x_i$ and $\bar{x}$ as a fraction of SSX (the total sum of squared differences between all data points and the mean). This is why the mean of all x values and SSX are important, while the mean of all y values and SST are not.

The standard error of the estimate, s
The confidence interval will be wider if there is a lot of variation in the sample between the values for y and the values predicted for y by the prediction line. This is because such variation would suggest that the regression is not very accurate, and the interval needs to cover more values in order to (with confidence) cover the expected value of y. The measure of variation of values for y about their predicted values is the standard error of the estimate, s.

The predicted value, $\hat{y}_i$
At a given level of x, say $x_i$, y is a random variable with expected value $\beta_0 + \beta_1 x_i$. Here $\beta_0$ and $\beta_1$ are parameters estimated by the statistics $b_0$ and $b_1$. You use the prediction line $\hat{y}_i = b_0 + b_1 x_i$ to predict the value for y, and so $\hat{y}_i$ is an estimator for the expected value of y at $x = x_i$. As such, $\hat{y}_i$ is always at the center of the confidence interval for E(y).

Sample size and level of significance
As usual, the sample size (n) and level of significance (α) will affect the confidence interval construction. And as usual, an increase in sample size will decrease the width of the confidence interval, and an increase in the level of significance (and decrease in level of confidence) will also decrease the width of the confidence interval. In particular, as n increases, 1/n decreases, thus reducing the width of the interval. As α increases, the critical value $t_{\alpha/2}$ decreases, thus also reducing the width of the interval.

3 of 3, ID: MST.SLR.AV.03.0010
A regression model has been developed to analyze the relationship between a dependent variable y and an independent variable x. A prediction interval is to be constructed for an individual value of y for a given value of x. Select whether each change would increase, decrease, or not affect the width of the prediction interval.

Increase / Decrease / Not affect
a) Decreasing the level of confidence
b) Using a larger sample size
c) Choosing a value of x further away from $\bar{x}$
d) Having less variation in the values of the dependent variable about the prediction line

Feedback [3 out of 4]
a) You are correct.
b) You are correct.
c) You are correct.
d) This is not correct.
Having less variation in the values of the dependent variable about the prediction line would decrease the width of the prediction interval.

Discussion

The effects these changes would have on the prediction interval for an individual value of y at a given level of x can be investigated in two ways. One way is to look at the formula for the prediction interval. By observing how factors occur in the formula, you can determine how changes in those factors will affect the interval. Alternatively, you can argue 'why' the changes should affect the interval in the way that they do.

Suppose the prediction interval is to be constructed at the value $x = x_i$, and the size of the sample drawn is n. The 100 × (1 − α)% prediction interval for y is:

$$\hat{y}_i \pm t_{\alpha/2}\, s \sqrt{1 + h_i}$$

where:
$\hat{y}_i$ = the predicted value for y when $x = x_i$
$t_{\alpha/2}$ = the α/2 critical value of the t distribution with n − 2 degrees of freedom
s = the standard error of the estimate
$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SSX}$

So the width of the interval is determined by several factors.

Confidence
By decreasing the level of confidence (100 × (1 − α)%), you are increasing the level of significance, α. The critical value in any t distribution is decreased if the level of significance is increased. Therefore decreasing the level of confidence will decrease the width of the prediction interval.

This can also be explained by the fact that any prediction interval should decrease in width if you are going to decrease the level of confidence you want in that interval. In other words, if you can handle being less sure that your interval does indeed contain the value of y, you can have a shorter interval.

Sample size
The larger your sample size (n) is, the smaller 1/n is. The $h_i$ term in the calculation of the prediction interval sits under a square root and involves 1/n, and so decreasing 1/n will decrease $h_i$. Therefore increasing the sample size will decrease the width of the prediction interval.

This also makes sense if you consider the fact that having a larger sample size means that you have more information about the population being considered (the dependent variable y). It is a general fact about prediction intervals that larger sample sizes will typically shorten the width of the interval.

Changing the value of the independent variable
The above formula for the prediction interval for y is defined for the fixed value $x_i$ of the independent variable, x. The value that is assumed for the independent variable will definitely change the position of the interval. In particular, the center of the interval, $\hat{y}_i$, is determined by $x_i$. But changing $x_i$ will also affect the width of the interval. In particular, the difference between $x_i$ and the mean value of x in the sample, $\bar{x}$, will be a factor in the width of the interval.

The term $(x_i - \bar{x})^2$ appears in the numerator of $h_i$ in the formula for the width of the interval. Therefore having a value $x_i$ further away from $\bar{x}$ will increase the width of
the prediction interval.

The qualitative reasoning behind why values closer to the mean of x will give more accurate estimations of the mean of y is not as simple as for the other factors. The reason has to do with the fact that prediction lines are straight lines. Suppose for the moment that you actually have two prediction lines coming from two different samples that have the same sample mean $\bar{x}$ for the independent variable and the same sample mean $\bar{y}$ for the dependent variable. It is a property of the regression coefficients that both prediction lines will pass through the point $(\bar{x}, \bar{y})$. And since they are both straight lines, as they move away from this point they will move away from each other more and more. So, for a value of x that is far away from $\bar{x}$, the two prediction lines will have very different values for y. In other words, there is more variation in the predicted values of y for values of x that are far from the mean.

Variation in the dependent variable
The standard error of the estimate, s, is a measure of the difference between the values of the dependent variable that occur in the sample ($y_i$) and the values predicted by the prediction line ($\hat{y}_i$). If this variation increases, then s increases. Therefore a decrease in the variation of values of y about the values predicted will decrease the width of the prediction interval.

The reason for the decrease in the width of the prediction interval can be explained by the following fact. If the values of the dependent variable do not vary much from the values predicted for it, your regression model is an accurate model in the sense that your ability to predict things about the dependent variable (based on the independent variable) is large. So in trying to predict a value for y at some level of x, your prediction will likely be more accurate. That is, the prediction interval is narrower.

1 of 3, ID: MST.SLR.AV.05.0010
Filby develops a regression model to analyze the relationship between the amount of time he studies for an exam (x) and the mark he gets in an exam (y). He would like to construct a confidence interval for the average mark he would get if he studied for 8 hours. He would also like to construct a prediction interval for the mark he would get if he studied for 8 hours. The levels of confidence for these intervals are the same, and the same sample is used for both intervals. Select the correct statement regarding these two intervals:
- The two intervals have the same width
- The confidence interval for the average mark will be wider than the prediction interval for the mark
- The prediction interval for the mark will be wider than the confidence interval for the average mark
- There is not enough information to tell which interval will be wider

Feedback [0 out of 1]
This is not correct.
The correct statement regarding these two intervals is: The prediction interval for the mark will be wider than the confidence interval for the average mark.

Discussion

Estimating means and predicting values
If you develop a prediction line for a relationship:

$$\hat{y} = b_0 + b_1 x$$

then you can use this line to predict values of the dependent variable, y. You can also use this line to calculate the expected value of y. This distinction can be subtle.

For a fixed value of x (say $x_i$), y is a random variable. This should not be too surprising, since a fundamental rule in statistical regression is that your predictor variable (x) will not give you an exact value for your dependent variable. Now, given that y is a random variable for $x = x_i$, you can calculate its expected value. This is your calculated mean value for y at that level of x. But since the prediction line involves the statistics $b_0$ and $b_1$, this is only a point estimate.

The fact that y is a random variable for $x = x_i$ also means that you can predict a value that y will take. This is not the same as trying to estimate the mean; this is trying to actually predict the value that the random variable will assume. Having said that, it should not be too surprising that the point estimate for the expected value and the prediction for an individual value are the same: $b_0 + b_1 x_i$. The difference between the estimate of the mean and the prediction of an individual value becomes more clear when you construct intervals around this value.

Intuitively, the 'prediction' interval for an individual value of y (at a fixed value of x) will be rather wide. This is because you are saying: 'I have a random variable and I want an interval in which I am fairly confident the variable will take a value.' On the other hand, the confidence interval for the mean should be narrower, since you are only trying to estimate the mean of all values that y might assume (at that level of x).

This is indeed the case: the prediction interval for an individual value will always be wider than the confidence interval for an average value. Mathematically this can be seen in the fact that, with a level of significance of α and a sample of size n, the widths of the prediction interval for y and the confidence interval for E(y) at $x = x_i$ are:

Width of prediction interval for y: $2 \times t_{\alpha/2}\, s \sqrt{1 + h_i}$
Width of confidence interval for E(y): $2 \times t_{\alpha/2}\, s \sqrt{h_i}$

where:
$t_{\alpha/2}$ is the α/2 critical value of the t distribution with n − 2 degrees of freedom
s is the standard error of the estimate
$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SSX}$

1 of 3, ID: MST.SLR.AV.05.0020
Both the confidence interval and the prediction interval are vital in analyzing simple regression, and they have their own individual purpose.
a) The confidence interval is always ____ the prediction interval at a given level of significance (other than 0 and 1).
b) The point estimate of the mean value is always ____ the point estimate of the predicted individual value.

Feedback [1 out of 2]
a) You are correct.
b) This is not correct. The point estimate of the mean value is always equal to the point estimate of the predicted individual value.

Calculation
a) The prediction interval looks at a prediction of an individual value rather than the mean of the variable. It is therefore wider than a confidence interval because the mean of a group of values is usually less extreme than the values themselves. An individual prediction can take on a much more extreme value than a mean of these possible values. Therefore, the confidence interval, which is concerned with the mean, is always narrower than the prediction interval at a given level of significance (other than 0 and 1).

Note that in the case of a level of significance of 0 (corresponding to 100% confidence) both intervals are the entire real line from negative infinity to positive infinity, and in the case of a level of significance of 1 (corresponding to 0% confidence) both intervals are simply the point estimate.

b) One of the interesting things about the simple linear regression model is that the point estimate of the mean of the response at a given $x_i$ is always equal to the point estimate of an individual value. The reason why this is so is because it conforms with the intrinsic workings of the simple linear regression model.

Under the simple linear regression model, the value of the response (y) is equal to a constant value ($\beta_0$) and a value induced by its relationship to the independent (or explanatory) variable ($\beta_1 x$) plus an error term ($\varepsilon$):

$$y = \beta_0 + \beta_1 x + \varepsilon$$

By taking the expected value of both sides (for a particular value of x), and using the fact that the error term has expected value zero, one obtains the expression of the mean:

$$E[y \mid x = x_i] = E[\beta_0 + \beta_1 x_i + \varepsilon] = E[\beta_0] + E[\beta_1 x_i] + E[\varepsilon] = \beta_0 + \beta_1 x_i$$

The point estimates of $\beta_0$ and $\beta_1$ are $b_0$ and $b_1$ respectively. Therefore the point estimate of the mean is equal to the point estimate of $\beta_0 + \beta_1 x_i$, which is equal to $b_0 + b_1 x_i$. However, this expression is the equation of the regression line which gives the point estimate of an individual predicted value. Hence the
point estimate of the mean value is always equal to the point estimate of the predicted individual value.

3 of 3, ID: MST.SLR.AV.04.0019
You have calculated the lower bound of a 99% prediction interval at x = 5 to be 6.91 and the corresponding upper bound to be 18.29, from simple linear regression analysis. Select all the following conclusions that may be drawn from this interval:
- One can be at least 99% confident that the prediction interval includes the mean of the response variable at x = 5 ($\mu_{y|x=5}$)
- One can be at most 99% confident that the prediction interval includes the mean of the response variable at x = 5 ($\mu_{y|x=5}$)
- One can be 99% confident that the prediction interval includes the mean of the response variable at x = 5 ($\mu_{y|x=5}$)
- One can be 99% confident that the prediction interval includes the individual value of the response variable given x = 5 ($y_{x=5}$)

Feedback [1 out of 2]
You are partly correct.
- One can be at least 99% confident that the prediction interval includes the mean of the response variable at x = 5 ($\mu_{y|x=5}$): this option should have been selected.
- One can be 99% confident that the prediction interval includes the individual value of the response variable given x = 5 ($y_{x=5}$): you are correct.
- One can be at most 99% confident that the prediction interval includes the mean of the response variable at x = 5 ($\mu_{y|x=5}$): this is not correct.

Discussion

The prediction interval is given by the following formula:

$$\hat{y}_i \pm t_{\alpha/2}\, s \sqrt{1 + h_i}$$

where:

$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SSX}$$

In comparison, the confidence interval is given by the following formula:

$$\hat{y}_i \pm t_{\alpha/2}\, s \sqrt{h_i}$$

A prediction interval is used to look at the possibilities that a predicted individual value can take. A 99% prediction interval between 6.91 and 18.29 states that one can be 99% confident that this interval includes the individual value. If the interval was a 99% confidence interval instead, it would state that one can be 99% confident that this interval includes the mean of the response variable at x = 5.

Furthermore, the prediction interval is always wider than the confidence interval on the same variable, such that its lower bound is lower and its upper bound is higher. A result of this is that with a c% prediction interval, one would be at least c% confident that it will also contain the mean of the response variable. This is because a c% prediction interval is always wider than a c% confidence interval, with the exception of a 100% (where the ranges are from negative to positive infinity) and 0% (no range) interval.

What this means is that for a 99% prediction interval, one would be at least 99% confident that the interval would also include the mean, since it is wider than a 99% confidence interval.

2 of 3, ID: MST.SLR.AV.04.0010
You have calculated the lower bound of a 95% prediction interval at x = 2 to be 1.45 and the corresponding upper bound to be 11.71, from simple linear regression analysis. Select all the following conclusions that may be drawn from this interval:
- One can be at least 95% confident that the prediction interval includes the mean of the response variable at x = 2 ($\mu_{y|x=2}$)
- One can be at most 95% confident that the prediction interval includes the mean of the response va...
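
As a numerical check on the worked prediction-interval calculation at the start of these notes, here is a minimal Python sketch. It is not part of the original material. It plugs the stated values (n = 20, s ≈ 412.759, mean of x = 2,286, SSX = 14,145,680, x = 2,500, predicted value ≈ 1,975.49, t = 2.101) into the two interval formulas used in the discussions. The function names `leverage_term` and `interval` are illustrative assumptions, and s and the predicted value are truncated to the digits printed in the notes.

```python
import math


def leverage_term(n, x0, x_bar, ssx):
    """Return h = 1/n + (x0 - x_bar)^2 / SSX for the point of interest x0."""
    return 1.0 / n + (x0 - x_bar) ** 2 / ssx


def interval(y_hat, t_crit, s, h, individual):
    """Return (lower, upper) bounds centered on y_hat.

    individual=True  -> prediction interval, half-width t * s * sqrt(1 + h)
    individual=False -> confidence interval for the mean, half-width t * s * sqrt(h)
    """
    spread = 1.0 + h if individual else h
    half_width = t_crit * s * math.sqrt(spread)
    return y_hat - half_width, y_hat + half_width


# Values taken from the worked example (s and y_hat truncated as printed there).
n = 20
s = 412.75928395
x_bar = 2286
ssx = 14_145_680
x0 = 2500
y_hat = 1975.49261683
t_crit = 2.101  # tabulated critical value quoted in the notes (alpha = 0.05, 18 df)

h = leverage_term(n, x0, x_bar, ssx)
pi_low, pi_high = interval(y_hat, t_crit, s, h, individual=True)
ci_low, ci_high = interval(y_hat, t_crit, s, h, individual=False)

print(f"prediction interval: {pi_low:.2f} to {pi_high:.2f}")
print(f"confidence interval: {ci_low:.2f} to {ci_high:.2f}")
print("prediction interval is wider:", (pi_high - pi_low) > (ci_high - ci_low))
```

With these inputs the prediction interval rounds to 1,086 and 2,865, in line with the worked example. The confidence interval at the same x is narrower because its half-width omits the 1 under the square root.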


- Fall '13
- ChristaLSorola