CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
2003 Final Examination, 9:00 - 11:30, Thursday, December 18, 2003 (revised)
Exam is open notes and open book.
It lasts 150 minutes and there are 150 points.
PLEASE SHOW WORK!

1. (12 pts - 2 pts each.) Short and true-or-false answers
a. The Sample Space is a collection of all possible outcomes of an experiment. (T/F)
b. The great power of the idea of an estimator is that we can determine which of several estimation procedures will always give the better answer. (T/F)
c. The Central Limit Theorem assures us that the results of experiments will always be normally distributed. (T/F)
d. The sample variance is an unbiased estimator of the population variance. (T/F)
e. The sample average of n observations has a standard deviation of ? (Ans in exam bk.)
f. For normal data, the standard deviation of the sample variance S² is ? (Ans in exam bk.)

2. (6 pts) What fundamental problem does the collection of knowledge and mathematical relationships called “probability theory” address? Why is the concept of a random variable so important?

3. (7 pts) A hydrologist is modeling the largest 6-hour rainfall depth observed in each year of record. Assuming that observed annual maxima have a Gumbel distribution with mean 2.3 inches and standard deviation 0.8 inches, what rainfall depth has only a 0.005 probability (0.5%) of being exceeded in any year? WHY is a Gumbel distribution a reasonable model for such phenomena?

4. (15 pts.) A BEE student needs to create a stock solution of a reagent that should have a concentration of 2 mg/l. The previous night they were up to 5 am doing CEE 304 homework. As a result, when they measure the reagent, the mean weight of the reagent they obtain is 4.00 milligrams with a standard deviation of 0.12 milligrams. They add the reagent to a measured volume of water, which has a MEDIAN of 2.00 liters and a Coefficient of Variation of 5%. Using lognormal models, what are the mean and standard deviation of the concentration of the reagent in the solution?

5. (14 pts) Computer security is a tremendous problem. A civil engineering consulting firm has installed new computer security software to protect their files and web site. They want their clients to be able to access files describing the status of ongoing projects and proposed designs, but the information should not be made public or damaged. The new software detects on average 12 serious and independent attempts to penetrate the system every hour.
a) Why might a Poisson process be a reasonable model of such attacks? Why might such a model not be appropriate?
Assuming a Poisson process:
b) What is the probability that there are exactly two attacks in a specific 10-minute period?
c) What are the mean and variance of the number of attacks in a week (168 hours)?
d) What are the mean and variance of the time until the system experiences 1,000 attacks?

6. (20 pts) A common and natural problem addressed by statistical tests is whether two samples
show that the respective populations are different. Over the years Prof. Stedinger has come to believe that women are generally more
conscientious at doing homework than men. For an interesting example he grabbed some average
homework grades for assignments 1-11 in CEE 304 this year, excluded graduate students, and
sorted the scores by sex. The real numbers are:

[Data table: homework scores by sex, with rows for average, st. dev, Count, and Sum; values not legible]

(a) What are the appropriate null and alternative hypotheses for a good nonparametric test?
(b) What is the rejection region for a nonparametric test with a type I error of 5%?
(c) What is the p-value for this data set? What do you conclude?
(d) What value of t do you obtain for a two-sample t test? What are the effective degrees of freedom (approximately)? What is the p-value?
(e) Would you recommend a nonparametric test or a t test with this data set? WHY?

7. (25 pts) Professor Watchum is concerned with the ability of his students to use a new inexpensive
laboratory instrument. If they are not careful, they are likely to get answers that are too large. To
test his concern he observes results from different student teams using the instrument for several
experiments. Below are the values they should have recorded (measured independently by Prof.
Watchum with an expensive and very precise instrument), and the values recorded by the students.

[Data table: True Value and student-recorded value for experiments 1-10, with Average and St. Deviation rows; values not legible]
Last two columns provide the ranks of the absolute value of differences.

(a) Specify appropriate null and alternative hypotheses for the tests in parts (b)(c)(d) below.
(b) What is the rejection region for a sign test with a type I error of about 5%?
What p-value do you obtain with this data?
If the probability of a positive sign is really 0.8 (the model tends to overpredict), what would
the type II error be for your α = 5% test? [For n = 10, Devore has binomial tables.]
(c) What is the rejection region with α = 5% for the appropriate powerful nonparametric test?
What p-value do you obtain with this data?
(d) What is the rejection region for a t test with α = 5% for this data?
What p-value do you obtain with this data set?
(e) With normal data, how large a sample is needed to ensure the type II error is < 2%
with the t test when σ_Diff is 100 and the mean difference is actually μ_Diff = 100?
(f) If the data is normal, for n = 10 how big a mean difference is needed to ensure β ≈ 0
with a t test were the standard deviation of the differences really equal to 100?
(g) For this particular data set, which test (sign, Wilcoxon, or Student t) would you
recommend and WHY? (Type II engineering!)

8. (5 pts) Assuming the data in #7 are normal, use a Student t distribution to construct a 95%
confidence interval for the mean difference between the measured and true values.

9. (8 pts) One of the students in Prof. Watchum’s class asked how Watchum knew the
differences were normal. (a) Construct a probability plot to allow him to examine the
hypothesis of normality (probability paper attached). (b) The correlation between the sorted
observations and the n-scores is 0.8656; what should Prof. Watchum conclude?

10. (38 pts) Winter is here. A hydrographer is concerned with the variation of snowpack in
the mountains with altitude and exposure to prevailing winds. Just considering altitude for
slopes facing the source of moisture, she obtains the following data on the snowpack depth (D) in millimeters on the first of May, and altitude (H) in meters:

Altitude, H_i: 2440 3720 3420 3180 4150 1720 meters
Depth, D_i: 1395 1304 872 1041 1556 622 millimeters
n = 27, D̄ = 1164 mm, s_D = 368
H̄ = 2761 m, s_H = 727.8, Σ(H_i - H̄)(D_i - D̄) = 2,951,520

a) What are the least-squares estimates of the coefficients of the linear model
D = α + β H + ε
wherein the term ε describes errors about the model prediction?
b) Compute an unbiased estimate of the variance of ε.
c) What fraction of the observed variability in snowpack depth is explained by altitude?
d) What is the value of adjusted R², and why does its value differ from the number you reported in part (c) above?
e) What is the standard deviation of b, the estimator of β?
f) A concern is whether there is any relationship between altitude and snowpack depth. What is the rejection region for a 1% test of whether or not β is zero? WHY have you selected either a one-sided or two-sided test? What p-value do you obtain with this data set?
g) What is a 95% prediction interval for the snowpack depth that could be measured at a particular site with an altitude of 3,000 meters?

This has been a fun class this year. Thank you for joining me.
And remember, while everyone needs to be like Rob and Art sometimes,
the Michelles are the leaders with vision who make the world work better.

[Attached: normal probability paper]
CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
Solutions, Final Exam 2003

1. Short answer:
True, No (for a particular sample, different things can happen),
No, Yes, σ/sqrt(n), sqrt[2σ⁴/(n-1)].

2. Probability: When the outcome of an experiment (whether it be a measurement, an act of
nature, or a game of chance) is uncertain, we need a language to describe the
likelihood of different outcomes, or the uncertainty associated with the outcome of
the experiment. Probability addresses how, from the rules by which points are
selected in a sample space, we can compute the likelihood of different sets of
possible outcomes.
A random variable is a numerical measure of the outcome whose probability
distribution (describing the likelihood that different values occur) can be determined. It
is a real-valued function which assigns a real number to every point in the sample
space. Thus it links in a valuable way the probability of different events to
numerical values addressing issues of concern in engineering, and thus allows us
to compute the probability that various numerically described outcomes might
occur.

3. Gumbel: 7 pts maximum; points lost if answer not provided.
Given μ = 2.3 and σ = 0.8, need the Gumbel quantile: x_p = u - ln[-ln(p)]/α.
First σ² = π²/(6α²) => α = π/(sqrt(6) × 0.8) = 1.603   2 pts
Second μ = u + 0.5772/α => u = μ - 0.5772/α = 1.940   2 pts
Then for p = 0.995 get x_0.995 = u - ln[-ln(p)]/α = 5.24   3 pts
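
The three steps above can be checked numerically. The following is a minimal Python sketch, assuming the method-of-moments relations quoted in this solution (location u, scale parameter α); the function name `gumbel_quantile` is illustrative and does not come from the course notes.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant (0.5772 in the solution)

def gumbel_quantile(mean, std, p):
    """Fit a Gumbel (largest-value) distribution by moments and return
    (alpha, u, x_p), where x_p is the value with non-exceedance probability p."""
    alpha = math.pi / (math.sqrt(6.0) * std)   # from sigma^2 = pi^2 / (6 alpha^2)
    u = mean - EULER_GAMMA / alpha             # from mu = u + 0.5772 / alpha
    x_p = u - math.log(-math.log(p)) / alpha   # inverse of F(x) = exp(-exp(-alpha (x - u)))
    return alpha, u, x_p

# Annual-maximum 6-hour rainfall: mean 2.3 in, standard deviation 0.8 in,
# exceedance probability 0.005, so non-exceedance probability p = 0.995.
alpha, u, x995 = gumbel_quantile(2.3, 0.8, 0.995)
print(f"alpha = {alpha:.3f}, u = {u:.3f}, x_0.995 = {x995:.2f} inches")
```
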
Gumbel appropriate because the annual maximum is the largest of many values, whose distribution is unbounded above.   2 pts

4. Lognormal - This is just like the lecture example, and other examples.
First get log-space moments. Consider the weight W of reagent, and the volume V
of water into which it is mixed, where as a result C = W/V.
σ²_lnW = ln[1 + (σ_W/μ_W)²] = ln[1 + (0.12/4.00)²] = (0.030)²;
μ_lnW = ln(μ_W) - 0.5 σ²_lnW => μ_lnW = 1.386   5 pts
For V it is easier because μ_lnV = ln(median of V) = ln(2) = 0.693
σ²_lnV = ln[1 + (0.05)²] = (0.050)²;   5 pts
ln C = ln W - ln V; independence => variances add.
ln C ~ Normal(μ_lnW - μ_lnV, σ²_lnW + σ²_lnV) = N[0.693, (0.058)²]   1 pt
μ_C = exp(μ_lnC + 0.5 σ²_lnC) = 2.002
σ²_C = [μ_C]² {exp(σ²_lnC) - 1} = 0.0136; σ_C = 0.117; CV = 0.058   4 pts
When one multiplies independent lognormal random variables, the result is always
a lognormal distribution. This is a nice model for products and ratios.
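
The log-space bookkeeping in Problem 4 is easy to get wrong by hand, so here is a short Python sketch of the same calculation. It assumes only the formulas written in the solution above; the helper name `lognormal_log_moments` and the variable names are illustrative.

```python
import math

def lognormal_log_moments(mean, std):
    """Log-space mean and variance of a lognormal variable,
    computed from its real-space mean and standard deviation."""
    var_ln = math.log(1.0 + (std / mean) ** 2)
    mu_ln = math.log(mean) - 0.5 * var_ln
    return mu_ln, var_ln

# Weight W: mean 4.00 mg, standard deviation 0.12 mg
mu_lnW, var_lnW = lognormal_log_moments(4.00, 0.12)

# Volume V: median 2.00 L (so mu_lnV = ln 2), coefficient of variation 5%
mu_lnV = math.log(2.00)
var_lnV = math.log(1.0 + 0.05 ** 2)

# C = W / V with W and V independent: log-means subtract, log-variances add
mu_lnC = mu_lnW - mu_lnV
var_lnC = var_lnW + var_lnV

# Back to real space
mean_C = math.exp(mu_lnC + 0.5 * var_lnC)
std_C = mean_C * math.sqrt(math.exp(var_lnC) - 1.0)
cv_C = std_C / mean_C

print(f"mean = {mean_C:.3f} mg/l, sd = {std_C:.3f} mg/l, CV = {cv_C:.3f}")
```
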

5. Poisson process.
a) Would be Poisson if arrivals occur independently over time with a constant rate.
Would not be Poisson if the arrival rate changes over time, or if attackers coordinate their attack for some period of time.   5 pts
b) For Poisson process, λ = 12 per hour. With t = 10 min = 1/6 hr = 0.1667 hr, ν = λt = 12(1/6) = 2, so Pr[K = 2 | ν = 2] = ν² exp(-ν)/2! = 0.27   3 pts
c) Number in a week has a Poisson distribution: μ = σ² = ν = λt = 12(168) = 2016   3 pts
d) Gamma dist. (not required). k = 1000; E(T_k) = 1000/λ = 83.33 hrs; Var(T_k) = 1000/λ² = 6.94 hrs² -> SD = 2.64 hrs   3 pts

6. Two-Sample Nonparametric and t Tests
a) H0: M_women = M_men. No difference in medians.   2 pts
Ha: M_women > M_men. Women do better; men stochastically smaller.
b) Use Wilcoxon-Mann-Whitney Rank Sum test. W = sum of ranks of women.
Expect them to have large ranks. E[W] = 323; Var[W] = 1076 = (32.8)²
Reject H0 if Z_test > 1.645 or equivalently W ≥ 323 + 32.8(1.645) = 377   4 pts
wherein Z_test = [W - E[W]]/SD[W]; E[W] = 323, SD[W] = 32.8
c) Observe W = 377. P-value = 5%; barely REJECT H0.   2 pts
Those women are conscientious.
d) Lots of computation. See formulas in notes.
Rejection region T < -t_{0.05,34} = -1.691
Find t = -1.52; d.o.f. = 33.88 ≈ 34. P-value 7% (one-sided)   8 pts
e) Why should this data be normal? Look at the terrible scores some students get: 9 points and even the 40s, while many others are in the 80s, 90s and approaching 100. Data
clearly not normal. Use a nonparametric W test and avoid unneeded and
incorrect assumption of normality. T test would not have the advertised type I
error. Moreover, W test likely to be MORE powerful than a t test.   4 pts

7. Paired Data => One-Sample Tests
a) Different answers for each part
b) Sign test uses counts only. Consider S, the number of negative signs.   6 pts
H0: p(+) = p(-) = 0.5; Ha: p(-) < 0.5
From tables for α ≈ 5%, reject if S ≤ 2 [actual α = 5.5%].
Observe s = 2, p-value = 5.5%
Were S ~ Bin(10, p = 0.2), β = Pr[S ≥ 3] = 1 - 0.678 = 32%
c) Use Wilcoxon Signed Rank Test   6 pts
Use test on differences D = (Observed - Estimated).
Sum ranks of the positive differences.
H0: median D = 0 or H0: F_TValue = F_Meas. Both imply no difference.
Ha: median D > 0 or Ha: F_TValue > F_Meas, where big values have large ranks.
Reject H0 if S+ ≥ 44
Observe S+ = 47; Pr[S+ ≥ 47] = 0.024 => P-value = 0.024. REJECT
d) For a t test, assume data is normal and consider   5 pts
H0: μ_D = 0; Ha: μ_D > 0; Reject if T > t_{0.05,9} = 1.833
t = sqrt(10) (90.4)/175.9 = 1.626 with df = 9. p-value = 0.0692 = 7% from Excel. Must accept H0.
That is interesting. W test sees a very significant result. T test does not.
The data is not normal so t test is not very efficient, if it is even correct.
e) For d = μ_D/σ_D = 100/100 = 1 and α = 5%, with a 1-sided t, to get β < 2% need df = 14, hence need n = 15 (using Table A-17)   2 pts
f) For α = 5%, n = 10 and a 1-sided t, to ensure β ≈ 0% need d = 1.4
(find from Table A.17) so that μ_D = 1.4 σ_D = 140   2 pts
g) Use the nonparametric signed-rank test; clearly the data are not normal, which rules out a small-sample t test: it is neither valid (type I error
will not have anticipated value) nor efficient; sign test is not very powerful:
it ignores magnitude of differences.   4 pts

8. Confidence Intervals
t_{0.025,9} = 2.262 so a 95% CI is 90.4 ± 2.262 (175.9)/sqrt(10): -35.4 to 216.2   5 pts
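
The sign test, signed-rank test, t statistic, and confidence interval in Problems 7 and 8 can all be recomputed from the ten paired differences. The sketch below is hedged in two ways. First, the differences are taken from the sorted x_(i) column of the Problem 9 table, with the two smallest values assumed to be negative (that choice reproduces s = 2, S+ = 47, the mean 90.4, and the standard deviation 175.9 quoted in this solution). Second, the critical value 2.262 is copied from the solution rather than computed. Variable names are illustrative.

```python
import math
import statistics

# Paired differences D. ASSUMPTION: signs of the two smallest values are
# restored as negative; the scanned table does not show them.
diffs = [-102.8, -6.2, 18.0, 20.2, 20.3, 23.1, 27.3, 169.2, 212.1, 522.9]
n = len(diffs)

# Sign test: S = number of negative signs, S ~ Bin(n, 0.5) under H0.
s_neg = sum(1 for d in diffs if d < 0)
p_sign = sum(math.comb(n, k) for k in range(s_neg + 1)) / 2 ** n

# Type II error of the rule "reject if S <= 2" when Pr(negative sign) = 0.2.
beta_sign = 1.0 - sum(math.comb(n, k) * 0.2 ** k * 0.8 ** (n - k) for k in range(3))

# Wilcoxon signed rank: rank |d| (no ties here), then sum ranks of positive d.
order = sorted(range(n), key=lambda i: abs(diffs[i]))
rank = [0] * n
for r, i in enumerate(order, start=1):
    rank[i] = r
s_plus = sum(rank[i] for i in range(n) if diffs[i] > 0)

# Exact null distribution: each rank 1..n is "positive" with probability 1/2.
count = 0
for mask in range(2 ** n):
    total = sum(r for r in range(1, n + 1) if (mask >> (r - 1)) & 1)
    if total >= s_plus:
        count += 1
p_signed_rank = count / 2 ** n

# One-sample t statistic and 95% confidence interval for the mean difference.
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)   # sample standard deviation (n - 1 divisor)
se_mean = sd_d / math.sqrt(n)
t_stat = mean_d / se_mean
T_025_9 = 2.262                  # t_{0.025,9}, value quoted in the solution
ci = (mean_d - T_025_9 * se_mean, mean_d + T_025_9 * se_mean)

print(f"sign test: s = {s_neg}, p = {p_sign:.3f}, beta = {beta_sign:.3f}")
print(f"signed rank: S+ = {s_plus}, p = {p_signed_rank:.3f}")
print(f"t = {t_stat:.3f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

The exact signed-rank p-value is found by brute-force enumeration of all 2^10 sign patterns, which is practical only for small n; tables or a normal approximation would be used for larger samples.
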
BUT I would not BELIEVE this CI: data is not normal. (Or CI is: -216 to +35)

9. Probability plot: here is the data, using p_i = (i - 3/8)/(n + 1/4)
 i    p_i     z(p_i)    x_(i)
 1   0.061   -1.547   -102.8
 2   0.159   -1.000     -6.2
 3   0.256   -0.655     18.0
 4   0.354   -0.375     20.2
 5   0.451   -0.123     20.3
 6   0.549    0.123     23.1
 7   0.646    0.375     27.3
 8   0.744    0.655    169.2
 9   0.841    1.000    212.1
10   0.939    1.547    522.9
Correlation is 0.8656; this is less than the critical point for 1%: P-value < 1%!
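
The table and the probability-plot correlation can be regenerated with a few lines of Python. This sketch assumes the plotting position p_i = (i - 3/8)/(n + 1/4) stated above and, as an assumption not visible in the scan, that the two smallest sorted observations are negative. It uses `statistics.NormalDist().inv_cdf` from the standard library for the normal scores; the helper `pearson_r` is written out by hand for illustration.

```python
import math
from statistics import NormalDist

# Sorted observations x_(i). ASSUMPTION: the first two values are negative.
x_sorted = [-102.8, -6.2, 18.0, 20.2, 20.3, 23.1, 27.3, 169.2, 212.1, 522.9]
n = len(x_sorted)

# Plotting positions and the corresponding standard normal scores
p = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]
z = [NormalDist().inv_cdf(pi) for pi in p]

def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

r = pearson_r(z, x_sorted)

for i, (pi, zi, xi) in enumerate(zip(p, z, x_sorted), start=1):
    print(f"{i:2d}  {pi:.3f}  {zi:7.3f}  {xi:7.1f}")
print(f"probability-plot correlation = {r:.4f}")
```
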
[Figure: Probability Plot, Watchum Data Set]
So reject normality! This data is not normal.

10. Regression
a) Least squares estimators: a = 573; b = 0.214   8 pts
b) s_e² = 115,600 = (340)²   4 pts
c) R² = 1 - (n-k) s_e²/[(n-1) s_y²] = 0.179   4 pts
d) Adjusted R² = 1 - s_e²/s_y² = 0.146; has a smaller value because it corrects for the number of coefficients estimated. Good for multivariate regression.   4 pts
e) Compute St.Dev.(b) = 0.0917   4 pts
f) Reject H0: β = 0 versus Ha: β > 0 if T > t_{0.01,25} = 2.485
I used a one-sided test because almost always snowpack increases with altitude. But others may use a two-sided test if they did not have this prior belief. Observe t = 2.337. p-value close to 1.4% (> 1%) so accept H0.   7 pts
For two-sided test, t_{0.005,25} = 2.787; p-value for 2.337 is 2.8%
g) For H = 3000 m the mean value of D = a + b(3000) = 1,216 mm; a 95% prediction interval for a value of D to be measured at H = 3000
is 500.5 to 1,930.1, using t_{0.025,25} = 2.060 (w/ SE from the mean of 347.2) in the equation:
a + bH ± t_{0.025,25} s_e sqrt[1 + 1/n + (H - H̄)²/Σ(H_i - H̄)²]   7 pts
NOTE: 95% interval for the mean when H = 3000 is 1,073 to 1,358, which is smaller!
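
All of the regression answers follow from the summary statistics given in the exam, so they can be reproduced without the raw data. The following Python sketch assumes the simple linear regression formulas used in this solution and takes t_{0.025,25} = 2.060 from the solution rather than computing it; variable names are illustrative. Because the solution rounds intermediate values (for example s_e = 340), the results agree with it only to about three significant figures.

```python
import math

# Summary statistics from the exam (Problem 10)
n = 27
D_bar, s_D = 1164.0, 368.0      # snowpack depth, mm
H_bar, s_H = 2761.0, 727.8      # altitude, m
S_HD = 2_951_520.0              # sum of (H_i - H_bar)(D_i - D_bar)

S_HH = (n - 1) * s_H ** 2       # sum of squares of H about its mean
S_DD = (n - 1) * s_D ** 2       # sum of squares of D about its mean

# a) Least-squares estimates
b = S_HD / S_HH
a = D_bar - b * H_bar

# b) Unbiased estimate of the error variance (n - 2 degrees of freedom)
sse = S_DD - b * S_HD
s_e2 = sse / (n - 2)
s_e = math.sqrt(s_e2)

# c), d) R-squared and adjusted R-squared
r2 = 1.0 - sse / S_DD
adj_r2 = 1.0 - s_e2 / s_D ** 2

# e), f) Standard error of b and the t statistic for H0: beta = 0
se_b = s_e / math.sqrt(S_HH)
t_b = b / se_b

# g) 95% prediction interval at H = 3000 m
T_025_25 = 2.060                # t_{0.025,25}, value quoted in the solution
H_new = 3000.0
d_hat = a + b * H_new
se_pred = s_e * math.sqrt(1.0 + 1.0 / n + (H_new - H_bar) ** 2 / S_HH)
pred_lo = d_hat - T_025_25 * se_pred
pred_hi = d_hat + T_025_25 * se_pred

print(f"a = {a:.0f}, b = {b:.3f}, s_e = {s_e:.0f}")
print(f"R2 = {r2:.3f}, adjusted R2 = {adj_r2:.3f}")
print(f"SD(b) = {se_b:.4f}, t = {t_b:.3f}")
print(f"prediction at 3000 m: {d_hat:.0f} mm, 95% PI = ({pred_lo:.0f}, {pred_hi:.0f})")
```
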
 Fall '08
 Stedinger
 Normal Distribution, pts, Gumbel
