CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
1999 Final Exam
December 9, 1999 [revised 12/01]

Exam is open notes and open book.
It lasts 150 minutes and there are 150 points. SHOW WORK!

1. (18 points) E-mails arrive at Prof. Stedinger's mailbox like a Poisson process
with, on average, 12 e-mails arriving during the 10 am to 4 pm period.
a) What are the mean and standard deviation of the number of e-mails he
gets between 10 am and noon?
b) What is the probability an afternoon (noon to 4 pm) passes without a
message arriving?
c) What are the mean and standard deviation of the waiting time until the
next 5 e-mails arrive?
d) What is the probability Stedinger needs to wait more than an hour for
two e-mails to arrive?
e) Why is a Poisson process model appropriate for this problem?
f) Actually, Stedinger's Eudora program queries the Cornell post office every
ten minutes and downloads any new messages. Does this affect the validity of
the Poisson process model for arrivals at Stedinger's computer? Why or why not?

2. (15 points) A quarry provides decorative stone cubes (almost cubes) for use in
gardens. The three dimensions of the cube, l, w and h, are each lognormally
distributed with mean = 1 foot and standard deviation = 0.02 feet. What are the
mean and standard deviation of the volume of the cubes if the variations among
the values of l, w and h are independent? [Note V = lwh.]
Is the independence assumption important (explain why or why not)?

3. (15 points) For a design you need to develop a frequency distribution for maximum
annual 24-h rainfall depths. From regional maps you determine that the median
rainfall is 1.9 inches, and the rainfall depth exceeded with a 10% probability is 2.7
inches. Using a Gumbel distribution, what rainfall depth is exceeded with a
probability of just 0.5% (and thus might be called the 200-year rainfall)?

4. (5 points) Elizabeth is studying Cornell Lake Source Cooling project impacts.
Using measurements for 8 different years, she computed that a 95% confidence
interval for the true average September lake temperature was 52.8 to 57.1 °F. Her
boss has never taken a course in statistics. How would you explain to him the
meaning of this interval? What does confidence mean in this situation? Why
do statisticians compute confidence intervals?

5. (15 points) Yannis needs to evaluate how much CEE graduates know about
environmental health-risk issues. To do this he wants to compute the average
score of CEE graduates on a short questionnaire. However, 20% of CEE graduates
took Prof. Stedinger's wonderful spring-semester course: CEE 597 Risk Analysis
and Management. They should be more informed and thus get higher scores on
the questionnaire. His estimate of the mean score of CEE graduates will be
    S = 0.2 X̄ + 0.8 Ȳ
where X̄ is the mean of 10 randomly selected graduates who took CEE 597 and
Ȳ is the mean of 10 randomly selected graduates who did not take CEE 597.
Assume that for individual test scores: X_i ~ N(μ_X, σ_X²) and Y_i ~ N(μ_Y, σ_Y²).
<Students failed to compute explicitly & correctly the true mean score for a random student. - JRS>
a) How large is the bias in S as an estimator of the true mean score of all graduates?
b) What are the variance and Mean Square Error of S?
c) Does S reflect any particular sampling strategy?
What might be the advantages/disadvantages of that strategy?

6. (17 points) Students in the CEE materials course need to develop the strongest
weight concrete they can for a special application. Two teams developed competing
formulas, and each produced and tested their own 7 samples to obtain strengths:

Sample            Team Green   Team Blue
1                 5700         4850
2                 6300         4900
3                 6100         4300
4                 5300         4200
5                 6250         3400
6                 3650         4500
7                 4950         5200
Average           5464.3       4478.6
Stan. Deviation   945.0        592.9

Is there a difference between the teams? Use an appropriate Wilcoxon test:
(a) What are the appropriate null and alternative hypotheses?
(b) What is the rejection region for a Wilcoxon test with a type I error of 5%?
(c) What is the p-value for this data set? What do you conclude?
(d) What is the p-value for the corresponding 2-sample t test?
(e) For a t test with α = 5%, n_G = n_B = 15, when μ_G − μ_B = 1000 and σ_G = σ_B = 600,
what are the type I and type II errors? [This problem was changed in 12/2001.]

7. (30 pts) The Society of Women Engineers decided to help its members improve
their study skills by sponsoring a speed reading course. The course claims that
participants on average will increase their reading speed by at least 75 words per
minute (wpm). Here are the before and after results for the 16 students.

 #   Name      Before   After    Gain     Gain - 75
 1   Sally     644      748      104      29
 2   Jill      448      535      87       12
 3   Linda     363      395      32       -43
 4   Rachael   448      531      83       8
 5   Phyllis   385      453      68       -7
 6   Carolyn   665      757      92       17
 7   Carol     504      588      84       9
 8   Sissy     399      476      77       2
 9   Bonnie    674      768      94       19
10   Peggy     553      644      91       16
11   Debbie    660      752      92       17
12   Wenbi     532      621      89       14
13   Barb      448      524      76       1
14   Susan     476      556      80       5
15   Gina      685      747      62       -13
16   My        511      as       21       is
mean                514.2    596.2    82.00    7.00
standard deviation  109.5    122.9    17.05    17.05

a) What are the appropriate hypotheses for a nonparametric Wilcoxon test?
b) For α = 5%, what is the rejection region for the Wilcoxon test?
c) What is the p-value for this sample using that test?
d) What are the appropriate hypotheses for a Sign test?
e) What is the p-value for this sample using a sign test?
f) What are the appropriate hypotheses for a t test?
g) What p-value do you obtain?
h) With α = 5%, if the true mean of the differences were 100 wpm and the true
standard deviation were 25 wpm, what is the type II error for a t test?
i) What is a 99% confidence interval for the true reading speed of the women
BEFORE they took the reading course? (Assume data is normally distributed.)
j) Is it more appropriate to use a sign, t, or Wilcoxon test for this data set? Why?

8. (35 pts) A transportation research center is studying the impact of traffic
volume and axle weight on road wear. Let D_i be their index of the cumulated
loading due to traffic, and W_i the observed road wear. Values of D and W were
collected for several test tracks with a new asphalt formulation yielding:

D_i = 250 325 410 ... 1840 k-mile-tons
W_i = 0.41 0.52 0.64 ... 13.4 mm
W̄ = 6.80     D̄ = 630
s_W = 3.10    s_D = 320
Σ (D_i − D̄)(W_i − W̄) = 11,240 k-mile-tons·mm
n = 18

a) What are the least-squares estimates of α and β for the linear model
W = α + β D + ε
where ε represents the unexplained error, or lack of fit, for this model?
b) Please compute an unbiased estimate of the variance of the errors ε.
c) How much of the observed variation in W is explained by your model?
d) If D = 0, then one expects that W = 0.
Is the data consistent with that hypothesis? (explain)
<Almost no one correctly cast this as a hypothesis test as was illustrated
in lecture and was required in the project. So most people lost points! - JRS>
e) A new observation arrives with D = 1020 k-mile-tons, but the value of W is
missing. Please construct an interval which with confidence/probability of
95% contains the missing W value corresponding to D = 1020.
f) The 18 residuals from the regression analysis were plotted against
Φ⁻¹[(i − 3/8)/(n + 0.25)], where i is the rank of each residual and Φ is the cdf of
the standard normal distribution. A value of 0.891 was obtained for the
correlation coefficient r between the two sets of values. Explain what this
computation suggests.

I hope your other tests go well.
Have a happy holiday.
2-sample t solutions, Finals 1998-99
(Corrected 12/2002)

CASE III: σx and σy unknown, small samples (Devore § 9.2)

Final 1999. Problem #6. Two-sided t test. 2-sample treated as if unequal variances.

Sample mean             5464     4479
Sample standard dev     945      594
n                       7        7
SE delta(means)         421.9
T                       2.34     p-value 0.021 (one side)
Df                      10.10    DF rounded 10
alpha                   0.05     t 2.228

True mean               5000     4000
True standard dev       600      600
n                       7        7
True SD delta(means)    320.7

Based upon normal approximation: β = Φ[ z(alpha/2) − sqrt(df+1)*|d| ] = 0.12
For power calculation need d = 0.94; use with df = 10. Using tables in book get β = about 20%.
Finding required sample size would take some trial and error effort. Too hard for final.
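The entries in the 1999 Problem #6 block can be checked with a few lines of code. The sketch below is only an illustration, not part of the original solution sheet: the function names are made up here, the inputs are the rounded values printed above, and the type II error uses the same normal approximation, written as Φ[z − |Δ|/SD(difference of means)], which equals Φ[z − sqrt(df+1)*|d|] when d is defined as Δ divided by sqrt(df+1) times that SD.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance two-sample t: returns (t, standard error, approximate df)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, se, df

def normal_cdf(z):
    """Standard normal cdf, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def beta_normal_approx(delta, sd1, n1, sd2, n2, z_crit=1.96):
    """Normal-approximation type II error for a true difference in means of delta."""
    sd_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return normal_cdf(z_crit - abs(delta) / sd_diff), sd_diff

# Sample statistics (rounded) from the 1999 Problem #6 block
t, se, df = welch_t(5464, 945, 7, 4479, 594, 7)
# Assumed true values from the same block: difference 1000, both SDs 600
beta, sd_diff = beta_normal_approx(1000, 600, 7, 600, 7)

print(f"t = {t:.2f}, SE = {se:.1f}, df = {df:.2f}")
print(f"SD of difference = {sd_diff:.1f}, beta = {beta:.2f}")
```

Because the inputs are the rounded means and standard deviations, the t statistic can differ from the tabulated 2.34 in the last digit.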
Final 1998. Problem #9. Two-sided t test. 2-sample treated as if unequal variances.

Sample mean             25.23    44.28
Sample standard dev     18.56    28.68
n                       8        8
SE delta(means)         12.08
T                       1.58     p-value [illegible]
Df                      11.99    DF rounded 12
alpha                   0.05     t 1.782
ACCEPT Ho

True mean               20       45
True standard dev       22       22
n                       8        8
True SD delta(means)    11

Based upon normal approximation: β = Φ[ z(alpha/2) − sqrt(df+1)*|d| ] = 0.27
For power calculation need d = 0.63; use with df = 12. Using tables in book get β = about 30%.
Solutions Final Exam 1999 (revised '01)

1. Poisson process. Compute λ = 12/6 = 2 arrivals per hour.
a) For Poisson distribution, mean = μ = ν = λt = 2*2 = 4; SD = sqrt(ν) = 2. 4 pts
b) λ = 2; t = 4; Pr[K = 0 | ν = 8] = exp[−8] = 0.000335. 2 pts
c) Gamma distribution (not required). k = 5; λ = 2. E(T5) = 5/λ = 2.5 hrs; Var(T5) = 5/λ² = 5/4 = 1.25 → SD = 1.12 hrs. 4 pts
d) Use Gamma OR Poisson dist. Poisson: Pr[Wait ≥ 1 hr for 2 messages] = Pr[K = 0 or 1 | ν = 2] = e^(−ν) + ν e^(−ν) = (1+2)*0.135 = 0.405. 3 pts
e) Mail arrives singly (one at a time), arrival rate is constant, numbers in non-overlapping intervals are independent (for the most part). 3 pts
f) Trouble because when the computer downloads it is likely to get more than one at
once. Thus mail no longer arrives singly at Stedinger's computer. 2 pts

2. Lognormal: do not need all of the log-space moments, but can calculate them.
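For problem 2, the log-space moments and the resulting mean and standard deviation of the volume can be checked numerically. The sketch below is a minimal illustration using the standard lognormal moment relations; the function and variable names are assumptions made for this example, and independence of the three dimensions is assumed when the log-space variances are added.

```python
import math

def lognormal_log_moments(mean, sd):
    """Log-space (mean, variance) of a lognormal variable with the given real-space mean and SD."""
    var_ln = math.log(1.0 + (sd / mean) ** 2)
    mu_ln = math.log(mean) - 0.5 * var_ln
    return mu_ln, var_ln

def lognormal_real_moments(mu_ln, var_ln):
    """Real-space (mean, SD) of a lognormal variable from its log-space mean and variance."""
    mean = math.exp(mu_ln + 0.5 * var_ln)
    var = mean ** 2 * (math.exp(var_ln) - 1.0)
    return mean, math.sqrt(var)

# Each side: mean 1 ft, SD 0.02 ft
mu_ln_side, var_ln_side = lognormal_log_moments(1.0, 0.02)

# ln V = ln L + ln W + ln H; with independent sides, means and variances add
mu_ln_v = 3 * mu_ln_side
var_ln_v = 3 * var_ln_side

mean_v, sd_v = lognormal_real_moments(mu_ln_v, var_ln_v)
print(f"ln V ~ N({mu_ln_v:.4f}, {var_ln_v:.4f})")
print(f"E[V] = {mean_v:.4f} cubic ft, SD[V] = {sd_v:.4f} cubic ft")
```

If the sides were correlated, `var_ln_v` would also need the covariance terms, which is the point made in the discussion of the independence assumption.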
σ²_lnL = ln[1 + (σ_L/μ_L)²] ≈ (0.02)²; μ_L = 1 → μ_lnL = ln(1) − 0.5*σ²_lnL = −0.0002. 6 pts
ln V = ln L + ln W + ln H → Independence ⇒ variances add.
ln V ~ Normal(sum, 3σ²_lnL) = N[−0.0006, 0.0012]. 3 pts
μ_V = exp(μ_lnV + 0.5 σ²_lnV) = 1
σ²_V = [μ_V]² {exp(σ²_lnV) − 1} = 0.0012; SD = 0.0346. 3 pts

When one multiplies independent lognormal random variables, the result is always a lognormal
distribution. Moreover, if ln L, ln W, and ln H have a joint normal distribution with correlation
among those values, ln V will still be normal, and V will be lognormal. In that case, however, the
variance of ln V would also depend upon the covariances. THUS the independence assumption is very
important. Positive correlation among L, W and H could greatly increase the variance of V. 3 pts

3. Gumbel distribution: need to manipulate Gumbel CDF to compute values
of parameters: x_p = u − ln[−ln(p)]/α. Have 2 equations, for p = 0.5 and 0.90. 5 pts
Yields α = 2.355; u = 1.744. 5 pts
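The two quantile equations are linear in u and 1/α, so they can be solved in closed form. The sketch below is an illustration only (the function names are assumptions for this example); it fits α and u from the median and the 10% exceedance depth given in the problem, then evaluates the quantile formula at p = 0.995.

```python
import math

def gumbel_reduced(p):
    """Reduced variate y_p = -ln(-ln p) for non-exceedance probability p."""
    return -math.log(-math.log(p))

def fit_gumbel_two_quantiles(p1, x1, p2, x2):
    """Solve x_p = u + y_p / alpha for (alpha, u) given two (p, x_p) pairs."""
    y1, y2 = gumbel_reduced(p1), gumbel_reduced(p2)
    alpha = (y2 - y1) / (x2 - x1)
    u = x1 - y1 / alpha
    return alpha, u

def gumbel_quantile(p, alpha, u):
    """Depth with non-exceedance probability p."""
    return u + gumbel_reduced(p) / alpha

# Median = 1.9 in (p = 0.5); depth exceeded with 10% probability = 2.7 in (p = 0.9)
alpha, u = fit_gumbel_two_quantiles(0.5, 1.9, 0.9, 2.7)
x200 = gumbel_quantile(0.995, alpha, u)
print(f"alpha = {alpha:.3f}, u = {u:.3f}, 200-year depth = {x200:.2f} in")
```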
Then for p = 0.995 get x_0.995 = 4.0 inches in 24 hours. 5 pts
[Gumbel is a reasonable choice because we are considering the LARGEST rainfall event in the year, and thus in
concept the largest of many storms that occurred.]

4. Confidence Intervals. Given the variability in the data we cannot precisely
determine the true mean value with such a small sample. Therefore statisticians
report confidence intervals to describe possible values of the true mean with a
given confidence. The intervals describe the precision with which the true mean is
known. Here a 95% confidence means that when intervals are
constructed in this fashion, on average, 95% of those intervals (i.e. most) will
ACTUALLY contain the value of the true mean. This particular interval may or
may not contain the mean; we do not know.
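The long-run meaning of "95% confidence" can be illustrated by simulation. The sketch below is a hypothetical experiment, not part of the original solution: the true mean (55 °F) and standard deviation (2.5 °F) are invented for illustration, samples of 8 years are drawn from a normal distribution, and the standard t interval with the tabulated critical value t(0.025, 7) = 2.365 is built each time.

```python
import random
import statistics

T_CRIT_95_DF7 = 2.365  # two-sided 95% Student-t critical value for 7 degrees of freedom

def coverage(true_mean=55.0, true_sd=2.5, n=8, trials=20000, seed=1):
    """Fraction of simulated 95% t intervals that contain the true mean.

    true_mean and true_sd are made-up values for illustration only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = statistics.fmean(sample)
        half_width = T_CRIT_95_DF7 * statistics.stdev(sample) / n ** 0.5
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / trials

frac = coverage()
print(f"fraction of intervals containing the true mean: {frac:.3f}")
```

Any single interval either contains the true mean or it does not; the 95% describes the procedure over many repetitions, which is what the printed fraction approximates.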
5. Sampling/Estimators
(a) Let R be the score of a randomly selected student. μ_R = E{R} = 0.2 μ_X + 0.8 μ_Y.
The estimator of μ_R is S = 0.2 X̄ + 0.8 Ȳ.
E{S} = 0.2 μ_X + 0.8 μ_Y, because X̄ and Ȳ are unbiased estimators of their means. Thus S is an unbiased
estimator of μ_R. Or, the Bias of S = E{S} − μ_R = 0. 5 pts
(b) When Bias = 0, MSE(S) = Var(S) = Var[0.2 X̄ + 0.8 Ȳ] = (0.2)² Var(X̄) + (0.8)² Var(Ȳ) = (0.2)² Var(X)/n + (0.8)² Var(Y)/n for n = 10. 6 pts
(c) This is an example of stratified sampling. 2 pts
If done properly it should result in a smaller variance than would be obtained by
selecting 2n = 20 students at random. However, if done poorly it can actually
increase the sampling variance of the estimator, or even introduce bias if the
specified proportions 0.2 and 0.8 are wrong! 2 pts

6. Two-Sample Tests
a) Are the two teams producing samples with different strengths?
H0: μ_G = μ_B No difference 1 pts
Ha: μ_G ≠ μ_B Need two-sided test.
b) Use Wilcoxon-Mann-Whitney rank-sum test. W = sum of ranks of green (or blue). For α = 5%, reject if W ≥ 68 or W ≤ 7(7+7+1) − 68 = 37. 4 pts
c) Observe W = 2+8+10+11+12+13+14 = 70. (Data set has 2 low outliers.)
p-value about 2(1.3%) = 2.6%. Call it 3%. Conclude the teams are different. 4 pts
d) Using two-sample t test find t = 2.34 with deg of freedom = 10.1 ~ 10. 4 pts
2-sided p-value = 4.2%. Conclude there is a difference at the 5% level.
e) With true σ's find df = 28; d = (μ_G − μ_B)/sqrt[(df+1)(σ_G²/15 + σ_B²/15)] = 0.86.
For 2-sided & α = 5% → type I error = α = 5%; find type II error = β ~ 35%. 4 pts

7. Paired Data ⇒ One-Sample Tests
a) Use Wilcoxon Signed Rank Test on D = A − B − 75, subtracting the minimum claimed gain.
H0: median D = 0
Ha: median D > 0 (want to show that improvement ≥ 75) 2 pts
b) Sum ranks of the positive differences, where big values have large ranks.
Reject Ho if S+ ≥ 100 (from tables). 2 pts
c) Find S+ = 108. p-value = Pr[S+ ≥ 108] = 2%. 4 pts
Sign test uses counts only.
d) Ho: p(+) = p(−) = 0.5; Ha: p(+) > 0.5. 1 pt
e) Observe 3 negative counts (13 positive counts).
p-value = Pr[K ≤ 3 | p = 0.5] = 1.1%. WOW! 4 pts
(Computed binomial probabilities for p = 0.5, n = 16 and K = 0, 1, 2, 3.
Gave credit if people used tables for n = 15, but hard to get right result.)
Ignores that BIG difference of −43. Probably not a good analysis of data.
f) For a t test, assume data is normal and consider D = A − B − 75.
Ho: μ_D = 0; Ha: μ_D > 0 (want to show that improvement ≥ 75) 1 pt
g) t = sqrt(16) 7/17.05 = 1.67 with df = 15. One-sided p-value = 5.8%. 4 pts
h) If true gain is 100, then excess over 75 is 25 wpm: d = (100 − 75)/25 = 1. For df = 15, find that β ~ 1% (getting close to zero but not there). 4 pts
i) 99% confidence interval: t_{0.005,15} = 2.947 (1 pt): 99% CI = xbar ± s t_{0.005,15}/sqrt(n) = 434 to 595 wpm. 4 pts
j) Differences do not look normal. Have a big negative value: −43. Therefore the t
test is suspect. Use the Wilcoxon test; it does not assume normality but is generally as
powerful as the t test, and more powerful than the sign test because the sign test does not use the
ranks of the observations. In THIS case Wilcoxon and Sign tests both give p-value ~ 1%. 4 pts
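Two of the numbers in solution 7 are easy to reproduce from the summary values alone: the sign-test p-value in (e) is a binomial tail, and the interval in (i) is the usual t interval. The sketch below is illustrative; the function names are assumptions, and the critical value 2.947 is the tabulated t(0.005, 15) quoted in the solution.

```python
from math import comb, sqrt

def binomial_lower_tail(k, n, p=0.5):
    """Pr[K <= k] for K ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))

def t_interval(xbar, s, n, t_crit):
    """Two-sided t confidence interval for a mean, given the tabulated critical value."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# (e) Sign test: 3 negative values of D = A - B - 75 among 16 students
p_sign = binomial_lower_tail(3, 16)

# (i) 99% CI for mean BEFORE speed: xbar = 514.2, s = 109.5, n = 16
lo, hi = t_interval(514.2, 109.5, 16, 2.947)

print(f"sign-test p-value = {p_sign:.4f}")
print(f"99% CI = {lo:.0f} to {hi:.0f} wpm")
```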
8. Regression
a) Least squares estimators: a = 2.732; b = 0.0065. 6 pts
b) s_e² = 5.675 = (2.382)²; s_e = 2.382. 4 pts
c) Need R² = 1 − (Residual sum-of-squares)/(Total sum-of-squares) = 1 − (n−k) s_e² / [(n−1) s_y²] = 0.44. 5 pts
(Or square correlation coefficient; that works too.)
d) Need St.Dev.(a) = 1.27, where a is the estimator of α. We need to make a decision.
Consider H0: α = 0 versus Ha: α ≠ 0. Compute t = (a − 0)/StDev(a) = 2.15.
For 16 df and t = 2.15 for a two-sided test get a p-value of a little less than 5%. Thus at the 5% level one can just conclude that α is not zero. With such a small sample a p-value of 5% is about all one can expect. 5 pts
Question calls for a hypothesis test, like lecture and project examples.
e) For x = D = 1020 the point prediction for y is 9.32, and a 95% prediction interval
for a future y-value is 3.92 to 14.72, using t_{0.025,16} = 2.120 in 7 pts
    a + b x ± t_{0.025,16} s_e sqrt[ 1 + 1/n + (x − x̄)² / Σ (x_i − x̄)² ]
f) Plotting x_(i) versus Φ⁻¹(p_i) yields a probability plot. A correlation of 0.891 << 1 is
very small: we can reject the hypothesis that the data are normal at the 1% level. 6 pts
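All of the regression answers in (a) through (e) follow from the summary statistics given in the problem. The sketch below works through them in order; it is an illustration with variable names chosen here, and it takes the tabulated critical value t(0.025, 16) = 2.120 as given rather than computing it.

```python
import math

# Summary statistics from problem 8
n = 18
w_bar, d_bar = 6.80, 630.0   # sample means of W (mm) and D
s_w, s_d = 3.10, 320.0       # sample standard deviations
s_dw = 11240.0               # sum of (D_i - D_bar)(W_i - W_bar)
t_crit = 2.120               # tabulated t(0.025, 16)

s_dd = (n - 1) * s_d ** 2    # sum of squares of D about its mean
s_ww = (n - 1) * s_w ** 2    # total sum of squares of W

# (a) least-squares slope and intercept
b = s_dw / s_dd
a = w_bar - b * d_bar

# (b) unbiased residual variance, (c) fraction of variance explained
sse = s_ww - b * s_dw
se2 = sse / (n - 2)
r2 = 1.0 - sse / s_ww

# (d) t statistic for H0: alpha = 0
sd_a = math.sqrt(se2 * (1.0 / n + d_bar ** 2 / s_dd))
t_a = a / sd_a

def prediction_interval(d_new):
    """Point prediction and 95% prediction interval for a new W at loading d_new."""
    w_hat = a + b * d_new
    half = t_crit * math.sqrt(se2 * (1.0 + 1.0 / n + (d_new - d_bar) ** 2 / s_dd))
    return w_hat, w_hat - half, w_hat + half

# (e) new observation at D = 1020
w_hat, lo, hi = prediction_interval(1020.0)

print(f"a = {a:.3f}, b = {b:.4f}, se^2 = {se2:.3f}, R^2 = {r2:.2f}")
print(f"SD(a) = {sd_a:.2f}, t = {t_a:.2f}")
print(f"prediction {w_hat:.2f}, interval {lo:.2f} to {hi:.2f}")
```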