CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING 1998 Final Exam
Wed., December 16, 1998 (revised '01)
Exam is open notes and open-book. It lasts 150 minutes and there are 150 points. SHOW WORK!

1. (8 points) On July 28th, 1997, torrential rains pounded Fort Collins, Colorado,
devastating parts of the town and the CSU campus. 8 inches of rain fell in a 24-hour
period. Assume that in a 24-hour period there is a 1% chance of rainfall
exceeding 4.8 inches, and a 50% probability of rainfall exceeding 2.5 inches. If
rainfall depths have a GUMBEL distribution, what is the probability of actually
seeing 8.0 inches of rain or more in a 24-hour period? Conceptually, is a Gumbel
distribution a reasonable model for rainfall?

2. (12 points) A structural engineer is concerned with the possible wind force on
the side of a tall building. The total force F depends on the angle of the wind
which determines the effective area A, and the wind speed W, such that F =
A*W. Assume that A and W are independent random variables, and both are
lognormally distributed. If the mean and standard deviation of A are 50 and 15,
and W has median 120 and coefficient of variation 0.80, what are the mean and
standard deviation of the total force F on the structure?

3. (15 points) An engineer believes that a random variable X has a cumulative
distribution function $F_X(x) = 1 - 2.718 \exp[-(kx+1)^3]$ for $0 < x$. You have a random sample of n independent observations $\{X_1, X_2, \ldots, X_n\}$.
a) What nonlinear algebraic equation needs to be solved to find the
maximum likelihood estimator of k?
b) What are advantages and disadvantages of maximum likelihood
estimators relative to method-of-moments estimators?
c) Why would one ever use a BIASED estimator? Sounds stupid!

4. (8 points) Suppose that after the fall semester, you are asked to organize a
survey to determine how satisfied students were with CEE 304. You can get a
class list which has students' colleges and year. You do not have a lot of time, so
you do not want to try to contact every student. What might be a good sampling
plan? What biases could occur with a bad plan and why? How might the ideas
of stratified sampling be employed to reduce the variance of your estimator?

5. (14 points) Prof. Stedinger continues to work on Cryptosporidium oocyst
sampling problems. A lab tech is searching through a large sample of water to
find an oocyst. Assume that the concentration of oocysts is 10 per 100 liters of
water. (Use a Poisson process model in a-b-c.)
a) What are the mean and standard deviation of the number of oocysts she
would expect to find in a 50 liter sample?
b) What are the mean and standard deviation of the volume of water she
needs to search to find 20 oocysts?
c) What is the probability of finding at least 1 oocyst in a 10 liter sample?
d) Why might a Poisson process model be appropriate for this problem?

6. (4 pts) A charcoal filter removes chlorinated hydrocarbons from drinking
water. An environmental engineer collected 20 independent samples of water
that passed through such a filter. The average of the 20 observed removal rates
was $\bar{x}$ = 99.42% with sample standard deviation s = 0.16%. Assuming that the measurements are normally distributed about the true removal rate $\mu$, compute
a 95% confidence interval for $\mu$.

7. (8 pts) For the filter described in problem 6, the manufacturer claims that the
product should remove at least 99% of the chlorinated hydrocarbons. Use the
fact that in 14 of the 20 samples the calculated removal rate exceeded 99% to
evaluate the manufacturer's claim.
a) What are the appropriate hypotheses?
b) What is the rejection region for a 5% test?
c) What is the p-value for the observed sample?
d) If the probability is 0.75 that the measured removal rate exceeds 99%, what is the probability the test in b) accepts the null hypothesis?
e) What is the probability the test in b) accepts the null hypothesis when it is true?

8. (14 pts) Several students are uncertain about the value of cramming before
exams. 14 students agreed to participate in an experiment. For their statics class,
they agree to sleep the night before one prelim, and before the other prelim they
stay up all night cramming. The normalized scores appear below for each
student and for the two prelims.
a) What are the appropriate hypotheses for a Wilcoxon signed-rank test?
b) For $\alpha$ = 10%, what is the rejection region for the Wilcoxon signed-rank test?
c) What is the p-value for this sample using that test?
d) What p-value do you obtain with a one-sample t test?
e) With $\alpha$ = 5%, if the true mean of the differences were 12 points and the true
standard deviation were 20 points, what would the type II error be for a t test?

Student              Sleep   Cram   Difference
1                       40     48       -8.0
2                       92     67       25.0
3                       64     42       22.0
4                       64     84      -20.0
5                       55     22       33.0
6                       95     79       16.0
7                       72     75       -3.0
8                       57     48        9.0
9                       82     59       23.0
10                      79     65       14.0
11                      80     65       15.0
12                      76     27       49.0
13                      64     96      -32.0
14                      68     28       40.0
mean                 70.57  57.50      13.07
standard deviation   14.87  22.67      22.68

9. (14 points) A firm has two different contractual-negotiating approaches. Costs to resolve the last 8 disagreements with the newer flexible strategy were:
31, 67, 20, 10, 12, 26, 12, 23; avg = 25.23; standard deviation = 18.56.
The costs to negotiate the last 8 disagreements with the traditional strategy were:
39, 34, 114, 34, 47, 27, 32, 29; avg = 44.28; standard deviation = 28.68.
These represent 16 different and independent cases. Is the new method better?
For an appropriate nonparametric (NP) test:
(a) What are the appropriate null and alternative hypotheses?
(b) What is the rejection region for a NP test with a type I error of 5%?
(c) What is the p-value for this data set? What do you conclude?
(d) What is the p-value for the corresponding two-sample t test?

10. (35 pts) I took data I had collected in class on the weight (kg) and height (cm)
of twelve (n=12) students, which yielded:
mean height = 170 cm      mean weight = 305 kg
s_height = 12.5 cm        s_weight = 73 kg
$\sum_i (\text{weight}_i - \overline{\text{weight}})(\text{height}_i - \overline{\text{height}}) = 6190$
a) What are the least-squares estimates of $\alpha$ and $\beta$ for the model
weight = $\alpha$ + $\beta$ (height) + (error)?
b) Compute an unbiased estimate of the variance of the model's residual error.
c) What is the standard deviation of the estimator of $\beta$?
d) What is the rejection region for a test with $\alpha$ = 5% of whether or not $\beta$ = 0?
What is the p-value for this case?
Should a one-sided or two-sided test be employed? WHY?
e) Using your fitted line, what is your estimate of the mean weight of people
who are 183 cm tall (6 feet tall)? Given the data provided, what is a 95%
confidence interval for the MEAN weight of people who are 183 cm tall?
f) What fraction of the variability in the original weight data
is explained by the model?

11. (10 points) The holiday season brings extra magazines in the mail.
Here are the number of pages in magazines that recently arrived at my home:
70, 88, 124, 103, 155, 46, 16, 22, 98, 17, 98, 65, 40, 118, 210, 80
[TC3, Lands End, Seed Catalog, Newsweek, Penny's, Environment, Western Water, Adirondack Council, Newsweek, Invention & Technology, Successories, Civil Engineering, American Scientist, Lands End]
Might the number of pages follow a normal distribution?
(a) Plot these data on attached probability paper.
(b) From the plot what do you think about the normality of magazines? WHY?
(c) For the probability-plot correlation test of normality, specify the approximate rejection region for an $\alpha$ = 5% test.
(d) The r for the probability plot correlation test of the data with their n-scores was
0.972. The r for the probability plot correlation test of the logarithms of the data with their n-scores was 0.963. What would you conclude from these two results?

12. (3+4 pts) (a) Why might statistics appropriately be considered the science of decision making (under uncertainty)? (b) Why is the concept of an estimator important?

Knowing what you are doing is as important as knowing how to do it. If you don't know what you are doing, how do you know you did it right?
Jery Stedinger

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
Solutions Final Exam 1998 (revised '01)

1. Gumbel distribution: okay, do we understand how to manipulate a CDF to compute values of parameters? $x_p = u - \ln[-\ln(p)]/a$. 1 pt
Have two equations for p = 0.5 and 0.99. Yields a = 1.84; u = 2.30. 5 pts
Thus $F(8.0) = \exp[-\exp\{-a(8.0 - u)\}] = 0.999972$; $\Pr[R \ge 8] = 0.00003$. 1 pt
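The two-quantile fit in Problem 1 can be checked with a few lines of Python. This is only a sketch, not part of the original key: the function and variable names are made up for illustration, and it assumes the Gumbel form $F(x) = \exp[-\exp\{-a(x-u)\}]$ used in the solution.

```python
import math

def reduced_variate(p):
    """Return -ln(-ln(p)), so that the Gumbel quantile is x_p = u + y_p / a."""
    return -math.log(-math.log(p))

# Quantiles given in the problem statement (inches in 24 hours).
x50, p50 = 2.5, 0.50   # 50% chance of exceeding 2.5 in, so F(2.5) = 0.50
x99, p99 = 4.8, 0.99   # 1% chance of exceeding 4.8 in, so F(4.8) = 0.99

# Two quantile equations are linear in u and 1/a, so solve them directly.
y50, y99 = reduced_variate(p50), reduced_variate(p99)
a = (y99 - y50) / (x99 - x50)
u = x50 - y50 / a

def gumbel_exceedance(x, a, u):
    """Pr[R >= x] = 1 - exp(-exp(-a (x - u))), using expm1 for accuracy near 0."""
    return -math.expm1(-math.exp(-a * (x - u)))

exceed_8 = gumbel_exceedance(8.0, a, u)
print(f"a = {a:.2f}, u = {u:.2f}, Pr[R >= 8.0] = {exceed_8:.1e}")
```

The exceedance probability is computed with `expm1` because subtracting a number very close to 1 from 1 loses digits.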
Gumbel is a reasonable choice because we are considering the LARGEST rainfall event in the
year, and thus in concept the largest of many storms that occurred. 1 pt

2. Lognormal: do not need all of the log-space moments, but can calculate them.
$\sigma^2_{\ln A} = \ln[1 + (\sigma_A/\mu_A)^2] = 0.0862$; $\mu_A = 50 \Rightarrow \mu_{\ln A} = \ln(50) - 0.5\,\sigma^2_{\ln A} = 3.87$. 3 pts
median$_W$ = 120 $\Rightarrow \mu_{\ln W} = \ln(120) = 4.787$; $\sigma^2_{\ln W} = \ln[1 + (0.8)^2] = 0.495$. 3 pts
$\ln F \sim \text{Normal}(\mu_{\ln A} + \mu_{\ln W},\ \sigma^2_{\ln A} + \sigma^2_{\ln W}) = N[8.656, 0.581]$. 2 pts
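The log-space bookkeeping for Problem 2, through to the mean and standard deviation of F, can be reproduced numerically. The sketch below is an illustration only; the helper names are assumptions, and it relies on the standard lognormal moment relations (real-space mean $\exp(\mu_{\ln} + \sigma^2_{\ln}/2)$, squared coefficient of variation $\exp(\sigma^2_{\ln}) - 1$).

```python
import math

def log_moments_from_mean_sd(mean, sd):
    """Log-space mean and variance of a lognormal with given real-space mean and sd."""
    var_ln = math.log(1.0 + (sd / mean) ** 2)
    mu_ln = math.log(mean) - 0.5 * var_ln
    return mu_ln, var_ln

def log_moments_from_median_cv(median, cv):
    """Log-space mean and variance from the median and coefficient of variation."""
    return math.log(median), math.log(1.0 + cv ** 2)

mu_ln_A, var_ln_A = log_moments_from_mean_sd(50.0, 15.0)
mu_ln_W, var_ln_W = log_moments_from_median_cv(120.0, 0.80)

# F = A * W with A, W independent, so ln F = ln A + ln W is normal
# with the means and variances added.
mu_ln_F = mu_ln_A + mu_ln_W
var_ln_F = var_ln_A + var_ln_W

mean_F = math.exp(mu_ln_F + 0.5 * var_ln_F)
sd_F = mean_F * math.sqrt(math.exp(var_ln_F) - 1.0)
print(f"ln F ~ N({mu_ln_F:.3f}, {var_ln_F:.3f}); mean = {mean_F:.0f}, sd = {sd_F:.0f}")
```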
$\mu_F = \exp(\mu_{\ln F} + 0.5\,\sigma^2_{\ln F}) = 7684$; $\sigma^2_F = \mu_F^2 \{\exp(\sigma^2_{\ln F}) - 1\} = 46{,}499{,}904 = (6819)^2$; StDev(F) = 6819. 2 pts

3. Estimators. First compute the pdf: $f(x) = 2.718\,(3)\,k\,(kx+1)^2 \exp[-(kx+1)^3]$. 2 pt
Then the log-likelihood function for n observations is $\ln L(k) = n \ln(3 \cdot 2.718) + n \ln(k) + 2 \sum_i \ln(kx_i+1) - \sum_i (kx_i+1)^3$. 2 pt
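To see what solving for the maximum likelihood estimate of k looks like in practice, the sketch below differentiates this log-likelihood with respect to k and finds the root by bisection. It is not part of the original key: the sample is synthetic, the value `k_true = 2.0` is a made-up illustration, and bisection is just one convenient root finder (each term of the derivative decreases in k, so the root is unique).

```python
import math
import random

def score(k, xs):
    """d/dk of the log-likelihood: n/k + 2*sum(x/(kx+1)) - 3*sum(x*(kx+1)^2)."""
    n = len(xs)
    return (n / k
            + 2.0 * sum(x / (k * x + 1.0) for x in xs)
            - 3.0 * sum(x * (k * x + 1.0) ** 2 for x in xs))

def solve_mle(xs, lo=1e-6, hi=1.0, iters=200):
    """Bisection on the score, which is strictly decreasing in k for k > 0."""
    while score(hi, xs) > 0.0:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid, xs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def draw_sample(k, n, rng):
    """Inverse-CDF draws: under this CDF, (kX + 1)^3 - 1 is a unit exponential."""
    return [((1.0 + rng.expovariate(1.0)) ** (1.0 / 3.0) - 1.0) / k
            for _ in range(n)]

rng = random.Random(304)
k_true = 2.0                        # hypothetical value, for illustration only
xs = draw_sample(k_true, 1000, rng)
k_hat = solve_mle(xs)
print(f"k_hat = {k_hat:.3f} (synthetic data generated with k = {k_true})")
```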
Taking the derivative w.r.t. k and setting it to zero yields
$0 = n/k + 2 \sum_i x_i/(kx_i+1) - 3 \sum_i x_i (kx_i+1)^2$. Solve for k. 2 pt
(b) In large samples MLEs should be more efficient (small variance and bias) and
have strong theoretical motivation. Unfortunately MLEs are often more difficult
to compute than method-of-moments estimators. In small samples, moment
estimators are often as good or better, as well as often being simpler. (5 pts)
(c) Bias is only one type of error. Variance describes another. Thus one might
consider estimators that have minimum mean square error, which is a
combination of the two and describes the total error. Also for complex problems,
unbiased estimators are often not available, so if the bias is small compared to the standard error ($\text{Bias}^2 \ll \text{Var}$), it does not really matter much overall. (4 pts)

4. Need to avoid bias. Could use random numbers to select a representative
random sample or subset of students in the class from the class list. Could also
take every fifth in list if we believe the ordering on the list is not related to
satisfaction in the course. Need to make sure we actually get a response to every
student selected! Otherwise nonresponse bias could be important. Do not rely
on volunteers, they may as a group have an axe to grind. If seniors / juniors have
different satisfaction levels (or students by grade: A, B, C; men & women; or
college), then could define such categories to be different strata, and consider
weighted means of responses from students in different strata, with weights to
reflect strata size. This can reduce variance of estimated mean response. (8 pts)

5. $\lambda$ = 10/100 liters = 0.1 per liter. 1 pt
(a) $\nu = \lambda V = 5$ for V = 50 liters. For Poisson distribution,
mean = $\nu$ = 5; standard deviation = sqrt($\nu$) = 2.24. 3 pts
(b) Gamma distribution. E[volume to find 20] = 20/$\lambda$ = 200 liters. 4 pts
Var[volume to find 20] = 20/$\lambda^2$ = 2000 liters$^2$ = (45 liters)$^2$, so SD = 45 L.
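The Poisson-process numbers for parts (a) through (c) of Problem 5 all follow from the single rate of 0.1 oocysts per liter. A small sketch, with illustrative function names that are not from the original key:

```python
import math

lam = 10 / 100.0   # oocysts per liter

def count_moments(lam, volume):
    """Mean and sd of the Poisson count in a given volume."""
    nu = lam * volume
    return nu, math.sqrt(nu)

def volume_to_kth_moments(lam, k):
    """Mean and sd of the volume searched until the k-th oocyst (gamma waiting 'time')."""
    return k / lam, math.sqrt(k) / lam

def prob_at_least_one(lam, volume):
    """Pr[K >= 1] = 1 - Pr[K = 0] = 1 - exp(-lam * volume)."""
    return 1.0 - math.exp(-lam * volume)

mean_n, sd_n = count_moments(lam, 50.0)           # part (a)
mean_v, sd_v = volume_to_kth_moments(lam, 20)     # part (b)
p_one = prob_at_least_one(lam, 10.0)              # part (c)

print(f"(a) mean = {mean_n:.1f}, sd = {sd_n:.2f}")
print(f"(b) mean = {mean_v:.0f} L, sd = {sd_v:.1f} L")
print(f"(c) Pr[K >= 1] = {p_one:.3f}")
```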
(c) $\Pr[K \ge 1 \mid \nu = 1] = 1 - 0.368 = 0.63$. 3 pts
(d) If oocysts are located INDEPENDENTLY throughout the water sample, the
density/concentration is constant, and if TWO do not happen together (they do
not stick together), then the assumptions of a Poisson process are met. 3 pts

6. $t_{0.025,19} = 2.093$ (1 pt); 95% CI = $\bar{x} \pm s\,t_{0.025,19}/\sqrt{n}$ = 99.35 to 99.49. (3 pts)

7. Some students incorrectly used a t-test; t = sqrt(20) (99.42 - 99)/s = 11.74.
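Both the Problem 6 interval and the t statistic quoted for Problem 7 come straight from the summary statistics. The sketch below takes the critical value 2.093 from the solution as a table value rather than computing it, since the Python standard library has no t quantile function; variable names are illustrative.

```python
import math

n, xbar, s = 20, 99.42, 0.16   # sample size, mean (%), sample sd (%)
t_crit = 2.093                 # t_{0.025,19}, table value quoted in the solution

# Two-sided 95% confidence interval for the true removal rate.
half_width = t_crit * s / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)

# The t statistic some students computed against the claimed 99% removal.
t_stat = math.sqrt(n) * (xbar - 99.0) / s

print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
print(f"t = {t_stat:.2f}")
```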
SIGNIFICANT, but the wrong test to use. NEED sign test to use counts.
a) Ho: p(+) = p(-) = 0.5; Ha: p(+) > 0.5
b) Reject when one sees $\ge 15$ plus signs ($\le 5$ minus signs). 2 pt
c) p-value for 14 pluses (6 minuses) is 5.8%. 2 pt
d) If p = 0.75; $\Pr(S \le 14 \mid p = 0.75)$ = 38.3% (Table A.1 with n = 20). 2 pt
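The sign-test quantities in Problem 7 are binomial tail sums with n = 20, so they can be checked without tables. A minimal sketch with illustrative names; it also reports the actual size of the 5% test, which is well under 5% because the count is discrete.

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def upper_tail(k, n, p):
    """Pr[S >= k] for S ~ Binomial(n, p)."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))

n = 20

# Smallest critical count whose upper tail under Ho (p = 0.5) is at most 5%.
critical = next(c for c in range(n + 1) if upper_tail(c, n, 0.5) <= 0.05)
size = upper_tail(critical, n, 0.5)            # actual type I error

p_value = upper_tail(14, n, 0.5)               # 14 pluses observed
beta_075 = 1.0 - upper_tail(critical, n, 0.75) # Pr[accept Ho | p = 0.75]

print(f"reject if S >= {critical}; actual size = {size:.3f}")
print(f"p-value = {p_value:.3f}; Pr[accept | p = 0.75] = {beta_075:.3f}")
print(f"Pr[accept | Ho true] = {1.0 - size:.3f}")
```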
e) Probability of accepting Ho when it is true is 97.9% (NOT exactly 95% because the actual type I error was not 5%!). 2 pt

8. Paired Data. (a) Ho: $F_S = F_C$ where S are sleep data, and C are cram data.
Ha: $F_S$ stochastically different from $F_C$. 2 pts
Alternatively Ho: median$_S$ = median$_C$ versus Ha: median$_S \ne$ median$_C$
(b) Two-sided test: reject if $S^+ \ge 79$ or $S^+ \le 26$. 79 + 26 = 105 = 14*15/2 total. 2 pts
(c) Let $S^+$ = 84. $\Pr[S^+ \ge 84]$ = 2.5%; p-value = 5%. 4 pts
(d) 1-sample t value = 2.17 with df = 13, hence p-value a little less than 5%. 4 pts
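The signed-rank sum and the one-sample t value for Problem 8 can be recomputed from the 14 paired differences. This is a sketch under a stated assumption: the minus signs and the last difference (40) are reconstructed here from the printed column means, because the extracted table is damaged. Function names are illustrative, and the ranking helper does not handle ties or zeros (none occur in these data).

```python
import math
import statistics

# Differences (first score minus second) for students 1..14.
# ASSUMPTION: signs and the final value are reconstructed from the column means.
diffs = [-8, 25, 22, -20, 33, 16, -3, 9, 23, 14, 15, 49, -32, 40]

def signed_rank_sum(d):
    """S+: sum of the ranks of |d| that belong to positive differences."""
    ordered = sorted(d, key=abs)
    return sum(rank for rank, value in enumerate(ordered, start=1) if value > 0)

n = len(diffs)
s_plus = signed_rank_sum(diffs)
rank_total = n * (n + 1) // 2

# One-sample t statistic on the differences (df = n - 1).
t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

print(f"S+ = {s_plus} out of {rank_total}; t = {t_stat:.2f} with df = {n - 1}")
```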
(e) d = 0.6; df = 13, $\beta \approx 55\%$. 2 pts

9. a) Does the new method result in lower costs? These are 16 DIFFERENT cases.
H0: $F_N = F_T$ (no difference). 2 pts
Ha: $F_N > F_T$. Costs under New are stochastically less than under the Traditional approach.
Data provide two independent samples, one with Traditional, and one with New method.
b) Use Wilcoxon-Mann-Whitney Rank Sum test. 2 pts
W = sum of ranks of traditional (bigger) results; ($\alpha$ = 5%) reject if $W \ge 84$. 2 pts
c) Observe W = 91. p-value a little less than 1%. 4 pts
d) 2-sample t = -1.58, compute df = 11.99, rounded to 12. p-value 7% (MS Excel). 4 pts
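The rank sum W and the unequal-variance t statistic for Problem 9 can be reproduced as follows. This is a sketch with illustrative names: tied costs get average ranks, and the t statistic uses the summary statistics exactly as printed in the exam (which differ slightly from what the listed integers give), together with the Welch-Satterthwaite degrees of freedom.

```python
import math

new = [31, 67, 20, 10, 12, 26, 12, 23]
trad = [39, 34, 114, 34, 47, 27, 32, 29]

def midranks(values):
    """Map each distinct value to its average rank in the pooled sample."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2.0   # average of ranks i+1 .. j
        i = j
    return ranks

ranks = midranks(new + trad)
w_trad = sum(ranks[v] for v in trad)            # rank sum of the traditional costs

# Two-sample t from the printed summary statistics (new minus traditional).
m_new, s_new, m_trad, s_trad, n = 25.23, 18.56, 44.28, 28.68, 8
v_new, v_trad = s_new ** 2 / n, s_trad ** 2 / n
t_welch = (m_new - m_trad) / math.sqrt(v_new + v_trad)
df_welch = (v_new + v_trad) ** 2 / (v_new ** 2 / (n - 1) + v_trad ** 2 / (n - 1))

print(f"W = {w_trad}; t = {t_welch:.2f}; df = {df_welch:.2f}")
```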
For $\alpha$ = 5% test, $t_{obs}$ is not less than $-t_{0.05,12} = -1.782$, so must Accept H0 (cannot reject Ho). But why should the data be normal? Look at that $s_2$ value!

10. a) a = -307.2473; b = 3.6015. 5 pts
b) $s_e^2$ = 3633 = (60.27)$^2$; $s_e$ = 60.27. 5 pts
c) StDev(b) = 1.454. 5 pts
d) Expect weight to increase with height, so Ho: $\beta$ = 0. Reject if T > $t_{0.05,10}$ = 1.812. 10 pts
For this data obtain t = 2.477, so for df = 10, obtain p-value = 0.02 (about).
One-sided: we expect weight to increase with height.
e) For 183 cm height, prediction for y is 352 kg. 5 pts
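Every regression quantity asked for in Problem 10 can be derived from the five summary statistics given in the problem. The sketch below is illustrative only (variable names are assumptions); the critical value 2.228 is the table value $t_{0.025,10}$, entered by hand because the standard library has no t quantile function.

```python
import math

n = 12
xbar, ybar = 170.0, 305.0      # mean height (cm), mean weight as given
sx, sy = 12.5, 73.0            # sample standard deviations
sxy = 6190.0                   # sum of cross-products about the means

sxx = (n - 1) * sx ** 2        # sum of squares of height about its mean
syy = (n - 1) * sy ** 2        # total sum of squares of weight

b = sxy / sxx                  # slope
a = ybar - b * xbar            # intercept

sse = syy - b * sxy            # residual sum of squares
se = math.sqrt(sse / (n - 2))  # residual standard deviation
sd_b = se / math.sqrt(sxx)     # standard deviation of the slope estimator
t_b = b / sd_b                 # test statistic for Ho: beta = 0

x0 = 183.0
y0 = a + b * x0                # estimated mean weight at 183 cm
sd_mean = se * math.sqrt(1.0 / n + (x0 - xbar) ** 2 / sxx)
t_crit = 2.228                 # t_{0.025,10}, table value
ci = (y0 - t_crit * sd_mean, y0 + t_crit * sd_mean)

r2 = 1.0 - sse / syy           # fraction of variability explained

print(f"a = {a:.4f}, b = {b:.4f}, se = {se:.2f}, sd(b) = {sd_b:.3f}, t = {t_b:.3f}")
print(f"mean at 183 cm = {y0:.0f}, 95% CI = {ci[0]:.0f} to {ci[1]:.0f}, R^2 = {r2:.3f}")
```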
A 95% CI for this MEAN value (not a future observation) is: $a + b x \pm t_{0.025,10}\,\mathrm{SE}$, where SE is the standard deviation of the estimated mean. SD of mean is 25.7, so for $t_{0.025,10}$ = 2.228 obtain a 95% confidence interval for
the mean of 295 to 409 kg. WHOW, for the mean!
f) $R^2$ = 1 - (Residual sum-of-squares)/(Total sum-of-squares) = $1 - (n-2)\,s_e^2 / [(n-1)\,s_y^2]$ = 0.380. 5 pts

11. Probability plot can be obtained by plotting $x_{(i)}$ as a function of $\Phi^{-1}(p_i)$, or
equivalently using normal probability paper and plotting $x_{(i)}$ as a function of $p_i$. MINITAB and DataDesk can do it; I used EXCEL, even though plot is not as nice.
a)
Rank   $p_i = (i - 3/8)/(n + 1/4)$   $\Phi^{-1}(p_i)$   $x_{(i)}$
1      0.038                       -1.769            16
2      0.100                       -1.282            17
3      0.162                       -0.988            22
4      0.223                       -0.762            40
5      0.285                       -0.569            46
6      0.346                       -0.396            65
7      0.408                       -0.233            70
8      0.469                       -0.077            80
9      0.531                        0.077            88
10     0.592                        0.233            98
11     0.654                        0.396            98
12     0.715                        0.569           103
13     0.777                        0.762           118
14     0.838                        0.988           124
15     0.900                        1.282           155
16     0.962                        1.769           210
5 pts
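The plotting positions, normal scores, and probability-plot correlation for Problem 11 can be computed with the standard library alone. The sketch below uses the plotting position $(i - 3/8)/(n + 1/4)$ from the table; function names are illustrative and not from the original key.

```python
import math
from statistics import NormalDist

pages = [70, 88, 124, 103, 155, 46, 16, 22, 98, 17, 98, 65, 40, 118, 210, 80]

def normal_scores(n):
    """Standard normal quantiles at the plotting positions (i - 3/8)/(n + 1/4)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ppcc(data):
    """Correlation between the ordered data and their normal scores."""
    return pearson_r(sorted(data), normal_scores(len(data)))

r_raw = ppcc(pages)
r_log = ppcc([math.log(p) for p in pages])
print(f"r (pages) = {r_raw:.3f}, r (log pages) = {r_log:.3f}")
```

Plotting `sorted(pages)` against `normal_scores(len(pages))` gives the same picture as normal probability paper.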
(b) Does not look really straight; maybe bends upwards? 2 pts
(c) For n = 15 reject if r < 0.94 for 5% test. 1 pt
(d) Well, cannot reject normality or lognormality. However, normal gives a
much better fit and I would recommend it. 2 pts

12. (a) Statistics is the science of decision making under uncertainty because
hypothesis testing codifies a systematic procedure for choosing between two (or
more) hypotheses when data do not precisely determine what is true. Statistics
addresses how to set up alternate scientific hypotheses and use data to determine
which is likely to be correct given the data that are available, and perhaps if either
can be rejected with the available evidence. This is the basis of the scientific
method. (4 pts)
(b) The concept of an estimator is very important because an estimator is an
explicit definition of how we will guess the value of a parameter given the
available data. With such an explicit definition we can study the sampling
properties of such guessing procedures: bias, variance, and other measures of
accuracy. This allows alternative estimators to be compared, and poor estimation
schemes to be rejected. By using confidence interval estimators the precision
(uncertainty) of the estimators is made explicit. (4 pts)
Estimators allow us to, in some instances, develop accurate estimates of
characteristics of a population, without observing or sampling every possible
outcome. ...