1998_Final_Exam_Solutions - CEE 304 - UNCERTAINTY ANALYSIS...


CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
1998 Final Exam
Wed., December 16, 1998 (revised '01)

Exam is open-notes and open-book. It lasts 150 minutes and there are 150 points. SHOW WORK!

1. (8 points) On July 28th, 1997, torrential rains pounded Fort Collins, Colorado, devastating parts of the town and the CSU campus. 8 inches of rain fell in a 24-hour period. Assume that in a 24-hour period there is a 1% chance of rainfall exceeding 4.8 inches, and a 50% probability of rainfall exceeding 2.5 inches. If rainfall depths have a GUMBEL distribution, what is the probability of actually seeing 8.0 inches of rain or more in a 24-hour period? Conceptually, is a Gumbel distribution a reasonable model for rainfall?

2. (12 points) A structural engineer is concerned with the possible wind force on the side of a tall building. The total force F depends on the angle of the wind, which determines the effective area A, and on the wind speed W, such that F = A*W. Assume that A and W are independent random variables, and both are lognormally distributed. If the mean and standard deviation of A are 50 and 15, and W has median 120 and coefficient of variation 0.80, what are the mean and standard deviation of the total force F on the structure?

3. (15 points) An engineer believes that a random variable X has a cumulative distribution function $F_X(x) = 1 - 2.718 \exp[-(kx+1)^3]$ for $0 < x$. You have a random sample of n independent observations $\{X_1, X_2, \ldots, X_n\}$.
a) What nonlinear algebraic equation needs to be solved to find the maximum likelihood estimator of k?
b) What are advantages and disadvantages of maximum likelihood estimators relative to method-of-moments estimators?
c) Why would one ever use a BIASED estimator? Sounds stupid!

4. (8 points) Suppose that after the fall semester, you are asked to organize a survey to determine how satisfied students were with CEE 304. You can get a class list which has students' colleges and year.
You do not have a lot of time, so you do not want to try to contact every student. What might be a good sampling plan? What biases could occur with a bad plan and why? How might the ideas of stratified sampling be employed to reduce the variance of your estimator?

5. (14 points) Prof. Stedinger continues to work on Cryptosporidium oocyst sampling problems. A lab tech is searching through a large sample of water to find an oocyst. Assume that the concentration of oocysts is 10 per 100 liters of water. (Use a Poisson process model in a-b-c.)
a) What are the mean and standard deviation of the number of oocysts she would expect to find in a 50 liter sample?
b) What are the mean and standard deviation of the volume of water she needs to search to find 20 oocysts?
c) What is the probability of finding at least 1 oocyst in a 10 liter sample?
d) Why might a Poisson process model be appropriate for this problem?

6. (4 pts) A charcoal filter removes chlorinated hydrocarbons from drinking water. An environmental engineer collected 20 independent samples of water that passed through such a filter. The average of the 20 observed removal rates was $\bar{x}$ = 99.42% with sample standard deviation $s$ = 0.16%. Assuming that the measurements are normally distributed about the true removal rate $\mu$, compute a 95% confidence interval for $\mu$.

7. (8 pts) For the filter described in problem 6, the manufacturer claims that the product should remove at least 99% of the chlorinated hydrocarbons. Use the fact that in 14 of the 20 samples the calculated removal rate exceeded 99% to evaluate the manufacturer's claim.
a) What are the appropriate hypotheses?
b) What is the rejection region for a 5% test?
c) What is the p-value for the observed sample?
d) If the probability is 0.75 that the measured removal rate exceeds 99%, what is the probability the test in b) accepts the null hypothesis?
e) What is the probability the test in b) accepts the null hypothesis when it is true?

8. (14 pts) Several students are uncertain about the value of cramming before exams. 14 students agreed to participate in an experiment. For their statics class, they agree to sleep the night before one prelim, and before the other prelim they stay up all night cramming. The normalized scores appear below for each student and for the two prelims.
a) What are the appropriate hypotheses for a Wilcoxon signed-rank test?
b) For $\alpha$ = 10%, what is the rejection region for the Wilcoxon signed-rank test?
c) What is the p-value for this sample using that test?
d) What p-value do you obtain with a one-sample t test?
e) With $\alpha$ = 5%, if the true mean of the differences were 12 points and the true standard deviation were 20 points, what would the type II error be for a t test?

Student              Score 1   Score 2   Difference (1 - 2)
1                    40        48         -8.0
2                    92        67         25.0
3                    64        42         22.0
4                    64        84        -20.0
5                    55        22         33.0
6                    95        79         16.0
7                    72        75         -3.0
8                    57        48          9.0
9                    82        59         23.0
10                   79        65         14.0
11                   80        65         15.0
12                   76        27         49.0
13                   64        96        -32.0
14                   68        28         40.0
mean                 70.57     57.50      13.07
standard deviation   14.87     22.67      22.68

9. (14 points) A firm has two different contractual-negotiating approaches. Costs to resolve the last 8 disagreements with the newer flexible strategy were: 31, 67, 20, 10, 12, 26, 12, 23; avg = 25.23; standard deviation = 18.56. The costs to negotiate the last 8 disagreements with the traditional strategy were: 39, 34, 114, 34, 47, 27, 32, 29; avg = 44.28; standard deviation = 28.68. These represent 16 different and independent cases. Is the new method better? For an appropriate non-parametric (NP) test:
(a) What are the appropriate null and alternative hypotheses?
(b) What is the rejection region for a NP test with a type I error of 5%?
(c) What is the p-value for this data set? What do you conclude?
(d) What is the p-value for the corresponding two-sample t test?

10.
(35 pts) I took data I had collected in class on the weight (kg) and height (cm) of twelve (n = 12) students, which yielded:
mean height = 170 cm
mean weight = 305 kg
$s_{height}$ = 12.5 cm
$s_{weight}$ = 73 kg
$\sum_i (weight_i - \overline{weight})(height_i - \overline{height}) = 6190$
a) What are the least-squares estimates of $\alpha$ and $\beta$ for the model weight = $\alpha$ + $\beta$ (height) + (error)?
b) Compute an unbiased estimate of the variance of the model's residual error.
c) What is the standard deviation of the estimator of $\beta$?
d) What is the rejection region for a test with $\alpha$ = 5% of whether or not $\beta = 0$? What is the p-value for this case? Should a one-sided or two-sided test be employed? WHY?
e) Using your fitted line, what is your estimate of the mean weight of people who are 183 cm tall (6 feet tall)? Given the data provided, what is a 95% confidence interval for the MEAN weight of people who are 183 cm tall?
f) What fraction of the variability in the original weight data is explained by the model?

11. (10 points) The holiday season brings extra magazines in the mail. Here are the number of pages in magazines that recently arrived at my home: 70, 88, 124, 103, 155, 46, 16, 22, 98, 17, 98, 65, 40, 118, 210, 80 (TC3, Lands End, Seed Catalog, Newsweek, Penny's, Environment, Western Water, Adirondack Council, Newsweek, Invention & Technology, Successories, Civil Engineering, American Scientist, Lands End). Might the number of pages follow a normal distribution?
(a) Plot these data on attached probability paper.
(b) From the plot what do you think about the normality of magazines? WHY?
(c) For the probability-plot correlation test of normality, specify the approximate rejection region for an $\alpha$ = 5% test.
(d) The r for the probability plot correlation test of the data with their nscores was 0.972. The r for the probability plot correlation test of the logarithms of the data with their nscores was 0.963.
What would you conclude from these two results?

12. (3+4 pts) (a) Why might statistics appropriately be considered the science of decision making (under uncertainty)? (b) Why is the concept of an estimator important?

Knowing what you are doing is as important as knowing how to do it. If you don't know what you are doing, how do you know you did it right?
Jery Stedinger

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
Solutions, Final Exam 1998 (revised '01)

1. Gumbel distribution: okay, do we understand how to manipulate a CDF to compute values of parameters?
$x_p = u - \ln[-\ln(p)]/a$. (1 pt)
Have two equations for p = 0.5 and 0.99. Yields a = 1.84; u = 2.30. (5 pts)
Thus $F(8.0) = \exp[-\exp\{-a(8.0 - u)\}] = 0.999972$; $\Pr[R \ge 8] = 0.00003$. (1 pt)
Gumbel is a reasonable choice because we are considering the LARGEST rainfall event in the year, and thus in concept the largest of many storms that occurred. (1 pt)

2. Lognormal: do not need all of the log-space moments, but can calculate them.
$\sigma^2_{\ln A} = \ln[1 + (\sigma_A/\mu_A)^2] = 0.0862$; $\mu_A = 50$ -> $\mu_{\ln A} = \ln(50) - 0.5\,\sigma^2_{\ln A} = 3.87$ (3 pts)
median $W = 120$ => $\mu_{\ln W} = \ln(120) = 4.787$; $\sigma^2_{\ln W} = \ln[1 + (0.8)^2] = 0.495$ (3 pts)
$\ln F \sim \mathrm{Normal}(\mu_{\ln A} + \mu_{\ln W},\ \sigma^2_{\ln A} + \sigma^2_{\ln W}) = N[8.656,\ 0.581]$ (2 pts)
$\mu_F = \exp(\mu_{\ln F} + 0.5\,\sigma^2_{\ln F}) = 7684$
$\sigma^2_F = \mu_F^2 \{\exp(\sigma^2_{\ln F}) - 1\} = 46{,}499{,}904 = (6819)^2$
StDev(F) = 6819 (2 pts)

3. Estimators. First compute the pdf $f(x) = 2.718\,(3)\,k\,(kx+1)^2 \exp[-(kx+1)^3]$ (2 pts)
Then the log-likelihood function for n observations is
$\ln L(k) = n \ln(3 \cdot 2.718) + n \ln(k) + 2 \sum_i \ln(k x_i + 1) - \sum_i (k x_i + 1)^3$ (2 pts)
Taking the derivative w.r.t. k yields
$0 = n/k + 2 \sum_i x_i/(k x_i + 1) - \sum_i 3 x_i (k x_i + 1)^2$
Solve for k. (2 pts)
(b) In large samples MLEs should be more efficient (small variance and bias) and have strong theoretical motivation. Unfortunately MLEs are often more difficult to compute than method-of-moments estimators. In small samples, moment estimators are often as good or better, as well as often being simpler. (5 pts)
(c) Bias is only one type of error. Variance describes another.
Thus one might consider estimators that have minimum mean square error, which is a combination of the two and describes the total error. Also for complex problems, unbiased estimators are often not available, so if the bias is small compared to the standard error ($\mathrm{Bias}^2 \ll \mathrm{Var}$), it does not really matter much overall. (4 pts)

4. Need to avoid bias. Could use random numbers to select a representative random sample or subset of students in the class from the class list. Could also take every fifth student in the list if we believe the ordering on the list is not related to satisfaction in the course. Need to make sure we actually get a response from every student selected! Otherwise non-response bias could be important. Do not rely on volunteers; they may as a group have an axe to grind.
If seniors / juniors have different satisfaction levels (or students by grade: A, B, ≤ C; men & women; or college), then could define such categories to be different strata, and consider weighted means of responses from students in different strata, with weights chosen to reflect strata size. This can reduce the variance of the estimated mean response. (8 pts)

5. $\lambda$ = 10/100 liters = 0.1 per liter. (1 pt)
(a) $\nu = \lambda V = 5$ for V = 50 liters. For the Poisson distribution, mean = $\nu$ = 5; standard deviation = $\sqrt{\nu}$ = 2.24. (3 pts)
(b) Gamma distribution. E[volume to find 20] = $20/\lambda$ = 200 liters. (4 pts)
Var[volume to find 20] = $20/\lambda^2$ = 2000 liters$^2$ = (45 liters)$^2$ -> SD = 45 L.
(c) $\Pr[K \ge 1 \mid \nu = 1] = 1 - 0.368 = 0.63$. (3 pts)
(d) If oocysts are located INDEPENDENTLY throughout the water sample, the density/concentration is constant, and if TWO do not happen together (they do not stick together), then the assumptions of a Poisson process are met. (3 pts)

6. $t_{0.025,19} = 2.093$ (1 pt); 95% CI = $\bar{x} \pm s\, t_{0.025,19}/\sqrt{n}$ = 99.35 to 99.49. (3 pts)

7. Some students incorrectly used a t-test; t = sqrt(20) (99.42 - 99)/s = 11.74.
SIGNIFICANT, but the wrong test to use. NEED sign test to use counts.
a) Ho: p(+) = p(-) = 0.5; Ha: p(+) > 0.5
b) Reject when one sees ≥ 15 plus signs (≤ 5 minus signs). (2 pts)
c) p-value for 14 pluses (6 minuses) is 5.8%. (2 pts)
d) If p = 0.75, $\Pr(S \le 14 \mid p = 0.75)$ = 38.3% (Table A.1 with n = 20). (2 pts)
e) Probability of accepting Ho when it is true is 97.9% (NOT exactly 95% because the actual type I error was not 5%!). (2 pts)

8. Paired data.
(a) Ho: $F_S = F_C$, where S are sleep data and C are cram data. Ha: $F_S$ stochastically different from $F_C$. (2 pts)
Alternatively Ho: $\mathrm{median}_S = \mathrm{median}_C$ versus Ha: $\mathrm{median}_S \ne \mathrm{median}_C$.
(b) Two-sided test: reject if $S_+ \ge 79$ or $S_+ \le 26$. 79 + 26 = 105 = 14*15/2 total. (2 pts)
(c) Get $S_+ = 84$. $\Pr[S_+ \ge 84]$ = 2.5%; p-value = 5%. (4 pts)
(d) 1-sample t value = 2.17 with df = 13, hence p-value a little less than 5%. (4 pts)
(e) d = 0.6; df = 13; beta ~ 55%. (2 pts)

9. a) Does the new method result in lower costs? These are 16 DIFFERENT cases.
H0: $F_N = F_T$. No difference. (2 pts)
Ha: $F_N > F_T$. Costs under New are stochastically less than under the Traditional approach.
Data provide two independent samples, one with the Traditional and one with the New method.
b) Use the Wilcoxon-Mann-Whitney rank sum test. (2 pts)
W = sum of ranks of traditional (bigger) results; ($\alpha$ = 5%) reject if W ≥ 84. (2 pts)
c) Observe W = 91. p-value a little less than 1%. (4 pts)
d) 2-sample t = -1.58; compute df = 11.99 -> 12. p-value 7% (MS Excel). (4 pts)
For an $\alpha$ = 5% test, $t_{obs}$ is not less than $-t_{0.05,12} = -1.782$, so must accept H0 (cannot reject Ho). But why should the data be normal? Look at that -32 value!

10. a) a = -307.2473; b = 3.6015 (5 pts)
b) $s_e^2$ = 3633 = (60.27)$^2$; $s_e$ = 60.27 (5 pts)
c) StDev(b) = 1.452 (5 pts)
d) Expect weight to increase with height -> Ho: $\beta = 0$. Reject if T > $t_{0.05,10}$ = 1.812. (10 pts)
For these data obtain t = 2.477, so for df = 10 obtain p-value = 0.02 (about). One-sided: we expect weight to increase with height.
e) For 183 cm height.
Prediction for y is 352 kg. (5 pts)
A 95% CI for this MEAN value (not a future observation) is:
$a + b x \pm t_{0.025,10}\, s_e \sqrt{1/n + (x - \bar{x})^2 / \sum_i (x_i - \bar{x})^2}$
SD of mean is 25.7, so for $t_{0.025,10}$ = 2.228 obtain a 95% confidence interval for the mean of 295 to 409 kg. WHOW, for the mean!
f) $R^2$ = (explained sum-of-squares)/(total sum-of-squares) = $1 - (n-2)\, s_e^2 / [(n-1)\, s_y^2]$ = 0.380 (5 pts)

11. Probability plot can be obtained by plotting $x_{(i)}$ as a function of $\Phi^{-1}(p_i)$, or equivalently using normal probability paper and plotting $x_{(i)}$ as a function of $p_i$. MINITAB and DataDesk can do it; I used EXCEL, even though the plot is not as nice.
a) With $p_i = (i - 3/8)/(n + 1/4)$:

Rank   p_i     Phi^-1(p_i)   x_(i)
1      0.038   -1.769         16
2      0.100   -1.282         17
3      0.162   -0.988         22
4      0.223   -0.762         40
5      0.285   -0.569         46
6      0.346   -0.396         65
7      0.408   -0.233         70
8      0.469   -0.077         80
9      0.531    0.077         88
10     0.592    0.233         98
11     0.654    0.396         98
12     0.715    0.569        103
13     0.777    0.762        118
14     0.838    0.988        124
15     0.900    1.282        155
16     0.962    1.769        210

(5 pts)
(b) Does not look really straight; maybe bends upwards? (2 pts)
(c) For n = 15 reject if r < 0.94 for 5% test. (1 pt)
(d) Well, cannot reject normality or lognormality. However, normal gives a much better fit and I would recommend it. (2 pts)

12. (a) Statistics is the science of decision making under uncertainty because hypothesis testing codifies a systematic procedure for choosing between two (or more) hypotheses when data do not precisely determine what is true. Statistics addresses how to set up alternate scientific hypotheses and use data to determine which is likely to be correct given the data that are available, and perhaps whether either can be rejected with the available evidence. This is the basis of the scientific method. (4 pts)
(b) The concept of an estimator is very important because an estimator is an explicit definition of how we will guess the value of a parameter given the available data.
With such an explicit definition we can study the sampling properties of such guessing procedures: bias, variance, and other measures of accuracy. This allows alternative estimators to be compared, and poor estimation schemes to be rejected. By using confidence interval estimators the precision (uncertainty) of the estimators is made explicit. (4 pts) Estimators allow us to, in some instances, develop accurate estimates of characteristics of a population, without observing or sampling every possible outcome. ...
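
The arithmetic in solutions 1, 2, and 7 can be re-derived with a short script. What follows is a minimal sketch, assuming only the Python standard library (Python 3.8+ for `math.comb`); the function names are illustrative and do not come from the exam or the course. It fits the Gumbel parameters from the two stated quantiles, combines the lognormal moments for F = A*W, and evaluates the binomial tails used in the sign test.

```python
import math


def gumbel_params(x1, p1, x2, p2):
    """Fit Gumbel (a, u) from two quantiles via x_p = u - ln[-ln(p)]/a."""
    y1 = -math.log(-math.log(p1))  # reduced variate at p1
    y2 = -math.log(-math.log(p2))  # reduced variate at p2
    a = (y2 - y1) / (x2 - x1)
    u = x1 - y1 / a
    return a, u


def gumbel_exceedance(x, a, u):
    """Pr[X >= x] = 1 - exp(-exp(-a (x - u))), computed stably with expm1."""
    return -math.expm1(-math.exp(-a * (x - u)))


def lognormal_product(mean_a, sd_a, median_w, cv_w):
    """Mean and standard deviation of F = A*W for independent lognormal A, W."""
    var_ln_a = math.log(1.0 + (sd_a / mean_a) ** 2)
    mu_ln_a = math.log(mean_a) - 0.5 * var_ln_a
    mu_ln_w = math.log(median_w)          # median of W is exp(mu_lnW)
    var_ln_w = math.log(1.0 + cv_w ** 2)
    mu_ln_f = mu_ln_a + mu_ln_w           # log-space means add
    var_ln_f = var_ln_a + var_ln_w        # log-space variances add (independence)
    mean_f = math.exp(mu_ln_f + 0.5 * var_ln_f)
    sd_f = mean_f * math.sqrt(math.expm1(var_ln_f))
    return mean_f, sd_f


def binom_upper_tail(n, k, p):
    """Pr[S >= k] for S ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k, n + 1))


if __name__ == "__main__":
    a, u = gumbel_params(2.5, 0.5, 4.8, 0.99)
    print("Problem 1: a, u, Pr[R >= 8] =", a, u, gumbel_exceedance(8.0, a, u))

    mean_f, sd_f = lognormal_product(50.0, 15.0, 120.0, 0.80)
    print("Problem 2: mean(F), sd(F) =", mean_f, sd_f)

    print("Problem 7b: type I error of 'reject if S >= 15' =",
          binom_upper_tail(20, 15, 0.5))
    print("Problem 7c: p-value for 14 pluses =", binom_upper_tail(20, 14, 0.5))
    print("Problem 7d: Pr[S <= 14 | p = 0.75] =",
          1.0 - binom_upper_tail(20, 15, 0.75))
```

Run as a script, the printed values should round to the figures quoted in the solutions (a = 1.84, u = 2.30, mean 7684, standard deviation 6819, and sign-test probabilities near 2.1%, 5.8%, and 38.3%).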