2000 Final Exam Solutions

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
2000 Final Exam
December 13, 2000

Exam is open notes and open book. It lasts 150 minutes and there are 150 points. SHOW WORK!

1. (7 points) Central Limit Theorem
(a) What is the Central Limit Theorem (CLT)?
(b) Why is it thought to be important?
(c) Provide one good example wherein the CLT can appropriately be applied in the field of civil, biological, or environmental engineering.

2. (15 points) Last week in Florida, an election inspector was running ballot cards through a mechanical counting machine. The machine counts 25,000 cards an hour. The machine on average rejects 3 ballots a minute because of a poorly punched chad, bad card alignment, or another problem.
a) Why might a Poisson process model be appropriate for describing the arrival in time of rejected ballots?
b) What are the mean and the standard deviation of the number of ballots rejected in an 8-hour shift?
c) What is the probability that no cards are rejected in the next minute?
d) What are the mean and standard deviation of the time that would pass before 100 ballots are rejected?
e) If the probability a ballot is marked for Gore is 50%, what is the probability exactly 3 of the 5 ballots just rejected are marked for Gore?

3. (15 points) An environmental engineer is modelling the flux of phosphorus into a lake. As an initial approximation the engineer assumes that the annual inflow Q is lognormally distributed with mean 1 million cubic meters and standard deviation 0.22 million cubic meters. [Thus the mean and standard deviation of ln(Q) are 13.792 and 0.217.] The average concentration C in the inflow is lognormal with mean 25 mg/l and a coefficient of variation of 40%. The flux F equals C*Q. Assuming C and Q are independent, what are the median, mean and variance of the flux F? (Note 1000 liter = 1 cubic meter.)

4. (15 points) Let $O_y$ be the maximum 8-hour-average ozone concentration observed in year y.
(a) Why is a Gumbel model a reasonable model for this random variable?
(b) If the mean of $O_y$ for a particular city is 25 ug/m3 with a coefficient of variation equal to 35%, what maximum concentration is exceeded with a probability of only 5% in any year?

5. (5 points) Sam, a structures student on a summer job, got into an argument about the weight of narrow cinder blocks. To resolve the conflict he measured the weight of 16 blocks and found their sample average was 31.8 lbs, with a standard deviation of 0.7 lbs. Construct a 95% confidence interval for the true mean weight of cinder blocks. The foreman wants to know if the true mean weight is in the interval you just computed. What should you tell him?

6. (25 points) The CEE and ABEN students in CEE 304 got into an argument about who was more cultured. They decided to resolve the issue by finding out how many non-technical books students had read over the last two summers. The answers came out to be too discrete, so they instead decided to consider the total number of pages in non-technical books CEE and ABEN students had read. (This allowed for students who had not read a whole book.) Use a non-parametric test to resolve the debate with the (hypothetical) data below for students who could be contacted.
(a) What are the appropriate null and alternative hypotheses?
(b) What is the rejection region with a type I error of 10%?
(c) What is the p-value for this data set? What do you conclude?

          ABEN    CEE
   1       579    673
   2      1330     76
   3       754    366
   4      2069   1062
   5       465    485
   6       577    609
   7       613    369
   8      2810    154
   9              1976
  10               280
  mean    1149    605
  sd       862    559

(d) What are the appropriate null and alternative hypotheses for a t test?
(e) What is the rejection region with a type I error of 10% for a t test?
(f) What is the p-value for this data set for the t test?

7. (35 pts) Prof.
Stedinger is involved with the College's Student Experience Committee. They are proposing a new course evaluation scheme. Suppose they tested their new course questionnaire by using both the old questionnaire and the new questionnaire in the classes of 12 professors. Here are the results:

       Professor            Old    New    Difference (Old-New)
   1   Borum                2.62   2.71   -0.09
   2   Spock                4.55   4.49    0.06
   3   Minus                2.25   2.26   -0.01
   4   Bitterman            2.93   3.03   -0.10
   5   Hanson               3.63   3.79   -0.17
   6   Janus                3.23   3.27   -0.04
   7   Maxims               2.58   2.79   -0.22
   8   Passum               4.39   4.57   -0.17
   9   Brighten             4.79   4.60    0.19
  10   Stedinger            2.56   2.80   -0.25
  11   Lion                 4.70   4.73   -0.03
  12   Failum               2.38   2.92   -0.54
       average              3.38   3.50   -0.11
       standard deviation   0.98   0.89    0.18

Before the test was conducted, the committee suspected faculty would get higher scores on the new evaluation. Test this hypothesis.
a) What are the appropriate hypotheses for a nonparametric Wilcoxon test?
b) For $\alpha$ = 5%, what is the rejection region for the Wilcoxon test?
c) What is the p-value for this sample using that test?
d) What are the appropriate hypotheses for a sign test?
e) What is the p-value for this sample using a sign test?
f) What are the appropriate hypotheses for a t test?
g) What p-value do you obtain?
h) With $\alpha$ = 5% and a sample size of 15, if the true mean of (Old - New) were -0.12, and the true standard deviation were 0.2, what would be the t-test type II error?
i) Which test would you recommend for use with this data? WHY?
j) In parts (c), (e) and (g) you computed p-values. What information do p-values convey that makes them useful to report in scientific writing?

8. (30 pts) The geotechnical laboratory is concerned with the settlement likely to take place over a 10-year period after construction. They have developed a laboratory soil-compression test which produces a condensation index C that is thought to predict 10-year settlement S.
Here are 26 observations from Japan:

  $S_i$:  0.76  0.95  2.87  ...  15.4  14.76  cm
  $C_i$:  0.25  0.35  0.86  ...  3.60  3.73   cm

                  $S_i$   $C_i$
  average         9.32    2.35
  standard dev.   3.91    0.94

  $\sum (S_i - \bar{S})(C_i - \bar{C}) = 90.78$ cm2

a) If one fit a linear predictive model for settlement S as a function of the condensation index C: $S = \alpha + \beta C + \varepsilon$, what would be the least-squares estimators of the two coefficients? Here $\varepsilon$ represents the unexplained error for this model.
b) What is an unbiased estimator of the variance of the errors $\varepsilon$?
c) Extensive testing in the US found that $\beta = 3.7$; construct a test with a type I error of 1% to see if the new Japanese data is consistent with the US results.
d) The settlement has not yet been measured at the Tamaka residence, but the value of C there is 3.1 cm. Provide a 95% predictive interval for the value of S that will be measured at that location.

9. (6 points) Students have been casting concrete cylinders to test the strength of a particular mix. Here are the results from 12 samples.
5610, 4440, 3430, 5760, 6320, 5640, 2025, 4275, 4005, 6530, 5240, 4120
(a) Plot these data on the attached probability paper.
(b) From the plot what do you think about the normality of the data? WHY? (Prof. Stedinger found that r = 0.974)

Have a happy holiday. And please, strive to become a type II engineer.

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
Solutions Final Exam 2000

1. Central Limit Theorem.
a) States that the sum or average of n independent and identically distributed random variables (a random sample) approaches a normal distribution as n becomes large.
b) It means that a normal model should be a good description of many phenomena that result from the additive effect of many independent factors. It also means that, regardless of the original distribution of the data, the sample average for large n will be approximately normally distributed, allowing the development of statistical tests. NEAT!
c) Many answers. Example: annual flows for the year (sum many days), average annual temperature (avg.
many days), annual rainfall (total many storms), strength of a cable which reflects the strength of many strands.

2. Poisson process. Compute $\lambda$ = 3 arrivals per minute.
a) Many possible failures but only a few happen. They arrive singly (one at a time), independently in time, with constant rate. (2 pts)
b) For Poisson dist, mean = $\mu = \nu = \lambda t$ = 3*8*60 = 1440; SD = $\sqrt{\nu}$ = 38 (3 pts)
c) $\Pr\{K = 0 \mid \nu = 3\} = \exp[-3] = 0.0498 \approx 5\%$ (2 pts)
d) Gamma distribution (not required). k = 100; $\lambda$ = 3. $E(T_{100}) = 100/\lambda$ = 33.33 minutes; $Var(T_{100}) = 100/\lambda^2$ = 11.11, so SD = 3.33 minutes (4 pts)
e) Binomial distribution: (5 choose 3)*(0.5)^5 = 10(0.03125) = 0.3125 (4 pts)

3. Lognormal model for products: need the log-space mean and variance, which we can then add.
We are given: $\sigma^2_{\ln Q} = 0.0473$, $\mu_{\ln Q} = 13.792$ (2 pts)
$\sigma_C = 0.4 \times 25 = 10$ mg/l => $\sigma^2_{\ln C} = \ln[1 + (\sigma_C/\mu_C)^2] = 0.148 = (0.385)^2$ (1+2 pts)
$\mu_{\ln C} = \ln(25) - 0.5\,\sigma^2_{\ln C} = 3.145$ (2 pts)
$\ln F \sim \text{Normal}(\mu_{\ln Q} + \mu_{\ln C},\ \sigma^2_{\ln Q} + \sigma^2_{\ln C}) = N[16.937,\ 0.1957]$ (2 pts)
$\mu_F = \exp(\mu_{\ln F} + 0.5\,\sigma^2_{\ln F})$ = 25,000,000 mg*m^3/l = 25,000 kg, since 1 mg*m^3/l = 1000 mg = 1 g (2 pts)
$\sigma^2_F = \mu_F^2\,\{\exp(\sigma^2_{\ln F}) - 1\}$ = 1.35 E+14; SD = 11,600,000 mg*m^3/l = 11,600 kg (2 pts)
median = $\exp(\mu_{\ln F})$ = 22,700,000 mg*m^3/l = 22,700 kg (2 pts)
One can also compute the mean and variance by direct moment relationships.

4. Gumbel distribution: models of extremes
(a) Gumbel is a reasonable choice because we are considering the maximum concentration among the many, many 8-hour averages in a year. Maximums in different weeks are likely to be independent. (5 pts)
(b) $\alpha^2 = 1.645/Var(O) = 0.0215 = (0.1466)^2$ (3 pts)
$u = E[O] - 0.57721/\alpha = 21.062$ (3 pts)
For p = 0.95 use $x_p = u - \ln[-\ln(p)]/\alpha$ = 41.33 ug/m3 (4 pts)
(Not p = 0.05, as some students employed.)

5. Confidence Intervals.
(a) $t_{0.025,15} = 2.131$ (1 pt)
95% CI = $\bar{x} \pm s\,t_{0.025,15}/\sqrt{n}$ = 31.4 to 32.2 lbs. (2 pts)
(b) That this is a 95% confidence interval means that when intervals are constructed in this fashion, on average, 95% of those intervals (i.e.
most) will ACTUALLY contain the value of the true mean. This particular interval may or may not contain the mean; we do not know. (2 pts)

6. Two-Sample Tests
a) From Prof. Stedinger's viewpoint:
H0: $F_{ABEN} = F_{CEE}$ No difference (2 pts)
Ha: $F_{ABEN} \ne F_{CEE}$ Some difference, and we will see which way it goes.
b) Use the Wilcoxon-Mann-Whitney Rank Sum test. W = sum of ranks of ABEN; $\mu_W$ = 76; $SD_W$ = 11.3. For an $\alpha$ = 10% test, reject H0 if $Z = |W - 76|/11.3 > z_{0.05} = 1.645$ (4 pts)
c) Observe W = 6+8+9+11+13+15+17+18 = 97. Z = 1.87. p-value = $\Pr[|Z| > 1.87]$ = 2(0.031) = 6%. (6 pts)
d) H0: $\mu_{ABEN} = \mu_{CEE}$ No difference (2 pts)
Ha: $\mu_{ABEN} \ne \mu_{CEE}$ And the results will tell us who is more cultured.
e) Need to estimate degrees of freedom. $\nu$ = 11.49 or approximately 11; reject H0 if
$|T| = \left| \dfrac{\bar{X} - \bar{Y}}{\sqrt{s_x^2/n_x + s_y^2/n_y}} \right| > t_{0.05,11} = 1.796$ (5 pts)
Old exams had a pooled t test, but it is thought to be an unreliable test. If in reality the variances are very different, then one may not get the low Type II error one should.
f) Obtain t = 1.55; p-value = 2*(0.075) = 15% (Table A.8 with t = 1.55 & df = 11) (6 pts)

7. Paired Data => One-Sample Tests
a) Use the Wilcoxon Signed Rank Test on differences.
H0: median D = 0; Ha: median D < 0 (expect Old - New < 0) (2 pts)
(b) Sum ranks of the negative differences, where big values have large ranks. Reject H0 if $S_- \ge 61$ (from tables; $S_-$ = 60 is close, with $\alpha$ = 5.5%) (3 pts)
(c) Find $S_-$ = 65. p-value = $\Pr[S_- \ge 65] \approx 2\%$ (64 yields 2.6%) (5 pts)
One can also sum positive differences, and can assign large/small ranks to large/small values. Just need to keep track of what tail you are interested in under Ha. The sum of all ranks is 78.

8. d) For C = 3.1 the mean value of S = a + bC = 12.40; a 95% prediction interval for a future S-value is 11.08 to 13.72, using $t_{0.025,24}$ = 2.064 (w/ SE-prediction = 0.64) in
$a + b x \pm t_{0.025,24}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$ (8 pts)

9. a) Plotting $x_{(i)}$ versus $\Phi^{-1}(p_i)$ yields a probability plot.
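The plotting positions and normal scores behind such a plot can be sketched in Python. This is only an illustration, not the course's own code. It assumes the plotting position $p_i = (i - 3/8)/(n + 1/4)$, which is what the "(i-3/8)/12.25" heading in the solution's table suggests for n = 12, and it uses the strengths exactly as listed in problem 9 (the sorted values printed in the solution differ by a few units). The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

# Concrete cylinder strengths as listed in problem 9 (12 samples).
strengths = [5610, 4440, 3430, 5760, 6320, 5640,
             2025, 4275, 4005, 6530, 5240, 4120]


def plotting_positions(n):
    """Assumed plotting positions p_i = (i - 3/8) / (n + 1/4), i = 1..n."""
    return [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]


def normal_scores(n):
    """Standard normal quantiles of the plotting positions."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(p) for p in plotting_positions(n)]


def ppcc(data):
    """Correlation between the sorted data and their normal scores."""
    x = sorted(data)
    z = normal_scores(len(x))
    x_bar, z_bar = mean(x), mean(z)
    s_xz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    return s_xz / sqrt(s_xx * s_zz)


if __name__ == "__main__":
    n = len(strengths)
    rows = zip(plotting_positions(n), normal_scores(n), sorted(strengths))
    for p, z, x in rows:
        print(f"{p:.3f}  {z:+.3f}  {x}")
    print(f"PPCC r = {ppcc(strengths):.3f}")
```

Plotting the sorted strengths against the printed normal scores gives the probability plot, and the correlation it reports should be close to the r = 0.974 quoted in the problem statement.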
Here is the data:

   i   (i-3/8)/12.25   nscore   sorted obs.
   1   0.051           -1.64    2025
   2   0.133           -1.114   3428
   3   0.214           -0.792   4005
   4   0.296           -0.536   4119
   5   0.378           -0.312   4276
   6   0.459           -0.102   4437
   7   0.541            0.102   5240
   8   0.622            0.312   5607
   9   0.704            0.536   5636
  10   0.786            0.792   5764
  11   0.867            1.114   6318
  12   0.949            1.635   6529

[Figure: Probability Plot, Concrete Data; horizontal axis: Nscore]

b) Data falls very nearly on a straight line, so it is consistent with a normal distribution. The observed value of the probability plot correlation coefficient is 0.974, which is quite high. (One is the maximum.) If the data WERE normal, one would get an r value less than 0.94 with a probability of 10%. Thus we cannot reject normality with a PPCC test for $\alpha$ = 10%.

7. (continued) Sign test uses counts only.
d) H0: p(+) = p(-) = 0.5; Ha: p(+) < 0.5 (2 pts)
e) Observe 2 positive counts (10 negative counts). p-value = $\Pr[K \le 2 \mid p = 0.5] = (0.5)^{12}\,(1 + 12 + 12 \cdot 11/2) = 0.019 \approx 2\%$. WOW! (4 pts)
(f) For a t test, assume data is normal and consider H0: $\mu_D = 0$; Ha: $\mu_D < 0$ (2 pts)
(g) $t = \sqrt{12}\,(-0.11)/0.18 = -2.03$ with df = 11. One-sided p-value is 3.5% (5 pts)
(h) d = (0.12)/0.2 = 0.6. Look with df = 14 (for sample size 15), obtain $\beta$ = 30%. (4 pts)
(i) Use the Wilcoxon test. It is a powerful nonparametric test that does not make unjustified assumptions. The sign test ignores the actual values of the differences, and we had one big negative difference. The t test assumes normality, and we have no assurance that is true: Failum had a very large difference for normal data (over 2 sigmas!), suggesting the data is not normal. (4 pts)
(j) A p-value allows the statistician to report the attained level of significance of the data so that any other individual can determine at what significance levels the null hypothesis could be rejected. (4 pts)

8.
Regression
a) Least squares estimators: a = -0.34; b = 4.11 (10 pts)
b) $s_e^2 = 0.381 = (0.617)^2$; $s_e = 0.617$ (5 pts)
*) $R^2 = 1 - \dfrac{\text{(Sum of squared errors)}}{\text{(Total sum-of-squares)}} = 1 - \dfrac{(n-2)\,s_e^2}{(n-1)\,s_S^2} = 0.976$ (Or square the correlation coefficient r = 0.988; that works too.)
c) Need St.Dev.(b) = 0.13; t = (4.11 - 3.70)/0.13 = 3.12. Reject H0: $\beta = 3.7$ versus Ha: $\beta \ne 3.7$ if $|t| > t_{0.005,24} = 2.797$, and 2.797 < 3.12. Okay! Reject H0 - conclude NOT consistent. (7 pts)
*) Is $\alpha$ really zero? Need St.Dev.(a) = 0.33, where a is the estimator of $\alpha$. We need to make a decision. Consider H0: $\alpha = 0$ versus Ha: $\alpha \ne 0$. Compute t = (a - 0)/StDev(a) = -1.02; the constant is not distinguishable from zero. ...
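The regression numbers in problem 8 can be reproduced from the summary statistics alone. The sketch below is an illustration rather than the course's own method: the function names are hypothetical, and the t critical values (2.797 and 2.064) are taken from the solution text, not computed. It relies on the identities $\sum (C_i - \bar{C})^2 = (n-1)\,s_C^2$ and $\sum (S_i - \bar{S})^2 = (n-1)\,s_S^2$.

```python
from math import sqrt


def fit_from_summary(n, x_bar, y_bar, sd_x, sd_y, s_xy):
    """Least-squares fit y = a + b x from summary statistics only."""
    s_xx = (n - 1) * sd_x ** 2      # sum of squares of x about its mean
    s_yy = (n - 1) * sd_y ** 2      # total sum of squares of y
    b = s_xy / s_xx                 # slope estimate
    a = y_bar - b * x_bar           # intercept estimate
    sse = s_yy - b * s_xy           # residual sum of squares
    se = sqrt(sse / (n - 2))        # standard error of the residuals
    return {
        "a": a,
        "b": b,
        "se": se,
        "s_xx": s_xx,
        "r2": 1 - sse / s_yy,
        "sd_b": se / sqrt(s_xx),
        "sd_a": se * sqrt(1 / n + x_bar ** 2 / s_xx),
    }


def prediction_interval(fit, n, x_bar, x_new, t_crit):
    """Prediction interval for one future observation at x_new."""
    center = fit["a"] + fit["b"] * x_new
    spread = 1 + 1 / n + (x_new - x_bar) ** 2 / fit["s_xx"]
    half_width = t_crit * fit["se"] * sqrt(spread)
    return center - half_width, center, center + half_width


# Values given in problem 8: x is the index C, y is the settlement S.
fit = fit_from_summary(n=26, x_bar=2.35, y_bar=9.32,
                       sd_x=0.94, sd_y=3.91, s_xy=90.78)
t_slope = (fit["b"] - 3.7) / fit["sd_b"]   # test of beta = 3.7 (US value)
t_intercept = fit["a"] / fit["sd_a"]       # test of alpha = 0
low, center, high = prediction_interval(fit, n=26, x_bar=2.35,
                                        x_new=3.1, t_crit=2.064)

print(f"a = {fit['a']:.2f}, b = {fit['b']:.2f}, se^2 = {fit['se'] ** 2:.3f}")
print(f"R^2 = {fit['r2']:.3f}, t(slope) = {t_slope:.2f}")
print(f"t(intercept) = {t_intercept:.2f}")
print(f"S at C = 3.1: {center:.2f}, 95% PI {low:.2f} to {high:.2f}")
```

Working from sums of squares keeps every quantity in parts a) through d) traceable to the five numbers given in the problem. Small differences in the last digit can arise because the published means and standard deviations are themselves rounded.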

This note was uploaded on 02/02/2008 for the course CEE 3040 taught by Professor Stedinger during the Fall '08 term at Cornell.
