CEE304_Final2003 - CEE 304 - UNCERTAINTY ANALYSIS IN...

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
2003 Final Examination
9:00 - 11:30, Thursday, December 18, 2003 (revised)

Exam is open notes and open-book. It lasts 150 minutes and there are 150 points. PLEASE - SHOW WORK!

1. (12 pts -- 2 pts each.) Short and true-or-false answers
a. The Sample Space is a collection of all possible outcomes of an experiment. (T/F)
b. The great power of the idea of an estimator is that we can determine which of several estimation procedures will always give the better answer. (T/F)
c. The Central Limit Theorem assures us that the results of experiments will always be normally distributed. (T/F)
d. The sample variance is an unbiased estimator of the population variance. (T/F)
e. The sample average of n observations has a standard deviation of ? (Ans in exam bk.)
f. For normal data, the standard deviation of the sample variance S² is ? (Ans in exam bk.)

2. (6 pts) What fundamental problem does the collection of knowledge and mathematical relationships called "probability theory" address? Why is the concept of a random variable so important?

3. (7 pts) A hydrologist is modeling the largest 6-hour rainfall depth observed in each year of record. Assuming that observed annual maxima have a Gumbel distribution with mean 2.3 inches and standard deviation 0.8 inches, what rainfall depth has only a 0.005 probability (0.5%) of being exceeded in any year? WHY is a Gumbel distribution a reasonable model for such phenomena?

4. (15 pts) A BEE student needs to create a stock solution of a reagent that should have a concentration of 2 mg/l. The previous night they were up to 5 am doing CEE 304 homework. As a result, when they measure the reagent, the mean weight of the reagent they obtain is 4.00 milligrams with a standard deviation of 0.12 milligrams. They add the reagent to a measured volume of water, which has a MEDIAN of 2.00 liters and a Coefficient of Variation of 5%.
Using lognormal models, what are the mean and standard deviation of the concentration of the reagent in their solution?

5. (14 pts) Computer security is a tremendous problem. A civil engineering consulting firm has installed new computer security software to protect their files and web site. They want their clients to be able to access files describing the status of ongoing projects and proposed designs, but the information should not be made public or damaged. The new software detects on average 12 serious and independent attempts to penetrate the system every hour.
a) Why might a Poisson process be a reasonable model of such attacks? Why might such a model not be appropriate?
Assuming a Poisson process:
b) What is the probability that there are exactly two attacks in a specific 10 minute period?
c) What are the mean and variance of the number of attacks in a week (168 hours)?
d) What are the mean and variance of the time until the system experiences 1,000 attacks?

6. (20 pts) A common and natural problem addressed by statistical tests is whether two samples show that the respective populations are different. Over the years Prof. Stedinger has come to believe that women are generally more conscientious at doing homework than men. For an interesting example he grabbed some average homework grades for assignments 1-11 in CEE 304 this year, excluded graduate students, and sorted the scores by sex. The real numbers are:
[Table of homework grades by sex, with summary rows for average, st. dev, Count, and Sum; the values did not survive extraction.]
(a) What are the appropriate null and alternative hypotheses for a good non-parametric test?
(b) What is the rejection region for a non-parametric test with a type I error of 5%?
(c) What is the p-value for this data set? What do you conclude?
(d) What value of t do you obtain for a two-sample t test?
What are the effective degrees of freedom (approximately)? What is the p-value?
(e) Would you recommend a non-parametric test or a t test with this data set? WHY?

7. (25 pts) Professor Watchum is concerned about the ability of his students to use a new inexpensive laboratory instrument. If they are not careful, they are likely to get answers that are too large. To test his concern he observes results from different student teams using the instrument for several experiments. Below are the values they should have recorded (measured independently by Prof. Watchum with an expensive and very precise instrument), and the values recorded by the students.
[Table for experiments 1 to 10, beginning with a True Value column and ending with summary rows for Average and St. Deviation; the values did not survive extraction.]
Last two columns provide the ranks of the absolute value of differences.
(a) Specify appropriate null and alternative hypotheses for the tests in parts (b)-(c)-(d) below.
(b) What is the rejection region for a sign test with a type I error of about 5%? What p-value do you obtain with this data? If the probability of a positive sign is really 0.8 (the model tends to overpredict), what would the type II error be for your α = 5% test? [For n = 10, Devore has binomial tables.]
(c) What is the rejection region with α = 5% for the appropriate powerful non-parametric test? What p-value do you obtain with this data?
(d) What is the rejection region for a t test with α = 5% for this data? What p-value do you obtain with this data set?
(e) With normal data, how large a sample is needed to ensure the type II error is < 2% with the t test when σ_Diff is 100 and the mean difference is actually μ_Diff = 100?
(f) If the data is normal, for n = 10 how big a mean difference is needed to ensure β ≈ 0 with a t test were the standard deviation of the differences really equal to 100?
(g) For this particular data set, which test (sign, W-Wilcoxon, or Student t) would you recommend and WHY? (Type II engineering!)

8. (5 pts) Assuming the data in #7 are normal, use a Student t distribution to construct a 95% confidence interval for the mean difference between the measured and true values.

9. (8 pts) One of the students in Prof. Watchum's class asked how Watchum knew the differences were normal.
(a) Construct a probability plot to allow him to examine the hypothesis of normality (probability-paper attached).
(b) The correlation between the sorted observations and the nscores is 0.8656 -- what should Prof. Watchum conclude?

10. (38 pts) Winter is here. A hydrographer is concerned with the variation of snowpack in the mountains with altitude and exposure to prevailing winds. Just considering altitude for slopes facing the source of moisture, she obtains the following data on the snowpack depth (D) in millimeters on the first of May, and altitude (H) in meters:
Altitude, Hi: 2440 3720 3420 3180 4150 1720 meters
Depth, Di:    1395 1304  872 1041 1556  622 millimeters
n = 27; D̄ = 1164 mm; sD = 368; H̄ = 2761 m; sH = 727.8; Σ(Hi - H̄)(Di - D̄) = 2,951,520
a) What are the least-squares estimates of the coefficients of the linear model D = α + β H + ε, wherein the term ε describes errors about the model prediction?
b) Compute an unbiased estimate of the variance of ε.
c) What fraction of the observed variability in snowpack depth is explained by altitude?
d) What is the value of adjusted R², and why does its value differ from the number you reported in part (c) above?
e) What is the standard deviation of b, the estimator of β?
f) A concern is whether there is any relationship between altitude and snowpack depth. What is the rejection region for a 1% test of whether or not β is zero? WHY have you selected either a one-sided or two-sided test?
What p-value do you obtain with this data set?
g) What is a 95% prediction interval for the snowpack depth that could be measured at a particular site with an altitude of 3,000 meters?

This has been a fun class this year. Thank you for joining me. And remember, while everyone needs to be like Rob and Art sometimes, the Michelles are the leaders with vision who make the world work better.

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
Solutions, Final Exam 2003

1. Short answer: True, No (for a particular sample, different things can happen), No, Yes, σ/√n, sqrt[2σ⁴/(n-1)].

2. Probability: When the outcome of an experiment (whether it be a measurement, an act of nature, or a game of chance) is uncertain, we need a language to describe the likelihood of different outcomes, or the uncertainty associated with the outcome of the experiment. Probability addresses how, from the rules by which points are selected in a sample space, we can compute the likelihood of different sets of possible outcomes. A random variable is a numerical measure of the outcome whose probability distribution (describing the likelihood different values occur) can be determined. It is a real-valued function which assigns a real number to every point in the sample space. Thus it links in a valuable way the probability of different events to numerical values addressing issues of concern in engineering, and thus allows us to compute the probability that various numerically described outcomes might occur.

3. Gumbel: 7 pts maximum: points lost if answer not provided.
Given μ = 2.3 and σ = 0.8, need to get the Gumbel quantile: x_p = u - ln[-ln(p)] / a.
First σ² = π²/(6a²) => a = π/(√6 × 0.8) = 1.603 (2 pts)
Second μ = u + 0.5772/a => u = μ - 0.5772/a = 1.940 (2 pts)
Then for p = 0.995 get x_0.995 = u - ln[-ln(p)] / a = 5.24 (3 pts)
Gumbel is appropriate because the annual maximum is the largest of many values whose distribution is unbounded above. (2 pts)

4. Lognormal - This is just like lecture example, and other examples. First get log-space moments. Consider the Weight W of reagent, and the Volume V of water into which it is mixed, where as a result C = W/V.
σ²_lnW = ln[1 + (σ_W/μ_W)²] = ln[1 + (0.12/4.00)²] = (0.030)²; μ_lnW = ln(μ_W) - 0.5 σ²_lnW => μ_lnW = 1.386 (5 pts)
For V it is easier because μ_lnV = ln(median of V) = ln(2) = 0.693; σ²_lnV = ln[1 + (0.05)²] = (0.050)² (5 pts)
ln C = ln W - ln V. Independence => Variances add.
ln C ~ Normal(μ_lnW - μ_lnV, σ²_lnW + σ²_lnV) = N[0.693, (0.058)²] (1 pts)
μ_C = exp(μ_lnC + 0.5 σ²_lnC) = 2.002; σ²_C = [μ_C]² {exp(σ²_lnC) - 1} = 0.0136; σ_C = 0.117; CV = 0.058 (4 pts)
When one multiplies independent lognormal random variables, the result is always a lognormal distribution. This is a nice model for products and ratios.

5. Poisson process.
a) Would be Poisson if arrivals occur independently over time at a constant rate. Would not be Poisson if arrival rate changes over time, or if attackers coordinate their attack for some period of time. (5 pts)
b) For Poisson process, λ = 12 per hour. Pr[K = 2 | ν = λt = 12(1/6) = 2] = ν² exp(-ν) / 2! = 0.27 (3 pts)
c) Number in a week has a Poisson distribution: μ = σ² = ν = λt = 12(168) = 2016 (3 pts)
d) Gamma dist. (not required). k = 1000; E(T_1000) = 1000/λ = 83.33 hrs; Var(T_1000) = 1000/λ² = 6.94 hrs² -> SD = 2.64 hrs (3 pts)

6. Two-Sample Nonparametric and t Tests
a) H0: M_women = M_men. No difference in medians. (2 pts)
Ha: M_women > M_men. Women do better. Men stochastically smaller.
b) Use Wilcoxon-Mann-Whitney Rank Sum test. W = sum of ranks of women. Expect them to have large ranks.
E[W] = 323; Var[W] = 1076 = (32.8)²
Reject H0 if Z_test > 1.645 or equivalently W ≥ 323 + 32.8(1.645) = 377 (4 pts)
wherein Z_test = [W - E[W]] / SD[W]; E[W] = 323, SD[W] = 32.8
c) Observe W = 377. P-value = 5%; barely REJECT H0. (2 pts)
Those women are conscientious.
d) Lots of computation. See formulas in notes. Rejection region T < -t_0.05,34 = -1.691.
Find t = -1.52; d-of-f = 33.88 ~ 34. P-value 7% (one-sided) (8 pts)
e) Why should this data be normal? Look at the terrible scores some students get: 9 points and even the 40s; while many others in 80s, 90s and approaching 100. Data clearly not normal. Use a nonparametric W test and avoid unneeded and incorrect assumption of normality. T test would not have the advertised type I error. Moreover, W test likely to be MORE powerful than a t test. (4 pts)

7. Paired Data => One-Sample Tests
a) Different answers for each part.
b) Sign test uses counts only. Consider S, the number of negative signs. (6 pts)
H0: p(+) = p(-) = 0.5; Ha: p(-) < 0.5
From tables for α ≈ 5%, reject if S ≤ 2 [actual α = 5.5%].
Observe s = 2, p-value = 5.5%.
Were S ~ Bin(10, p = 0.2), β = Pr[S ≥ 3] = 1 - 0.678 = 32%
c) Use Wilcoxon Signed Rank Test (6 pts)
Use test on differences D = (Observed - Estimated). Sum ranks of the positive differences.
H0: median D = 0 or H0: F_TValue = F_Meas. Both imply no difference.
Ha: median D > 0 or Ha: F_TValue > F_Meas, where big values have large ranks.
Reject H0 if S+ ≥ 44.
Observe S+ = 47; Pr[S+ ≥ 47] = 0.024 => P-value = 0.024. REJECT
d) For a t test, assume data is normal and consider (5 pts)
H0: μ_D = 0; Ha: μ_D > 0; Reject if T > t_0.05,9 = 1.833
t = √10 (90.4) / 175.9 = 1.626 with df = 9. p-value = 0.0692 = 7% from Excel. Must accept H0.
That is interesting. W test sees a very significant result. T test does not. The data is not normal so t test is not very efficient, if it is even correct.
e) For d = μ_D/σ_D = 100/100 = 1 and α = 5%, with a 1-sided t test, to get β < 2% need df = 14, hence need n = 15 (using Table A.17) (2 pts)
f) For α = 5%, n = 10 and a 1-sided t test, to ensure β ≈ 0% need d = 1.4 (find from Table A.17) so that μ_D = 1.4 σ_D = 140 (2 pts)
g) Use the nonparametric signed-rank test -- clearly the data are not normal, which rules out a small-sample t test: it is neither valid (type I error will not have anticipated value) nor efficient; sign test is not very powerful: ignores magnitude of differences. (4 pts)

8. Confidence Intervals
t_0.025,9 = 2.262 so a 95% CI is 90.4 ± 2.262 (175.9)/√10 = -35.4 to 216.2 (5 pts)
BUT I would not BELIEVE this CI -- data is not normal. (Or CI is: -216 to +35)

9. Probability plot - here is the data using p_i = (i - 3/8)/(n + 1/4)
 i    p_i     z(p_i)    x(i)
 1   0.061   -1.547   -102.8
 2   0.159   -1.000     -6.2
 3   0.256   -0.655     18.0
 4   0.354   -0.375     20.2
 5   0.451   -0.123     20.3
 6   0.549    0.123     23.1
 7   0.646    0.375     27.3
 8   0.744    0.655    169.2
 9   0.841    1.000    212.1
10   0.939    1.547    522.9
Correlation is 0.8656; this is less than the critical point for 1%: P-value < 1%!
[Figure: Probability Plot, Watchum Data Set]
So reject normality! This data is not normal.

10. Regression
a) Least squares estimators: a = 573; b = 0.214 (8 pts)
b) s_e² = 115,800 = (340)² (4 pts)
c) R² = 1 - (n-k) s_e² / [(n-1) s_y²] = 0.179 (4 pts)
d) Adjusted R² = 1 - s_e²/s_y² = 0.146; has a smaller value because it corrects for the number of coefficients estimated. Good for multivariate regression. (4 pts)
e) Compute St.Dev.(b) = 0.0917 (4 pts)
f) Reject H0: β = 0 versus Ha: β > 0 if T > t_0.01,25 = 2.485. I used a one-sided test because almost always snowpack increases with altitude. But others may use a two-sided test if they did not have this prior belief. Observe t = 2.337. p-value close to 1.4% (> 1%) so accept H0.
For a two-sided test, t_0.005,25 = 2.787; the p-value for 2.337 is 2.8%. (7 pts)
g) For H = 3000 m the mean value of D = a + b(3000) = 1,216 mm; a 95% prediction interval for a value of D to be measured at H = 3000 is 500.5 to 1,930.1, using t_0.025,25 = 2.060 (with an SE of 347.2) in the equation:
a + b H ± t_0.025,25 · s_e · sqrt[ 1 + 1/n + (H - H̄)² / Σ(Hi - H̄)² ] (7 pts)
NOTE: the 95% interval for the mean when H = 3000 is 1,073 to 1,358, which is smaller!
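
The regression answers in Problem 10 can be recomputed from the summary statistics alone. The following is a minimal Python sketch offered as a cross-check, not part of the original solution key. The function names `fit_from_summary` and `prediction_interval` are hypothetical, and the critical value 2.060 is copied from the key rather than computed.

```python
import math

# Summary statistics quoted in Problem 10 (n = 27 sites).
N = 27
D_BAR, S_D = 1164.0, 368.0   # mean and st. dev. of depth D (mm)
H_BAR, S_H = 2761.0, 727.8   # mean and st. dev. of altitude H (m)
S_HD = 2_951_520.0           # sum of (Hi - Hbar)(Di - Dbar)
T_025_25 = 2.060             # t critical value copied from the solution key


def fit_from_summary():
    """Least-squares fit of D = a + b*H using only summary statistics."""
    s_hh = (N - 1) * S_H ** 2          # sum of (Hi - Hbar)^2
    s_dd = (N - 1) * S_D ** 2          # sum of (Di - Dbar)^2
    b = S_HD / s_hh                    # slope estimate
    a = D_BAR - b * H_BAR              # intercept estimate
    sse = s_dd - b * S_HD              # residual sum of squares
    se2 = sse / (N - 2)                # unbiased estimate of the error variance
    return {
        "a": a,
        "b": b,
        "se2": se2,
        "s_hh": s_hh,
        "r2": 1.0 - sse / s_dd,            # fraction of variability explained
        "adj_r2": 1.0 - se2 / S_D ** 2,    # adjusted R-squared
        "sd_b": math.sqrt(se2 / s_hh),     # standard deviation of the slope
    }


def prediction_interval(fit, h, t_crit=T_025_25):
    """Interval for a single new depth measurement at altitude h."""
    center = fit["a"] + fit["b"] * h
    se_pred = math.sqrt(
        fit["se2"] * (1.0 + 1.0 / N + (h - H_BAR) ** 2 / fit["s_hh"])
    )
    return center - t_crit * se_pred, center + t_crit * se_pred


if __name__ == "__main__":
    fit = fit_from_summary()
    low, high = prediction_interval(fit, 3000.0)
    print({k: round(v, 4) for k, v in fit.items()})
    print(round(low, 1), round(high, 1))
```

Because the key reports rounded intermediate values, the recomputed numbers should agree with it only to within rounding, not digit for digit.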
This note was uploaded on 02/02/2008 for the course CEE 3040 taught by Professor Stedinger during the Fall '08 term at Cornell University (Engineering School).
