CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
1999 Final Exam, December 9, 1999 [revised 12/01]

Exam is open notes and open book. It lasts 150 minutes and there are 150 points. SHOW WORK!

1. (18 points) E-mails arrive at Prof. Stedinger's mailbox like a Poisson process with, on average, 12 e-mails arriving during the 10 am to 4 pm period.
a) What are the mean and standard deviation of the number of e-mails he gets between 10 am and noon?
b) What is the probability an afternoon (noon - 4 pm) passes without a message arriving?
c) What are the mean and standard deviation of the waiting time until the next 5 e-mails arrive?
d) What is the probability Stedinger needs to wait more than an hour for two e-mails to arrive?
e) Why is a Poisson process model appropriate for this problem?
f) Actually, Stedinger's Eudora program queries the Cornell post office every ten minutes and downloads any new messages. Does this affect the validity of the Poisson process model for arrivals at Stedinger's computer? Why or why not?

2. (15 points) A quarry provides decorative stone cubes (almost cubes) for use in gardens. The three dimensions of the cube l, w and h are each lognormally distributed with mean = 1 foot and standard deviation = 0.02 feet. What are the mean and standard deviation of the volume of the cubes if the variations among the values of l, w and h are independent? [Note V = lwh.] Is the independence assumption important (explain why or why not)?

3. (15 points) For a design you need to develop a frequency distribution for max annual 24-h rainfall depths. From regional maps you determine that the median rainfall is 1.9 inches, and the rainfall depth exceeded with a 10% probability is 2.7 inches. Using a Gumbel distribution, what rainfall depth is exceeded with a probability of just 0.5% (and thus might be called the 200-year rainfall)?

4. (5 points) Elizabeth is studying Cornell Lake Source Cooling project impacts.
Using measurements for 8 different years, she computed that a 95% confidence interval for the true average-September lake temperature was 52.8 - 57.1 °F. Her boss has never taken a course in statistics. How would you explain to him the meaning of this interval? What does confidence mean in this situation? Why do statisticians compute confidence intervals?

5. (15 points) Yannis needs to evaluate how much CEE graduates know about environmental health-risk issues. To do this he wants to compute the average score of CEE graduates on a short questionnaire. However, 20% of CEE graduates took Prof. Stedinger's wonderful spring-semester course: CEE 597 Risk Analysis and Management. They should be more informed and thus get higher scores on the questionnaire. His estimate of the mean score of CEE graduates will be
S = 0.2 X-bar + 0.8 Y-bar
where X-bar is the mean of 10 randomly selected graduates who took CEE 597 and Y-bar is the mean of 10 randomly selected graduates who did not take CEE 597. Assume that for individual test scores: Xi ~ N(μX, σX²) and Yi ~ N(μY, σY²).
<Students failed to compute explicitly & correctly the true mean score for a random student. -JRS>
a) How large is the bias in S as an estimator of the true mean score of all graduates?
b) What are the variance and Mean Square Error of S?
c) Does S reflect any particular sampling strategy? What might be the advantages / disadvantages of that strategy?

6. (17 points) Students in the CEE materials course need to develop the strongest weight concrete they can for a special application. Two teams developed competing formulas, and each produced and tested their own 7 samples to obtain strengths:

Sample            Team Green   Team Blue
1                 5700         4850
2                 6300         4900
3                 6100         4300
4                 5300         4200
5                 6250         3400
6                 3650         4500
7                 4950         5200
Average           5464.3       4478.6
Stan. Deviation   945.0        592.9

Is there a difference between the teams?
Use an appropriate Wilcoxon test:
(a) What are the appropriate null and alternative hypotheses?
(b) What is the rejection region for a Wilcoxon test with a type I error of 5%?
(c) What is the p-value for this data set? What do you conclude?
(d) What is the p-value for the corresponding 2-sample t test?
(e) For a t test with α = 5%, nG = nB = 15, when μG - μB = 1000 and σG = σB = 600, what are the type I and type II errors? [This problem was changed in 12/2001.]

7. (30 pts) The Society of Women Engineers decided to help its members improve their study skills by sponsoring a speed reading course. The course claims that participants on average will increase their reading speed by at least 75 words per minute (wpm). Here are the before and after results for the 16 students.

 #   Name                 Before (B)   After (A)     A - B         A - B - 75
 1   Sally                644          748           104           29
 2   Jill                 448          535           87            12
 3   Linda                363          395           32            -43
 4   Rachael              448          531           83            8
 5   Phyllis              385          453           68            -7
 6   Carolyn              665          757           92            17
 7   Carol                504          588           84            9
 8   Sissy                399          476           77            2
 9   Bonnie               674          768           94            19
10   Peggy                553          644           91            16
11   Debbie               660          752           92            17
12   Wenbi                532          621           89            14
13   Barb                 448          524           76            1
14   Susan                476          556           80            5
15   Gina                 685          747           62            -13
16   My                   511          [illegible]   [illegible]   [illegible]
     mean                 514.2        596.2         82.00         7.00
     standard deviation   109.5        122.9         17.05         17.05

a) What are the appropriate hypotheses for a nonparametric Wilcoxon test?
b) For α = 5%, what is the rejection region for the Wilcoxon test?
c) What is the p-value for this sample using that test?
d) What are the appropriate hypotheses for a Sign test?
e) What is the p-value for this sample using a sign test?
f) What are the appropriate hypotheses for a t test?
g) What p-value do you obtain?
h) With α = 5%, if the true mean of the differences were 100 wpm and the true standard deviation were 25 wpm, what is the type II error for a t test?
i) What is a 99% confidence interval for the true reading speed of the women BEFORE they took the reading course? (Assume data is normally distributed.)
j) Is it more appropriate to use a sign, t, or Wilcoxon test for this data set? Why?

8. (35 pts) A transportation research center is studying the impact of traffic volume and axle weight on road wear. Let Di be their index of the cumulated loading due to traffic, and Wi the observed road wear. Values of D and W were collected for several test tracks with a new asphalt formulation yielding:

Di = 250   325   410   ...   1840 kmile-tons
Wi = 0.41  0.52  0.64  ...   13.4 mm
W-bar = 6.80   D-bar = 630   sW = 3.10   sD = 320
Σ (Di - D-bar)(Wi - W-bar) = 11,240 kmile-tons-mm   n = 18

a) What are the least-squares estimates of α and β for the linear model W = α + β D + E, where E represents the unexplained error, or lack of fit, for this model?
b) Please compute an unbiased estimate of the variance of the errors E.
c) How much of the observed variation in W is explained by your model?
d) If D = 0, then one expects that W = 0. Is the data consistent with that hypothesis? (explain)
<Almost no one correctly cast this as a hypothesis test as was illustrated in lecture and was required in the project. So most people lost points! -JRS>
e) A new observation arrives with D = 1020 kmile-tons, but the value of W is missing. Please construct an interval which with confidence/probability of 95% contains the missing W value corresponding to D = 1020.
f) The 18 residuals from the regression analysis were plotted against Φ^-1[(i - 3/8)/(n + 0.25)], where i is the rank of each residual and Φ is the cdf of the standard normal distribution. A value of 0.891 was obtained for the correlation coefficient r between the two sets of values. Explain what this computation suggests.

I hope your other tests go well. Have a happy holiday.

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
2-sample t solutions, Finals 1998-99 (Corrected 12/2002)
CASE III: σx and σy unknown, small samples (Devore § 9.2)

Final 1999. Problem #6
Two-sided t test.
2-sample treated as if unequal variances.

Sample mean             5464     4479
Sample Standard dev     945      594
n                       7        7
SE delta(means)         421.9
T                       2.34
p-value                 0.021 (one side)
Df                      10.10
DF rounded              10
alpha                   [illegible]
t                       2.228

True mean               5000     4000
True Standard dev       600      600
n                       7        7
True SD delta(means)    320.7
d                       0.94     (for power calculation; use with df = 10)
β                       0.12     = Φ[ z(alpha/2) - sqrt(df+1)*|d| ], based upon normal approximation

Using tables in book get β = about 20%.
Finding required sample size would take some trial and error effort. Too hard for final.

Final 1998. Problem #9
Two-sided t test. 2-sample treated as if unequal variances.

Sample mean             25.23    44.28
Sample Standard dev     18.56    28.68
n                       8        8
SE delta(means)         12.08
T                       1.58
p-value                 [illegible]
Df                      11.99
DF rounded              12
alpha                   [illegible]
t                       1.782    ACCEPT Ho

True mean               20       45
True Standard dev       22       22
n                       8        8
True SD delta(means)    11
d                       0.63     (for power calculation; use with df = 12)
β                       0.27     = Φ[ z(alpha/2) - sqrt(df+1)*|d| ], based upon normal approximation

Using tables in book get β = about 30%.

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
Solutions Final Exam 1999 (revised '01)

1. Poisson process. Compute λ = 12/6 = 2 arrivals per hour.
a) For Poisson distribution, mean = μ = ν = λt = 2*2 = 4; SD = sqrt(ν) = 2 (4 pts)
b) λ = 2; t = 4; Pr[K = 0 | ν = 8] = exp[-8] = 0.000335 (2 pts)
c) Gamma distribution (not required). k = 5; λ = 2. E(T5) = 5/λ = 2.5 hrs; Var(T5) = 5/λ² = 5/4 = 1.25 -> SD = 1.12 hrs (4 pts)
d) Use Gamma OR Poisson dist. Poisson: Pr[Wait ≥ 1 hr for 2 messages] = Pr[K = 0 or 1 | ν = 2] = e^(-ν) + ν e^(-ν) = (1+2)*0.135 = 0.405 (3 pts)
e) Mail arrives singly (one at a time), arrival rate is constant, numbers in non-overlapping intervals are independent (for the most part). (3 pts)
f) Trouble, because when the computer downloads it is likely to get more than one message at once. Thus mail no longer arrives singly at Stedinger's computer. (2 pts)

2. Lognormal: do not need all of the log-space moments, but can calculate them.
σ²lnL = ln[1 + (σL/μL)²] ≈ (0.02)²; μL = 1 -> μlnL = ln(1) - 0.5 σ²lnL = -0.0002 (6 pts)
ln V = ln L + ln W + ln H -> Independence => Variances add.
ln V ~ Normal(3 μlnL, 3 σ²lnL) = N[-0.0006, 0.0012] (3 pts)
μV = exp(μlnV + 0.5 σ²lnV) = 1
σ²V = [μV]² {exp(σ²lnV) - 1} = 0.0012; SD = 0.0346 (3 pts)
When one multiplies independent lognormal random variables, the result is always a lognormal distribution. However, if ln L, ln W, and ln H have a joint normal distribution with correlation among those values, ln V will still be normal, and V will be lognormal. However, the variance of ln V would then also depend upon the covariances. THUS the independence assumption is very important. Positive correlation among L, W and H could greatly increase the variance of V. (3 pts)

3. Gumbel distribution: need to manipulate the Gumbel CDF to compute values of the parameters: xp = u - ln[-ln(p)]/α.
Have 2 eqns. for p = 0.5 and 0.90 (5 pts)
Yields α = 2.355; u = 1.744 (5 pts)
Then for p = 0.995 get x0.995 = 4.0 inches in 24 hours (5 pts)
[Gumbel is a reasonable choice because we are considering the LARGEST rainfall event in the year, and thus in concept the largest of many storms that occurred.]

4. Confidence Intervals. Given the variability in the data we cannot precisely determine the true mean value with such a small sample. Therefore statisticians report confidence intervals to describe possible values of the true mean with a given confidence. The intervals describe the precision with which the true mean is known. Here a 95% confidence means that when intervals are constructed in this fashion, on average, 95% of those intervals (i.e. most) will ACTUALLY contain the value of the true mean. This particular interval may or may not contain the mean; we do not know.

5. Sampling/Estimators
(a) Let R be the score of a randomly selected student. μR = E{R} = 0.2 μX + 0.8 μY. The estimator of μR is S = 0.2 X-bar + 0.8 Y-bar. E{S} = 0.2 μX + 0.8 μY where X-bar and Y-bar are unbiased estimators of their means.
Thus S is an unbiased estimator of μR. Or, the Bias of S = E{S} - μR = 0. (5 pts)
(b) When Bias = 0, MSE(S) = Var(S) = Var[0.2 X-bar + 0.8 Y-bar] = (0.2)² Var(X-bar) + (0.8)² Var(Y-bar) = (0.2)² Var(X)/n + (0.8)² Var(Y)/n for n = 10 (6 pts)
(c) This is an example of stratified sampling. (2 pts)
If done properly it should result in a smaller variance than would be obtained by selecting 2n = 20 students at random. However, if done poorly it can actually increase the sampling variance of the estimator, or even introduce bias if the specified proportions 0.2 and 0.8 are wrong! (2 pts)

6. Two-Sample Tests
a) Are the two teams producing samples with different strengths?
H0: μG = μB (no difference); Ha: μG ≠ μB. Need two-sided test. (1 pt)
b) Use Wilcoxon-Mann-Whitney Rank Sum test. W = sum of ranks of green (or blue); for α = 5% reject if W ≥ 68 or W ≤ 7(7+7+1) - 68 = 37 (4 pts)
c) Observe W = 2+8+10+11+12+13+14 = 70. (Data set has 2 low outliers.) p-value about 2(1.3) = 2.6%. Call it 3%. Conclude the teams are different. (4 pts)
d) Using two-sample t test find t = 2.39 with deg of freedom = 10.1 ~ 10. 2-sided p-value = 4.2%. Conclude there is a difference at the 5% level. (4 pts)
e) With true σ's find df = 28; d = (μG - μB)/sqrt[(df+1)(σG²/15 + σB²/15)] = 0.86. For 2-sided & α = 5% -> type I error = α = 5%; find type II error = β ~ 35% (4 pts)

7. Paired Data => One-Sample Tests
a) Use Wilcoxon Signed Rank Test on D = A - B - 75 (subtracting the minimum gain).
H0: median D = 0; Ha: median D > 0 (want to show that improvement ≥ 75) (2 pts)
(b) Sum ranks of the positive differences, where big values have large ranks. Reject H0 if S+ ≥ 100 (from tables) (2 pts)
(c) Find S+ = 108. p-value = Pr[S+ ≥ 108] = 2% (4 pts)
Sign test uses counts only.
d) H0: p(+) = p(-) = 0.5; Ha: p(+) > 0.5 (1 pt)
e) Observe 3 negative counts (13 positive counts). p-value = Pr[K ≤ 3 | p = 0.5] = 1.1%. WOW! (4 pts)
(Computed binomial probabilities for p = 0.5, n = 16 and K = 0, 1, 2, 3.
Gave credit if people used tables for n = 15, but hard to get right result.) Ignores that BIG difference of -43. Probably not a good analysis of data.
(f) For a t test, assume data is normal and consider D = A - B - 75.
H0: μD = 0; Ha: μD > 0 (want to show that improvement ≥ 75) (1 pt)
(g) t = sqrt(16) 7/17.05 = 1.67 with df = 15. One-sided p-value is 5.8% (4 pts)
(h) If true gain is 100, then excess over 75 is 25 wpm: d = (100-75)/25 = 1. For df = 15, find that β ~ 1% (getting close to zero but not there). (4 pts)
(i) 99% confidence interval: t0.005,15 = 2.947 (1 pt); 99% CI = x-bar ± s t0.005,15/sqrt(n) = 434 to 595 wpm. (4 pts)
(j) Differences do not look normal. Have a big negative value: -43. Therefore t test is suspect. Use Wilcoxon test; it does not assume normality but is generally as powerful as the t test and more powerful than the sign test, because the sign test does not use the ranks of the observations. In THIS case Wilcoxon and Sign tests both give p-value ~ 1%. (4 pts)

8. Regression
a) Least squares estimators: a = 2.732; b = 0.0065 (6 pts)
b) se² = 5.675 = (2.382)²; se = 2.382 (4 pts)
c) Need R² = 1 - (Residual sum-of-squares)/(Total sum-of-squares) = 1 - (n-k) se² / [(n-1) sy²] = 0.44 (5 pts)
(Or square the correlation coefficient; that works too.)
d) Need St.Dev.(a) = 1.27, where a is the estimator of α. We need to make a decision. Consider H0: α = 0 versus Ha: α ≠ 0. Compute t = (a-0)/StDev(a) = 2.15. For 16 df and t = 2.15 for a two-sided test get a p-value of a little less than 5%. Thus at the 5% level one can just conclude that α is not zero. With such a small sample a p-value of 5% is about all one can expect. (5 pts)
Question calls for a hypothesis test, like lecture and project examples.
e) For x = D = 1020 the point prediction for y is 9.32, where a 95% prediction interval for a future y-value is 3.92 to 14.72, using t0.025,16 = 2.120 in (7 pts)
a + b x ± t0.025,16 se sqrt[ 1 + 1/n + (x - x-bar)² / Σ(xi - x-bar)² ], with the sum over i = 1, ..., n.
f) Plotting x(i) versus Φ^-1(pi) yields a probability plot. A correlation of 0.891 << 1 is very small: we can reject the hypothesis that the data are normal at the 1% level. (6 pts)
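
The Problem 8 figures can be reproduced from the summary statistics given in the exam statement. The following is a minimal Python sketch, not part of the original solutions: the helper names fit_from_summary and prediction_interval and their argument names are made up for illustration, and the critical value t0.025,16 = 2.120 is simply taken from the solution rather than computed.

```python
import math


def fit_from_summary(n, x_bar, y_bar, s_x, s_y, sum_xy):
    """Least-squares line y = a + b*x from summary statistics only.

    sum_xy is sum((x_i - x_bar) * (y_i - y_bar)); s_x and s_y are the
    sample standard deviations (divisor n - 1). Hypothetical helper.
    """
    sxx = (n - 1) * s_x ** 2      # sum of squared deviations of x
    syy = (n - 1) * s_y ** 2      # total sum of squares of y
    b = sum_xy / sxx              # slope estimate
    a = y_bar - b * x_bar         # intercept estimate
    sse = syy - b * sum_xy        # residual sum of squares
    se2 = sse / (n - 2)           # unbiased estimate of the error variance
    r2 = 1.0 - sse / syy          # fraction of variation explained
    sd_a = math.sqrt(se2 * (1.0 / n + x_bar ** 2 / sxx))  # st. dev. of intercept
    return {"a": a, "b": b, "se2": se2, "r2": r2, "sd_a": sd_a, "sxx": sxx}


def prediction_interval(fit, n, x_bar, x_new, t_crit):
    """Point prediction and prediction interval for a new y at x_new."""
    y_hat = fit["a"] + fit["b"] * x_new
    spread = 1.0 + 1.0 / n + (x_new - x_bar) ** 2 / fit["sxx"]
    half = t_crit * math.sqrt(fit["se2"] * spread)
    return y_hat, y_hat - half, y_hat + half


# Summary statistics from the Problem 8 statement (x = D, y = W).
n, d_bar, w_bar = 18, 630.0, 6.80
fit = fit_from_summary(n, d_bar, w_bar, s_x=320.0, s_y=3.10, sum_xy=11240.0)

# t statistic for H0: intercept = 0, as in part d).
t_intercept = fit["a"] / fit["sd_a"]

# t_crit = 2.120 is t(0.025, 16) as quoted in the solution, not computed here.
w_hat, lo, hi = prediction_interval(fit, n, d_bar, x_new=1020.0, t_crit=2.120)

print(f"a = {fit['a']:.3f}, b = {fit['b']:.4f}")
print(f"se^2 = {fit['se2']:.3f}, R^2 = {fit['r2']:.2f}")
print(f"StDev(a) = {fit['sd_a']:.2f}, t = {t_intercept:.2f}")
print(f"prediction at D = 1020: {w_hat:.2f}, 95% PI = {lo:.2f} to {hi:.2f}")
```

Working from the sums of squares rather than the raw data is a deliberate choice here, since the exam statement lists only a few of the 18 observations. The printed values should agree with parts a) through e) to rounding.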
This note was uploaded on 02/02/2008 for the course CEE 3040 taught by Professor Stedinger during the Fall '08 term at Cornell.
