CEE304_Final2001 - CEE 304 UNCERTAINTY ANALYSIS IN...

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
** 2001 Final Exam **
Friday, December 21, 2001

Exam is open notes and open-book. It lasts 150 minutes and there are 150 points. SHOW WORK!

1. (15 points) Domestic security is now an issue of national attention. (More to come in CEE 597 Risk Analysis.) A water authority staff is going through its records of requests for detailed information about the structural design and operations. Requests have come from within the US and abroad, from utilities and consultants. Might these requests be from terrorists hoping to discover where the system is vulnerable? Over the past year the utility has received 2 requests per week (2 per 5 working days).
a) Using a Poisson process model, what are the mean and variance of the number of requests in a month (with 21 working days)?
b) What is the probability that there are two or more requests in the first week of February (5 working days)?
c) If the utility decided to examine the most recent 30 requests, what are the mean and standard deviation of how far back (in working days) they would need to go to find 30 requests?
d) Is a Poisson process likely to be a reasonable model of the arrival of requests for detailed system information?

2. (6 points) If the strengths of links in a chain have a Weibull distribution with parameter k = 4, and the mean strength of chains of length 50 m is 5,000 lbs with standard deviation 1400 lbs, what then are the mean and standard deviation of the strength of chains of length 800 m? Justify your answer.

3. After the class addressing sampling, Prof. Stedinger was asked how Gannett Health Services conducted their surveys of campus drinking. I found out. The last major cross-campus survey of undergraduate drinking patterns was the 2000 Fall Core Alcohol and Drug Survey. It proceeded as follows. The undergraduate population is known to be 51% male and 49% female.
The University Registrar was asked to randomly select the names of 965 males and 735 females corresponding to 1700 registered undergraduates. They were all mailed surveys. Some 651 valid responses to the drinking question were received. [Some surveys were returned by the Post Office (144), many surveys were not returned (835), and some were returned with invalid responses (63 with no sex reported, 7 with no answer or multiple answers).] The valid response rate was 38%. Here are those numbers and the response to the drinking question:

----------- Survey Summary -----------
                    Male     Female    Total    Fraction male
Surveys sent         965       735      1700        57%
Valid responses      332       319       651        51%
Response rate        34%       43%       38%        ---

What is the average number of drinks you consume per week?
mean               7.578     3.347     5.632
st. dev           11.760     4.982     9.23

The variance for the men is very large because some 4% report averaging more than 40 drinks/week. No women reported averaging more than 28 drinks/week. [More men were sampled because their response rate is generally lower.]

a) (8 points) For men and women the frequency distribution of the responses was [note the clumps at 0, 8, 10, 12 and 15 (for men)]:

[Figure: frequency distribution of responses; horizontal axis: Average # Drinks per Week, 0 through >19]

If you wanted to construct a compact and smoothed representation of these responses (perhaps to estimate Pr[ D > 15 ]) using a nice analytical distribution, what would be an appropriate choice and WHY that one and not an alternative?
b) (7 points) Use this data to construct a 95% confidence interval for the true value of the average number of drinks a woman undergraduate at Cornell consumes per week. What is the probability that the true mean of the average number of drinks consumed by Cornell women is in the interval you just computed?
c) (7 points) What is the rejection region of a t test of whether men and women drink at the same rate? What p-value do you obtain?
[Note that women on average have less body weight and thus need to drink less to obtain the same dose; and due to physiological differences women experience more of an effect from the same dose.]
d) (3 points) Is the test in (c) likely to be valid with this data? [Note, 5 of 332 men reported averages in excess of 60 drinks/week.]
e) (10 points) The overall mean of the average number of drinks/week was obtained by pooling the data sets for men and women to obtain one large sample.
i) Is this the best estimator of that average that could be constructed?
ii) Please comment on the sampling strategy they employed -- Are there potentially any important problems? Is bias likely to be a concern? What would you recommend they do differently?

4. (20 points) A coed living unit was surprised by the differences in men's and women's drinking habits as described by the University survey. They surveyed their members and obtained

                                          avg     std. dev
Men:     0, 0.7, 2, 6, 8, 10, 50   -->   10.93    17.63
Women:   0, 0, 0.5, 1, 3, 5        -->    1.58     2.01

For an appropriate non-parametric test: (This data set has several ties, so in your computation, assign all 3 zero responses their average rank equal to 2.)
(a) What are the appropriate null and alternative hypotheses?
(b) What is the rejection region for a test with a type I error of 5%?
(c) What is the p-value for this data set? What do you conclude?
(d) What value of t do you obtain for a two-sample t test? What is the p-value?
(e) If challenged, could you, and if you could, how would you justify use of a non-parametric test instead of a t test with this data?

5. (20 points) Prof. Brutsaert uses satellite instrumentation to measure average ground-surface temperature. A new sensor should provide smaller errors in comparison with ground-based observations.
The measurement errors [here Old_i or New_i = (Measured_i - True_i) in °C] obtained with the Old and the New instrument are contained in the table below. Consider whether the absolute values of the measurement errors with the new instrument are indeed smaller.

              Actual Error        Absolute Error     D = difference
Trial         Old       New       |Old|     |New|    |Old| - |New|
  1           0.42      0.37      0.42      0.37         0.06
  2           0.01      0.02      0.01      0.02        -0.01
  3          -0.64     -0.40      0.64      0.40         0.24
  4          -0.60     -0.42      0.60      0.42         0.18
  5           0.00      0.10      0.00      0.10        -0.10
  6           4.10      2.14      4.10      2.14         1.96
  7           0.32      0.10      0.32      0.10         0.21
  8          -0.01     -0.07      0.01      0.07        -0.06
  9          -1.69     -1.06      1.69      1.06         0.64
 10          -2.68     -1.51      2.68      1.51         1.17
Average      -0.076    -0.071     1.048     0.618        0.430
St. Dev.      1.759     0.967     1.372     0.718        0.661

(a) What are the appropriate null and alternative hypotheses?
(b) What is the rejection region for a sign test with a type I error of about 5%? What p-value do you obtain with this data? If the probability of a negative sign is really only 0.3, what would the type II error be for your α = 5% test? [For n = 10, Devore has binomial tables.]
(c) What is the rejection region for an appropriate non-parametric test with α = 5%? What p-value do you obtain with this data?
(d) What is the rejection region for a t test with α = 5%? What p-value do you obtain with this data?
(e) How big would the mean μ_D = E{ |Old_i| - |New_i| } need to be with the t test to have a type II error < 10% were σ_D for the differences equal to 0.7?
(f) How large a sample is needed to ensure β < 5% with the t test when σ_D is 0.7 and the mean difference μ_D = 0.4?

6. (14 points) In the design of a pier, the structural engineer needs to determine the character of earth surrounding the pilings and its ability to support loads.
Several samples were collected and the following resistances measured:

45, 75, 27, 45, 34, 43, 44, 35, 88, 40, 48, 65, 54, 56, 42

(a) Plot these data on the attached probability paper.
(b) For the probability-plot correlation test of normality, specify the approximate rejection region for an α = 5% test.
(c) The r for the probability plot correlation test (correlation of data with their nscores) was 0.948; r for the probability plot correlation test on the logarithms of the observations was 0.982. What do you conclude from these results? Explain.

7. (35 pts) Prof. O'Rourke is studying the effect of soil properties on the earthquake damage experienced by civil infrastructure including pipelines and buildings. Let E_i be a measure of likely earthquake intensity that reflects local soil properties, and D_i a measure of the damage to residential houses in a quake. Here are some possible data:

Quake intensity, E_i:   30.0    5.1   117.1   122.2   12.1   174.4
Damage, D_i:            12.7    4.6    38.2    25.1   14.5    37.3

n = 24;  D-bar = 17.72;  E-bar = 67.71;  s_D = 14.28;  s_E = 57.50;  Σ (D_i - D-bar)(E_i - E-bar) = 11,400

a) Consider a linear model. What are the least-squares estimates of α and β for D = α + β E + ε, where ε is the model error?
b) Compute an unbiased estimate of the variance of ε.
c) What fraction of the observed variability in D is explained by E?
d) What is the standard deviation of b, the estimator of β?
e) What is the rejection region for a 5% test of whether β = 0? Why have you selected either a one-sided or two-sided test? How small an α could one use and still reject H0 with this data set?
f) Using the values of the parameters that you obtained, what is a 90% prediction interval for D at a site with an earthquake intensity of E = 120?

8. (5 pts) What fundamental problem does the collection of knowledge and mathematical relationships called "statistics" address?

I hope you enjoy the [illegible] season. Drive safely.
- Jery Stedinger

CEE 304 - UNCERTAINTY ANALYSIS IN ENGINEERING
Solutions Final Exam 2001

1. Poisson process. Compute λ = 2/5 = 0.4 arrivals per day. (1 pt)
a) For the Poisson distribution, mean = ν = λt = 21(0.4) = 8.4 = Var(K). (3 pts)
b) Pr[K ≥ 2 | ν = 5(0.4) = 2] = 1 - Pr[K = 0 or 1] = 1 - e^(-ν) - ν e^(-ν) = 0.594. (3 pts)
c) Gamma distribution (not required). k = 30; λ = 0.4. E(T30) = 30/λ = 75 days; Var(T30) = 30/λ² = 187.5 days² => SD = 13.7 days. (4 pts)
d) Requests arrive separately (ignoring the time of day when mail arrives at the office), independently in time for different groups, and with a constant rate, so a Poisson process is reasonable. (4 pts)

2. Weibull for Weakest Links. Lectures found that for a chain of length n with Weibull-distributed links, the resulting Weibull for the chain has u = v / n^(1/k). (2 pts) We have k = 4, and effectively n = 800/50 = 16. Thus the change in u is a factor of 16^(1/4) = 2. (2 pts) Using

$$E[Y] = \mu_Y = u\,\Gamma\!\left(1 + \tfrac{1}{k}\right); \qquad \mathrm{Var}[Y] = \sigma_Y^2 = u^2\left[\Gamma\!\left(1 + \tfrac{2}{k}\right) - \Gamma^2\!\left(1 + \tfrac{1}{k}\right)\right]$$

both the mean and the standard deviation are proportional to u, so the longer (weaker) chain has μ = (5000)/2 = 2,500 lbs and σ = (1400)/2 = 700 lbs. (2 pts)

3. Surveys, Confidence Intervals, Distributions.
a) Averages are continuous, so that rules out Bernoulli (values of 0 & 1), binomial (no n), Poisson (variance = mean so σ too small), and Geometric (variance < mean so σ too small). Plus they are discrete anyway. Gumbel and normal fail because we need x ≥ 0. Lognormal has a density function of zero at zero so fails to capture the peak. Gamma with α < 1, or Weibull with k < 1, are reasonable. See which fits better if we can tell. (Weibull has a thicker tail.) Neither reproduces the actual non-zero probability that some people never drink and thus have a true average of exactly zero. (8 pts)
b) z = 1.96, so the 95% CI = x-bar ± z s/sqrt(319) = 2.80 to 3.89 drinks/week. (5 pts) This interval may or may not contain the true average; we do not know. (2 pts)
c) For a 2-sided 5% test, reject H0 if

$$|Z| = \frac{|\bar{x} - \bar{y}|}{\sqrt{s_x^2/n_x + s_y^2/n_y}} > z_{0.025} = 1.960 \qquad \text{(2 pts)}$$

Obtain t = 6.02. P-value < 0.0004 (front cover). Not a lot of doubt here. (5 pts)
d) Valid. Yes, there are big outliers.
But when one averages 300 observations, the resulting average IS approximately normal, regardless of the original distribution of the data (provided σ < ∞). (2 pts) That is the Central Limit Theorem. (1 pt)
e) They have not done the right thing. They have collected (appropriately) a stratified sample, and should analyze it as such. Combining the two samples and pretending it is a Simple Random Sample exaggerates the variance in the estimators of the mean. One SHOULD collect a stratified sample to take advantage of the differences between men and women. Furthermore, because men have such a larger variance, one should sample many more men than women and then weight the men's and women's averages by 0.51 and 0.49 (their proportions in the student body). It may also be useful to stratify by class (Fr., Soph., Jun., Sen.), or student age (18, 19, ..., 21, 21+). Why not? It costs little to do so and increases precision.
There is a tremendous opportunity for bias in this analysis. While the original sample from the registrar is a Simple Random Sample, there is first a non-response bias for surveys returned, and then there is a voluntary response bias because only about half of the students return the surveys. To solve this problem we need a survey approach that has a greater return: survey by phone or offer a gift for returned surveys. There may also be an honesty problem: do drinkers brag, or exaggerate to skew the survey to justify their behavior? (10 pts)

4. Two-Sample Tests
a) Drinking: men versus women.
H0: F_men = F_women (no difference)
Ha: F_men > F_women (want to check previous independent survey) (3 pts)
b) Use the Wilcoxon-Mann-Whitney Rank Sum test. W = sum of ranks of women. Expect them to have small ranks, so need to convert the value given in the tables so can reject H0 if w_women ≤ 6(14) - 54 = 30. (3 pts)
c) Observe W = 31. p-value a little larger than 5%.
Maybe 6-7%. (2+2 pts) Accept H0. (1 pt)
d) Use

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{s_x^2/n_x + s_y^2/n_y}}$$

with μ_x - μ_y = 0 under H0. Find t = 1.40 (2 pts) with df = 6.18 or 6.0. P-value = 11% (one-sided). (2+2 pts)
e) Larger survey demonstrates that the averages are NOT normal. See the big values and spike in histogram at zero. A small-sample t test is just not valid. Go with a non-parametric test. Has good efficiency & avoids incorrect assumptions. (3 pts)

5. Paired Data => One-Sample Tests
a) Use test on differences D. H0: median D = 0 (or H0: F_old = F_new). Both mean no difference. Ha: median D > 0 (expect Old - New > 0). (2 pts)
b) Sign test uses counts only. H0: p(+) = p(-) = 0.5; Ha: p(-) < 0.5. From tables for α ≈ 5%, reject if S ≤ 2 => α = 5.5%. (2 pts)
Observe S = 3; p-value = Pr[S ≤ 3] = 17.2%. (2 pts)
Were S ~ Bin(10, p = 0.3), β = Pr[S ≥ 3] = 1 - Pr[S ≤ 2] = 62%. (2 pts)
c) Use Wilcoxon Signed Rank Test on sum of ranks of the positive differences, where big values have large ranks. Reject H0 if S+ ≥ 44. (2 pts)
Find S+ = 47. p-value = Pr[S+ ≥ 47] ≈ 2.4%. (2 pts)
d) For a t test, assume data is normal and consider H0: μ_D = 0; Ha: μ_D > 0. Reject if T > t_{0.05,9} = 1.833. (2 pts)
t = sqrt(10) (0.43)/0.66 = 2.06 with df = 9. 1-sided p-value = 3.3%. (2 pts)
e) For α = 5%, 1-sided t for df = 9, want β < 10%: need d = (μ_a - 0)/σ = 1.1; with σ = 0.7 this yields μ_a = 0.7(1.1) = 0.77. (2 pts)
f) For α = 5%, 1-sided t to ensure β < 5% with d = 0.4/0.7 = 0.57 ≈ 0.6, find from Table A.17 that df = 34 for n = 35. (2 pts)

6. a) (8 pts) Plotting x(i) versus Φ^(-1)(p_i) yields a probability plot. Here is the data:

  i    (i - 3/8)/15.25    Nscore    x(i)    log x(i)
  1        0.041          -1.74      27      1.44
  2        0.107          -1.25      34      1.54
 ...
  7        0.434          -0.17      44      1.64
  8        0.500           0.00      45      1.65
  9        0.566           0.17      45      1.66
 ...
 13        0.828           0.95      65      1.81
 14        0.893           1.25      75      1.88
 15        0.959           1.74      88      1.94

[Figure: Probability Plot, Geotech Data; Observation versus Nscore]

b) Table A.12, page 739 of Devore (5th ed.) indicates reject hypothesis of normality for r ≤ 0.9383 for n = 15 at α = 5%. (2 pts)
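As a cross-check on the table in 6(a) and the critical value in 6(b), the following is a minimal sketch (not part of the original solutions) that recomputes the plotting positions (i - 3/8)/(n + 1/4), the normal scores, and the probability-plot correlation for the 15 resistances and for their base-10 logarithms. The function name `ppcc` and the variable names are illustrative assumptions; the critical value 0.9383 is simply the one quoted above from Devore's table.

```python
import math
from statistics import NormalDist, mean

def ppcc(data):
    """Correlation between the sorted data and their normal scores."""
    x = sorted(data)
    n = len(x)
    # Plotting positions p_i = (i - 3/8) / (n + 1/4); for n = 15 the divisor is 15.25
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    xbar, zbar = mean(x), mean(z)
    sxz = sum((a - xbar) * (b - zbar) for a, b in zip(x, z))
    sxx = sum((a - xbar) ** 2 for a in x)
    szz = sum((b - zbar) ** 2 for b in z)
    return sxz / math.sqrt(sxx * szz)

# Resistances from Problem 6
resistances = [45, 75, 27, 45, 34, 43, 44, 35, 88, 40, 48, 65, 54, 56, 42]
R_CRIT = 0.9383  # quoted critical value for n = 15, alpha = 5%

r_raw = ppcc(resistances)
r_log = ppcc([math.log10(v) for v in resistances])

print(f"r (raw data) = {r_raw:.3f}, reject normality: {r_raw <= R_CRIT}")
print(f"r (log data) = {r_log:.3f}, reject normality: {r_log <= R_CRIT}")
```

The two correlations can be compared with the values 0.948 and 0.982 given in the exam statement for Problem 6(c).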
(2 pts) c) r- — 0.948 > 0.9383 so cannot reject normality. However, r- — 0.982 obtained with the the logarithms of the data looks better. Data does curve smoothly upward 1n graph suggesting a more highly skewed distribution than normal. It would be reasonable to uses a lognormal with data, but cannot reject normality. 7. Regression (35 points - make sure you get to this one!) a) Least squares estimators: a = 7.567; b) se2 = 135— - (0. 617)2; se = 11. 64 C) R2: 1_ (Residual sum-of—squares) (Total sum-of—squares) = 1 (n—k)se2/ [(n-1)sy2] =0.356 = r2 = (0.604)2 b = 0.150 d) Need St.Dev.(b) = se/sqrt((n—1)sx2) = 0.042 e) Reject Ho: B = 0 versus Ho: [5 >0 if t > t0.05, 22 = 1.717 Okay! Observe t = 3.552. p-value close to 0.001 (actually 0.0009) (4 pts) 8 pts 4 pts 4 pts 4 pts 2+1 pts 2+2 pts f) or E = 120 the mean value of D = a + bC = 25.6; a 90% prediction interval for a future D-value is 4.82 to 46.32, using t0.05,22 = 1.717 (w/ SE-prediction = 12.08) in a+bx:l: t0.05,22 Se 1+%+ (X-X)2 i (Xi-i )2 i—l 8pts 8. Statistics addresses the selection, summary a—nd interpretation of observations to infer characteristics of the real world. This requires descriptions of the precision / uncertainty of our estimators. What do we know and how well do we know it? Statistics includes inference which is an approach to decision making (science w/ data). “Statistics is an effort to understand data and draw conclusions from it.” ...