ST102 2009.pdf


LSE
Summer 2009 examination

ST102
Elementary Statistical Theory
2008/9 and 2007/8 syllabuses only

Instructions to candidates

Time allowed: 3 hours

Full marks may be obtained for complete answers to FIVE questions. Answer not more than THREE questions from Section A and not more than THREE questions from Section B. Final mark will be capped at 100 in the event of more than 100 marks awarded.

You are supplied with:
  Graph paper
  Murdoch & Barnes Statistical Tables (4th Edition)

You may also use:
  A hand-held calculator, which however must not be pre-programmed or able to display graphics, text or algebraic equations. The make and type of machine must be stated clearly on the front cover of the answer book.

©LSE 2009/ST102 Page 1 of 10

Section A

1. a) Suppose A and B are events in a sample space S with p(A) > 0 and p(B) > 0.
   i) State what it means for A and B to be mutually exclusive. [1 mark]
   ii) State what it means for A and B to be independent. [1 mark]
   iii) Show that A and B cannot be both mutually exclusive and independent. [3 marks]
   iv) Prove that if A and B are independent, then $A^c$ and $B^c$ are independent, where $A^c$ and $B^c$ are the events complementary to A and B respectively. [5 marks]
   v) Give an example to show that if A and B are mutually exclusive, $A^c$ and $B^c$ are not necessarily mutually exclusive. [2 marks]
   [You may assume that for any events X and Y, $p(X \cup Y) = p(X) + p(Y) - p(X \cap Y)$.]

   b) State Bayes' Theorem (no proof required). [2 marks]
   Rebekah is going to a party. There is a 40% chance that Matt will go. If Matt does not go, there is an 80% chance that Rebekah will enjoy herself. If Matt goes, there is only a 10% chance that she will enjoy herself.
   i) What is the probability that Rebekah will enjoy the party? [2 marks]
   ii) Suppose you know that she enjoyed herself. What is the probability that Matt was not present? [2 marks]
   iii) Suppose you know that she did not enjoy herself. What is the probability that Matt was present? [2 marks]

2. a) Suppose a random variable X has the binomial distribution with parameters n and p. Show that the moment generating function of X is given by $m(t) = (q + p e^t)^n$, where $q = 1 - p$. Hence, or otherwise, find the mean and variance of X in terms of n and p. [9 marks]

   b) You may assume that 15% of individuals in a large population are left-handed.
   i) If a random sample of 40 individuals is taken, find the probability that exactly 6 are left-handed. [2 marks]
   ii) If a random sample of 400 individuals is taken, find the probability that exactly 60 are left-handed by using a suitable approximation. Briefly discuss the appropriateness of the approximation. [5 marks]
   iii) What is the smallest possible size of a randomly chosen sample if we wish to be 99% sure of finding at least one left-handed individual in the sample? [4 marks]

3. A continuous random variable X has probability density function (pdf) $f(x) = \frac{1}{4} x^3$ for $0 \le x \le 2$, and 0 otherwise.
   a) Explain why f(x) can serve as a pdf. [2 marks]
   b) Find the mean and mode of the distribution. [3 marks]
   c) Determine the cumulative distribution function F(x). [2 marks]
   d) Find $\mathrm{var}(X) = \sigma^2$. [2 marks]
   e) Find the skewness of X, given by $\dfrac{E[(X - E(X))^3]}{\sigma^3}$. [4 marks]
   f) If a sample of five observations is drawn at random from the distribution, find the probability that all the observations exceed 3/2. [3 marks]
   g) If a sample of fifty values is drawn at random from the distribution, estimate the probability that the sample mean exceeds 3/2. State any assumptions made. [4 marks]

4. a) Consider the joint probability function of two random variables X and Y as shown:

   [Joint probability table, with entries involving the constant k, lost in extraction; the surviving column header reads X = 1, 2, 3.]

   i) Find the value of k. [1 mark]
   ii) Describe the marginal distributions of X and Y respectively. [2 marks]
   iii) Describe the conditional distribution of X given Y = 2. [2 marks]
   iv) Find E(XY). [3 marks]
   v) Are X and Y independent? Give reasons for your answer. [2 marks]
   vi) Find the covariance of X and Y. [2 marks]
   vii) Determine $p(|X - Y| \ge 2)$. [2 marks]

   b) Suppose $\{Z_i\}$ (i = 1, 2, ..., k) are independent identically distributed standardised normal variables: $Z_i \sim N(0, 1)$ for i = 1, 2, ..., k. State the distributions of
   i) $Z_1^2$
   ii) to v) [expressions garbled in the source and not recoverable]
   vi) $\dfrac{3(Z_1^2 + Z_2^2)}{2(Z_3^2 + Z_4^2 + Z_5^2)}$
   [6 marks]

5. a) Suppose the random variable Y has an exponential distribution: its probability density function is given by $f(y) = \lambda e^{-\lambda y}$ for $y \ge 0$ and 0 otherwise, where $\lambda > 0$ is the parameter of the distribution.
   i) Show that the median and interquartile range are given respectively by $\alpha/\lambda$ and $\beta/\lambda$, where the constants $\alpha$ and $\beta$ should be determined. [5 marks]
   ii) Let s and r be positive real numbers. Prove that Y has the memoryless property that $p(Y > s + r \mid Y > r) = p(Y > s)$. [4 marks]
   iii) Show that Y has moment generating function $\lambda/(\lambda - t)$ for $t < \lambda$. [3 marks]

   b) A sports magazine hired a golf professional to drive four different brands of golf ball. Each ball was driven fifty times and the distance achieved recorded. The outcomes are represented in a boxplot produced by a statistical software package:

   [Boxplot of Distance (vertical axis, ticks from 180.00 to 280.00) by Ball (1.00 to 4.00); image not reproduced.]

   Distances have been measured in yards and the four types of ball labelled 1, 2, 3, 4.
   Give at least three salient features of the data which will aid interpretation. [6 marks]
   State briefly how you might undertake a more formal analysis of this data set. [2 marks]

Section B

6. (a) A random sample of size 100 produced the sample sums $\sum_i X_i = 309$ and $\sum_i X_i^2 = 2104$.
   i. Find the method of moments estimates for the population mean and the population variance. [4 marks]
   ii. Compute the standard error for the mean estimate. [3 marks]
   iii. Construct an approximate 95% confidence interval for the population mean. [3 marks]

   (b) Comment on why the maximum likelihood estimator is in general preferred over a method of moments estimator. [2 marks]

   (c) Let $\{X_1, \dots, X_n\}$ be a random sample from the uniform distribution on the interval $[-\theta, 0]$, where $\theta > 0$ is an unknown parameter.
   i. Find a method of moments estimator for $\theta$. [3 marks]
   ii. Find the maximum likelihood estimator for $\theta$. [5 marks]

7. (a) Let $\{X_1, \dots, X_n\}$ be a sample from a distribution with probability density function

   $f(x, \lambda) = \begin{cases} \frac{1}{2} \lambda^3 x^2 e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise,} \end{cases}$

   where $\lambda > 0$ is an unknown parameter. Find the maximum likelihood estimator for $\lambda$. [6 marks]

   (b) A poll used a survey of 500 students to learn how students feel about a statistical inference course. Responses are as follows:

   195 students    Course is too easy
   227 students    Course is about at the right level
    78 students    Course is too difficult

   i. Construct a distribution to represent a population such that the survey result above may be seen as a sample from the population. [2 marks]
   ii. Find the maximum likelihood estimates for the parameters in the population. [6 marks]
   iii. Derive the explicit formulas for the mean squared errors of the maximum likelihood estimators. [6 marks]

8. (a) A random sample of size 20 from a normal population with mean $\mu$ and variance $\sigma^2$ produced the sample sums $\sum_i X_i = 40.56$ and $\sum_i X_i^2 = 160.15$.
   i. Find a 95% confidence interval for $\sigma^2$. [4 marks]
   ii. Repeat (i) above with the additional information $\mu = 2$. [4 marks]
   iii. Comment on the differences of the two confidence intervals obtained. [2 marks]

   (b) The numbers of customers arriving at a corner shop in 100 intervals of length 1 minute are listed below.

   No. of customers    0    1    2    3    4    ≥5
   No. of intervals   13   20   34   21    6     6

   Use the goodness-of-fit test to test the following hypotheses:
   i. the number of customers arriving in a one-minute interval follows a Poisson distribution with mean 2.8, [5 marks]
   ii. the number of customers arriving in a one-minute interval follows a Poisson distribution. [5 marks]
   (Hint. You may use, without proof, the fact that the MLE for the mean of a Poisson distribution is the sample mean.)

9. (a) In a wine tasting event, 8 wine experts rated two wines between 1 (worst) and 10 (best) as follows:

   [Table of the experts' ratings for Wine A and Wine B lost in extraction.]

   i. Use the sign test to assess at the 5% significance level whether Wine A is significantly better than Wine B based on the above scores. [6 marks]
   ii. Repeat (i) above using the Wilcoxon signed-rank test. [6 marks]
   iii. Which method between the above two tests do you prefer? Give your reasons. [2 marks]

   (b) Find the missing values A1, A2, A3, A4, A5 and A6 in the two-way ANOVA table below produced by MINITAB. For A5 and A6, use Statistical Tables by Murdoch and Barnes and select the appropriate answer from "p ≤ 0.01", "0.01 < p ≤ 0.05", "0.05 < p ≤ 0.1" and "p > 0.1". [6 marks]

   Two-way ANOVA: C1 versus C2, C3

   Source   DF   SS       MS       F        P
   C2        2   260.93   130.47   A4       A5
   C3       A1   A2       109.50   12.807   A6
   Error     8    68.40   A3
   Total    14   767.33

10. To consider the effect of a marketing instrument x on the weekly sales volume y of a certain product, the data over 20 weeks $\{(x_i, y_i),\ i = 1, \dots, 20\}$ are collected. The following quantities are calculated:

   $\sum_{i=1}^{20} x_i = 400, \quad \sum_{i=1}^{20} y_i = 220, \quad \sum_{i=1}^{20} x_i^2 = 8800, \quad \sum_{i=1}^{20} x_i y_i = 4700, \quad \sum_{i=1}^{20} y_i^2 = 2620.$

   Assume the linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_1, \varepsilon_2, \dots$ are independent and $N(0, \sigma^2)$.
   (a) Find the least squares estimates for $\beta_0$ and $\beta_1$, and write down the fitted regression model. [5 marks]
   (b) Compute the standard error for the least squares estimator for $\beta_1$. [6 marks]
   (c) Perform a t-test for the null hypothesis $H_0: \beta_1 = 0$ against $H_1: \beta_1 \ne 0$ at the 1% significance level. [3 marks]
   (d) Perform a t-test for the null hypothesis $H_0: \beta_1 = 0.37$ against $H_1: \beta_1 > 0.37$ at the 10% significance level. [3 marks]
   (e) With x = 18, find a predictive interval which covers y with probability 0.95. [3 marks]

Appendix: Formula Sheet

1. Discrete Distributions

   Distribution         Probability function                                                          Mean          Variance
   Binomial             $\frac{n!}{k!(n-k)!} \pi^k (1-\pi)^{n-k}$,  k = 0, 1, ..., n                  $n\pi$        $n\pi(1-\pi)$
   Geometric            $(1-\pi)^{k-1} \pi$,  k = 1, 2, ...                                           $1/\pi$       $(1-\pi)/\pi^2$
   Negative binomial    $\frac{(k-1)!}{(r-1)!(k-r)!} \pi^r (1-\pi)^{k-r}$,  k = r, r+1, ...           $r/\pi$       $r(1-\pi)/\pi^2$
   Poisson              $\frac{\lambda^k}{k!} e^{-\lambda}$,  k = 0, 1, 2, ...                        $\lambda$     $\lambda$
   Hypergeometric       $\binom{S}{k}\binom{N-S}{n-k} \big/ \binom{N}{n}$,  k = 0, 1, ..., n          $nS/N$        $\frac{nS(N-S)(N-n)}{N^2(N-1)}$

2. Simple Linear Regression

   Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.

   LSEs: $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$, $\quad \hat\beta_1 = \sum_{i=1}^n (x_i - \bar x)(y_i - \bar y) \big/ \sum_{j=1}^n (x_j - \bar x)^2$, and

   $\mathrm{Var}(\hat\beta_0) = \dfrac{\sigma^2 \sum_{i=1}^n x_i^2}{n \sum_{i=1}^n (x_i - \bar x)^2}, \quad \mathrm{Var}(\hat\beta_1) = \dfrac{\sigma^2}{\sum_{i=1}^n (x_i - \bar x)^2}, \quad \mathrm{Cov}(\hat\beta_0, \hat\beta_1) = \dfrac{-\sigma^2 \bar x}{\sum_{i=1}^n (x_i - \bar x)^2}.$

   Estimator for the variance of $\varepsilon_i$: $\hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^n (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$.

   Regression ANOVA:
   Total SS $= \sum_{i=1}^n (y_i - \bar y)^2$, $\quad$ Regression SS $= \hat\beta_1^2 \sum_{i=1}^n (x_i - \bar x)^2$, $\quad$ Residual SS $= \sum_{i=1}^n (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$.

   Squared regression correlation coefficients:
   $R^2 = \dfrac{\text{Regression SS}}{\text{Total SS}}, \quad R^2_{adj} = 1 - \dfrac{(\text{Residual SS})/(n-2)}{(\text{Total SS})/(n-1)}.$

   For a given x, the expectation of y is $\mu(x) = \beta_0 + \beta_1 x$. A $(1 - \alpha)$ confidence interval for $\mu(x)$ is

   $\hat\beta_0 + \hat\beta_1 x \pm t_{\alpha/2,\, n-2}\, \hat\sigma \left\{ \dfrac{\sum_{i=1}^n (x_i - x)^2}{n \sum_{j=1}^n (x_j - \bar x)^2} \right\}^{1/2},$

   and a predictive interval which covers y with probability $(1 - \alpha)$ is

   $\hat\beta_0 + \hat\beta_1 x \pm t_{\alpha/2,\, n-2}\, \hat\sigma \left\{ 1 + \dfrac{\sum_{i=1}^n (x_i - x)^2}{n \sum_{j=1}^n (x_j - \bar x)^2} \right\}^{1/2}.$

3. One-way ANOVA
   • Total variation: $\sum_{j=1}^k \sum_{i=1}^{n_j} (X_{ij} - \bar X)^2$
   • Between-treatments variation: $B = \sum_{j=1}^k n_j (\bar X_{\cdot j} - \bar X)^2$
   • Within-treatments variation: $W = \sum_{j=1}^k \sum_{i=1}^{n_j} (X_{ij} - \bar X_{\cdot j})^2$

4. Two-way ANOVA
   • Total variation: $\sum_{i=1}^r \sum_{j=1}^c (X_{ij} - \bar X)^2$
   • Between-blocks (rows) variation: $B_{row} = c \sum_{i=1}^r (\bar X_{i\cdot} - \bar X)^2$
   • Between-treatments (columns) variation: $B_{col} = r \sum_{j=1}^c (\bar X_{\cdot j} - \bar X)^2$
   • Residual (Error) variation: $\sum_{i=1}^r \sum_{j=1}^c (X_{ij} - \bar X_{i\cdot} - \bar X_{\cdot j} + \bar X)^2$
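As an illustration of how the simple linear regression formulas on the formula sheet are applied, the following is a minimal sketch in Python that works from summary sums only, using the sums given in Question 10 as input. The function name fit_from_sums and the dictionary keys are hypothetical, chosen for this example; they are not part of the examination paper. The sketch relies on the standard identities that the centred sum of squares of x equals the raw sum of squares minus n times the squared mean, and that Residual SS equals Total SS minus Regression SS.

```python
import math


def fit_from_sums(n, sum_x, sum_y, sum_xx, sum_xy, sum_yy):
    """Least squares fit of y = b0 + b1 * x using only summary sums.

    Hypothetical helper for illustration; names are not from the exam paper.
    """
    x_bar = sum_x / n
    y_bar = sum_y / n

    # Centred sums: sum (x - x_bar)^2, sum (x - x_bar)(y - y_bar), sum (y - y_bar)^2.
    s_xx = sum_xx - n * x_bar ** 2
    s_xy = sum_xy - n * x_bar * y_bar
    s_yy = sum_yy - n * y_bar ** 2

    # Least squares estimates from the formula sheet.
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar

    # Residual SS = Total SS - Regression SS, with Regression SS = b1^2 * s_xx.
    resid_ss = s_yy - b1 ** 2 * s_xx

    # Estimated error variance uses n - 2 degrees of freedom.
    sigma2_hat = resid_ss / (n - 2)

    # Standard error of b1: square root of the estimated Var(b1) = sigma^2 / s_xx.
    se_b1 = math.sqrt(sigma2_hat / s_xx)

    return {"b0": b0, "b1": b1, "sigma2_hat": sigma2_hat, "se_b1": se_b1}


if __name__ == "__main__":
    # Summary sums as stated in Question 10 (n = 20 weeks).
    result = fit_from_sums(n=20, sum_x=400, sum_y=220,
                           sum_xx=8800, sum_xy=4700, sum_yy=2620)
    for name, value in result.items():
        print(f"{name} = {value:.4f}")
```

With the Question 10 sums, the centred quantities are 800 for x, 300 for the cross-product and 200 for y, so the sketch gives a slope of 0.375 and an intercept of 3.5. The t-tests and the predictive interval additionally need quantiles of the t distribution with n - 2 degrees of freedom, which would be read from the statistical tables rather than computed here.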