LSE
Summer 2009 examination

ST102
Elementary Statistical Theory

2008/9 and 2007/8 syllabuses only

Instructions to candidates

Time allowed: 3 hours

Full marks may be obtained for complete answers to FIVE questions. Answer not more than THREE questions from Section A and not more than THREE questions from Section B. Final mark will be capped at 100 in the event of more than 100 marks awarded.

You are supplied with: Graph Paper; Murdoch & Barnes Statistical Tables (4th Edition)

You may also use: A hand-held calculator, which however must not be pre-programmed or able to display graphics, text or algebraic equations. The make and type of machine must be stated clearly on the front cover of the answer book.

©LSE 2009/ST102 Page 1 of 10

Section A

1. a) Suppose A and B are events in a sample space S with p(A) > 0 and p(B) > 0.
i) State what it means for A and B to be mutually exclusive. [1 mark]
ii) State what it means for A and B to be independent. [1 mark]
iii) Show that A and B cannot be both mutually exclusive and independent. [3 marks]
iv) Prove that if A and B are independent, then $A^c$ and $B^c$ are independent, where $A^c$ and $B^c$ are the events complementary to A and B respectively. [5 marks]
v) Give an example to show that if A and B are mutually exclusive, $A^c$ and $B^c$ are not necessarily mutually exclusive. [2 marks]

[You may assume that for any events X and Y, $p(X \cup Y) = p(X) + p(Y) - p(X \cap Y)$.]

b) State Bayes' Theorem (no proof required). [2 marks]
Rebekah is going to a party. There is a 40% chance that Matt will go. If Matt does not go, there is an 80% chance that Rebekah will enjoy herself. If Matt goes, there is only a 10% chance that she will enjoy herself.

i) What is the probability that Rebekah will enjoy the party? [2 marks]
ii) Suppose you know that she enjoyed herself. What is the probability that Matt was not present? [2 marks]
iii) Suppose you know that she did not enjoy herself. What is the probability that Matt was present? [2 marks]

2. a) Suppose a random variable X has the binomial distribution with parameters n and p. Show that the moment generating function of X is given by $m(t) = (q + pe^t)^n$ where
$q = 1 - p$. Hence, or otherwise, find the mean and variance of X in terms of n and p. [9 marks]

b) You may assume that 15% of individuals in a large population are left-handed.

i) If a random sample of 40 individuals is taken, find the probability that exactly 6 are left-handed. [2 marks]
ii) If a random sample of 400 individuals is taken, find the probability that exactly 60 are left-handed by using a suitable approximation. Briefly discuss the appropriateness of the approximation. [5 marks]
iii) What is the smallest possible size of a randomly chosen sample if we wish to be 99% sure of finding at least one left-handed individual in the sample? [4 marks]

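The three calculations in Question 2(b) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the normal approximation with a continuity correction is just one reasonable reading of "a suitable approximation", not necessarily the one the examiners had in mind.

```python
from math import ceil, comb, erf, log, sqrt


def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_approx_pmf(k, n, p):
    """Approximate P(X = k) by a normal with matching mean and variance,
    using a continuity correction of +/- 0.5."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return normal_cdf((k + 0.5 - mu) / sigma) - normal_cdf((k - 0.5 - mu) / sigma)


def smallest_n(p, confidence):
    """Smallest n with P(at least one success) = 1 - (1 - p)^n >= confidence."""
    return ceil(log(1 - confidence) / log(1 - p))


p = 0.15                                     # assumed proportion of left-handed individuals
exact_40 = binom_pmf(6, 40, p)               # part (i): exact probability
approx_400 = normal_approx_pmf(60, 400, p)   # part (ii): normal approximation
exact_400 = binom_pmf(60, 400, p)            # exact value, for comparison only
n_min = smallest_n(p, 0.99)                  # part (iii)
```

Comparing `approx_400` with `exact_400` gives a direct way to judge how appropriate the approximation is for n = 400.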
3. A continuous random variable X has probability density function (pdf)

$f(x) = \frac{1}{4}x^3$ for $0 \le x \le 2$, and 0 otherwise.

a) Explain why f(x) can serve as a pdf. [2 marks]
b) Find the mean and mode of the distribution. [3 marks]
c) Determine the cumulative distribution function F(x). [2 marks]
d) Find $\mathrm{var}(X) = \sigma^2$. [2 marks]
e) Find the skewness of X, given by $E[(X - E(X))^3] / \sigma^3$. [4 marks]
f) If a sample of five observations is drawn at random from the distribution, find the probability that all the observations exceed 3/2. [3 marks]
g) If a sample of fifty values is drawn at random from the distribution, estimate the probability that the sample mean exceeds 3/2. State any assumptions made. [4 marks]

4. a) Consider the joint probability function of two random variables X and Y as shown:

[Joint probability table not recoverable from the source; only the column heading X = 1, 2, 3 survives.]

i) Find the value of k. [1 mark]
ii) Describe the marginal distributions of X and Y respectively. [2 marks]
iii) Describe the conditional distribution of X given Y = 2. [2 marks]
iv) Find E(XY). [3 marks]
v) Are X and Y independent? Give reasons for your answer. [2 marks]
vi) Find the covariance of X and Y. [2 marks]
vii) Determine p(X  Y Z 2). [2 marks]

b) Suppose $\{Z_i\}$ ($i = 1, 2, \ldots, k$) are independent identically distributed standardised
normal variables: $Z_i \sim N(0, 1)$ for $i = 1, 2, \ldots, k$. State the distributions of

i) $Z_1^2$
ii) $Z_1^2 / Z_2^2$
iii) $Z_1 / \sqrt{Z_2^2}$
iv) $\frac{1}{k}\sum_{i=1}^{k} Z_i$
v) $\sum_{i=1}^{k} Z_i$
vi) $\dfrac{3(Z_1^2 + Z_2^2)}{2(Z_3^2 + Z_4^2 + Z_5^2)}$

[6 marks]

5. a) Suppose the random variable Y has an exponential distribution: its probability density
function is given by $f(y) = \lambda e^{-\lambda y}$ for $y \ge 0$ and 0 otherwise, where $\lambda > 0$ is the parameter of the distribution.

i) Show that the median and interquartile range are given respectively by $\alpha/\lambda$ and $\beta/\lambda$, where the constants $\alpha$ and $\beta$ should be determined. [5 marks]
ii) Let s and r be positive real numbers. Prove that Y has the memoryless property that $p(Y > s + r \mid Y > r) = p(Y > s)$. [4 marks]
iii) Show that Y has moment generating function $\lambda/(\lambda - t)$ for $t < \lambda$. [3 marks]

b) A sports magazine hired a golf professional to drive four different brands of golf ball. Each ball was driven fifty times and the distance achieved recorded. The outcomes are represented in a boxplot produced by a statistical software package:

[Boxplot: Distance (vertical axis, 180.00 to 280.00) against Ball (horizontal axis, 1.00 to 4.00); figure not reproduced.]

Distances have been measured in yards and the four types of ball labelled 1, 2, 3, 4. Give at least three salient features of the data which will aid interpretation. [6 marks]

State briefly how you might undertake a more formal analysis of this data set. [2 marks]

Section B

6. (a) A random sample of size 100 produced the sample sums $\sum_i X_i = 309$ and $\sum_i X_i^2 = 2104$.
i. Find the method of moments estimates for the population mean and the population variance. [4 marks]
ii. Compute the standard error for the mean estimate. [3 marks]
iii. Construct an approximate 95% confidence interval for the population mean. [3 marks]

(b) Comment on why the maximum likelihood estimator is in general preferred over a method of moments estimator. [2 marks]

(c) Let $\{X_1, \ldots, X_n\}$ be a random sample from the uniform distribution on the
interval $[-\theta, 0]$, where $\theta > 0$ is an unknown parameter.

i. Find a method of moments estimator for $\theta$. [3 marks]
ii. Find the maximum likelihood estimator for $\theta$. [5 marks]

7. (a) Let $\{X_1, \ldots, X_n\}$ be a sample from a distribution with probability density function

$$f(x, \lambda) = \begin{cases} \frac{1}{2}\lambda^3 x^2 e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise,} \end{cases}$$

where $\lambda > 0$ is an unknown parameter. Find the maximum likelihood estimator for $\lambda$. [6 marks]

(b) A poll used a survey of 500 students to learn how students feel about a statistical inference course. Responses are as follows:

195 students   Course is too easy
227 students   Course is about at the right level
 78 students   Course is too difficult

i. Construct a distribution to represent a population such that the survey result above may be seen as a sample from the population. [2 marks]
ii. Find the maximum likelihood estimates for the parameters in the population. [6 marks]
iii. Derive the explicit formulas for the mean squared errors of the maximum likelihood estimators. [6 marks]

8. (a) A random sample of size 20 from a normal population with mean $\mu$ and variance
$\sigma^2$ produced the sample sums $\sum_i X_i = 40.56$ and $\sum_i X_i^2 = 160.15$.

i. Find a 95% confidence interval for $\sigma^2$. [4 marks]
ii. Repeat (i) above with the additional information $\mu = 2$. [4 marks]
iii. Comment on the differences of the two confidence intervals obtained. [2 marks]

(b) The numbers of customers arriving at a cornershop in 100 intervals of length 1 minute are listed below.

No. of customers    0    1    2    3    4   ≥5
No. of intervals   13   20   34   21    6    6

Use the goodness-of-fit test to test the following hypotheses:

i. the number of customers arriving in a one-minute interval follows a Poisson distribution with mean 2.8, [5 marks]
ii. the number of customers arriving in a one-minute interval follows a Poisson distribution. [5 marks]

(Hint. You may use the fact without the proof that the MLE for the mean of a Poisson distribution is the sample mean.)

9. (a) In a wine tasting event, 8 wine experts rated between 1 (worst) and 10 (best)
two wines as follows

i. Use the sign test to assess at the 5% significance level if Wine A is significantly better than Wine B based on the above scores. [6 marks]
ii. Repeat (i) above using the Wilcoxon signed-rank test. [6 marks]
iii. Which method between the above two tests do you prefer? Give your reasons. [2 marks]

(b) Find the missing values A1, A2, A3, A4, A5 and A6 in the two-way ANOVA table below produced by MINITAB. For A5 and A6, use Statistical Tables by Murdoch and Barnes and select the appropriate answer from 'p ≤ 0.01', '0.01 < p ≤ 0.05', '0.05 < p ≤ 0.1' and 'p > 0.1'. [6 marks]

Two-way ANOVA: C1 versus C2, C3

Source   DF   SS       MS       F        P
C2       2    260.93   130.47   A4       A5
C3       A1   A2       109.50   12.807   A6
Error    8    68.40    A3
Total    14   767.33

10. To consider the effect of a marketing instrument x on the weekly sales volume y of a
certain product, the data over 20 weeks $\{(x_i, y_i),\ i = 1, \ldots, 20\}$ are collected. The following quantities are calculated:

$$\sum_{i=1}^{20} x_i = 400, \quad \sum_{i=1}^{20} y_i = 220, \quad \sum_{i=1}^{20} x_i^2 = 8800, \quad \sum_{i=1}^{20} x_i y_i = 4700, \quad \sum_{i=1}^{20} y_i^2 = 2620.$$

Assume the linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_1, \varepsilon_2, \ldots$ are independent and $N(0, \sigma^2)$.

(a) Find the least squares estimates for $\beta_0$ and $\beta_1$, and write down the fitted regression model. [5 marks]
(b) Compute the standard error for the least squares estimator for $\beta_1$. [6 marks]
(c) Perform a t-test for the null hypothesis $H_0: \beta_1 = 0$ against $H_1: \beta_1 \ne 0$ at the 1% significance level. [3 marks]
(d) Perform a t-test for the null hypothesis $H_0: \beta_1 = 0.37$ against $H_1: \beta_1 > 0.37$ at the 10% significance level. [3 marks]
(e) With x = 18, find a predictive interval which covers y with probability 0.95.
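[3 marks]

Appendix: Formula Sheet

1. Discrete Distributions

Distribution: probability function; mean; variance

Binomial: $\frac{n!}{k!(n-k)!}\pi^k(1-\pi)^{n-k}$, $k = 0, 1, \ldots, n$; mean $n\pi$; variance $n\pi(1-\pi)$
Geometric: $(1-\pi)^{k-1}\pi$, $k = 1, 2, \ldots$; mean $1/\pi$; variance $(1-\pi)/\pi^2$
Negative binomial: $\frac{(k-1)!}{(r-1)!(k-r)!}\pi^r(1-\pi)^{k-r}$, $k = r, r+1, \ldots$; mean $r/\pi$; variance $r(1-\pi)/\pi^2$
Poisson: $\frac{\lambda^k}{k!}e^{-\lambda}$, $k = 0, 1, 2, \ldots$; mean $\lambda$; variance $\lambda$
Hypergeometric: $\binom{S}{k}\binom{N-S}{n-k}\big/\binom{N}{n}$, $k = 0, 1, \ldots, n$; mean $nS/N$; variance $\frac{nS(N-S)(N-n)}{N^2(N-1)}$

2. Simple Linear Regression

Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.

LSE:

$$\hat\beta_0 = \bar{y} - \hat\beta_1\bar{x}, \qquad \hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{j=1}^{n}(x_j - \bar{x})^2},$$

and

$$\mathrm{Var}(\hat\beta_0) = \frac{\sigma^2}{n}\,\frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \mathrm{Cov}(\hat\beta_0, \hat\beta_1) = \frac{-\sigma^2\bar{x}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}.$$

Estimator for the variance of $\varepsilon_i$:

$$\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_i)^2.$$

Regression ANOVA:

$$\text{Total SS} = \sum_{i=1}^{n}(y_i - \bar{y})^2, \qquad \text{Regression SS} = \hat\beta_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad \text{Residual SS} = \sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_i)^2.$$

Squared regression correlation coefficients:

$$R^2 = \frac{\text{Regression SS}}{\text{Total SS}}, \qquad R^2_{\mathrm{adj}} = 1 - \frac{(\text{Residual SS})/(n-2)}{(\text{Total SS})/(n-1)}.$$

For a given x, the expectation of y is $\mu(x) = \beta_0 + \beta_1 x$. A $(1 - \alpha)$ confidence interval for $\mu(x)$ is

$$\hat\beta_0 + \hat\beta_1 x \pm t_{\alpha/2,\,n-2}\,\hat\sigma\left\{\frac{\sum_{i=1}^{n}(x_i - x)^2}{n\sum_{j=1}^{n}(x_j - \bar{x})^2}\right\}^{1/2},$$

and a predictive interval which covers y with probability $(1 - \alpha)$ is

$$\hat\beta_0 + \hat\beta_1 x \pm t_{\alpha/2,\,n-2}\,\hat\sigma\left\{1 + \frac{\sum_{i=1}^{n}(x_i - x)^2}{n\sum_{j=1}^{n}(x_j - \bar{x})^2}\right\}^{1/2}.$$

3. One-way ANOVA

• Total variation: $\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X})^2$
• Between-treatments variation: $B = \sum_{j=1}^{k} n_j(\bar{X}_{\cdot j} - \bar{X})^2$
• Within-treatments variation: $W = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_{\cdot j})^2$

4. Two-way ANOVA

• Total variation: $\sum_{i=1}^{r}\sum_{j=1}^{c}(X_{ij} - \bar{X})^2$
• Between-blocks (rows) variation: $B_{\mathrm{row}} = c\sum_{i=1}^{r}(\bar{X}_{i\cdot} - \bar{X})^2$
• Between-treatments (columns) variation: $B_{\mathrm{col}} = r\sum_{j=1}^{c}(\bar{X}_{\cdot j} - \bar{X})^2$
• Residual (Error) variation: $\sum_{i=1}^{r}\sum_{j=1}^{c}(X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X})^2$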
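
The least squares formulas in section 2 of the formula sheet can be sketched in Python. This is a minimal illustration that assumes only summary sums are available, as in Question 10, and rewrites the centred sums using the identity $\sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$ (and its analogues for the cross product and for y). The function name `fit_from_sums` and the keys of the returned dictionary are hypothetical, chosen for this example only.

```python
from math import sqrt


def fit_from_sums(n, sum_x, sum_y, sum_xx, sum_xy, sum_yy):
    """Simple linear regression y = b0 + b1*x from summary sums (hypothetical helper)."""
    x_bar = sum_x / n
    y_bar = sum_y / n

    # Centred sums of squares and cross products.
    s_xx = sum_xx - n * x_bar**2
    s_xy = sum_xy - n * x_bar * y_bar
    s_yy = sum_yy - n * y_bar**2          # Total SS

    # Least squares estimates.
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar

    # Regression ANOVA quantities.
    regression_ss = b1**2 * s_xx
    residual_ss = s_yy - regression_ss

    # Estimated error variance and standard error of the slope.
    sigma2_hat = residual_ss / (n - 2)
    se_b1 = sqrt(sigma2_hat / s_xx)

    return {
        "b0": b0,
        "b1": b1,
        "sigma2_hat": sigma2_hat,
        "se_b1": se_b1,
        "r2": regression_ss / s_yy,
    }


# Summary sums quoted in Question 10 (n = 20 weeks).
fit = fit_from_sums(20, 400, 220, 8800, 4700, 2620)
```

With the Question 10 sums, `fit["b1"]` is 0.375 and `fit["b0"]` is 3.5. The t-tests and the predictive interval still need a critical value from the statistical tables, which this sketch does not attempt to reproduce.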