Oct 2010
Midterm Examination
Advanced Business Statistics
MGSC 272
Oct 22, 9:00 - 11:00 am
Examiner: Prof. Brian Smith

Student Name: ____________   McGill ID: ____________

INSTRUCTIONS:
o This is a CLOSED BOOK examination.
o Only one hand-written or typed double-sided CRIB SHEET is permitted.
o SPACE IS PROVIDED on the examination to answer all questions.
o You are permitted translation or regular dictionaries.
o Regular calculators are permitted. However, calculators that can store text are not permitted.
o The marks for each question appear next to the question number.
o The exam consists of a total of 25 pages, including this cover page and 2 pages for rough work at the
end of the exam.
o This examination paper MUST BE RETURNED.

For Marker Only

Part 2 consists of four questions for a total of 70 marks.

Part 1 (30 marks)

1. A maximum likelihood estimate of a population mean μ for a normally distributed variable X with a
standard deviation of 3 is to be calculated based on a sample of four values x1, x2, x3, and x4. Consider the following statements:

I. The likelihood function is given by $\prod_{i=1}^{4} \frac{1}{\sqrt{2\pi}\,(3)}\, e^{-\frac{1}{2}\left(\frac{x_i-\mu}{3}\right)^2}$
II. The likelihood function is given by $\frac{1}{324\pi^2}\, e^{-\frac{1}{18}\sum_{i=1}^{4}(x_i-\mu)^2}$
III. The likelihood function is given by $\left[\frac{1}{\sqrt{2\pi}\,(3)}\right]^{4} e^{-\frac{1}{18}\sum_{i=1}^{4}(x_i-\mu)^2}$

Which of the above statements are true?
A. I only
B. I and II only
C. I, II, and III
D. II and III only
E. None of the above answers is correct
Answer: C

2. A supermarket manager is investigating the percentage of customers who use discount coupons for
their purchases. She selects a random sample of 25 customers every day for a week and records the
number of customers who use coupons as follows:

Day                      Sun  Mon  Tues  Wed  Thur  Fri  Sat
Number who use coupons     8   10    11    9    12   14   13

The maximum likelihood estimate for the percentage of customers who use coupons is:
Answer: 77/(25*7) = 0.44

3. An accountant wants to estimate the proportion, p, of the population that files its taxes using a
software package. He selects three samples of ten people and obtains the following results: in the
first sample 5 of the 10 say they use a software package; in the second sample 7 people say they will
use software, and in the third sample 4 say they will use it. Find the value of the likelihood function when p = 0.4.

$L(0.4) = \binom{10}{5}(.4)^{5}(.6)^{5} \cdot \binom{10}{7}(.4)^{7}(.6)^{3} \cdot \binom{10}{4}(.4)^{4}(.6)^{6} = .00213736$

Answer: .00213736

4. Maximum Likelihood Estimates are often preferred over Least Squares Estimates because:
I. they explicitly use the probability distribution being estimated.
II. they give more accurate estimates in simple linear regression.
III. they are suitable for estimating parameters of highly volatile variables.
Which of the above statements are true:
A. I only
B. I and III only
C. II and III only
D. III only
E. All of the above statements are true
Answer: B

5. A statistician wants to find a maximum likelihood estimate for the mean of a normal distribution
with a standard deviation of 8. Being lazy, he only obtains two values of the normal distribution, namely 84 and 72. The value of the likelihood function when μ = 80 is:
A. .00043
B. .00266
C. .00071
D. .00133
E. None of the above
Answer: D

6. An economist is studying a lognormal distribution X. The variable ln(X) has a normal distribution with
a mean of 3 and a standard deviation of 0.4. The lognormal distribution density function for variable X is given by:

$f(x) = \frac{1}{x\sqrt{2\pi}\,(0.4)}\, e^{-\frac{1}{2}\left[\frac{\ln x - 3}{0.4}\right]^2}$

The mean of X is ______ and the standard deviation is ______

$E(X) = e^{3 + \frac{(0.4)^2}{2}} = e^{3.08} = 21.7584$
$VAR(X) = e^{6.16}\left[e^{(0.4)^2} - 1\right] = e^{6.16}(.173511) = 82.1449$
$\sigma(X) = \sqrt{82.1449} = 9.06$

$f(30) = \frac{1}{\sqrt{2\pi}\,(30)(0.4)}\, e^{-.503} = .0332456\, e^{-.503} = .0201$

8. Consider the following statements concerning lognormal distributions:
I. Future values of stock options follow a lognormal distribution. Future values are hard to predict
because the volatility increases with time.
II. The future value of an option follows a lognormal distribution. A 95% confidence interval for the
future value one year from today will be more precise than a 95% confidence interval for the future
value in 6 months time.
III. For a fixed value of μ, the larger the value of σ the more positively skewed the lognormal
distribution will be.
Which of the above statements are true:
A. I only
B. II and III only
C. I and III only
D. I, II, and III
E. None of the above statements are true
Answer: C

9. An initial stock price is $50. The expected return is 7% per annum, and the volatility is estimated to be 15% per annum. The mean stock price in 9 months is estimated to be ______ and the standard
deviation is estimated to be ______
Answer: $52.25 and $1.1387

$\ln(50) + \left[.07 - \frac{(.15)^2}{2}\right] \times .75 = 3.956$
Mean stock price $= e^{3.956} = 52.25$
Std. Dev. of log price $= \sigma\sqrt{t} = .15\sqrt{0.75} = .1299$
Std. Dev. of price $= e^{.1299} = 1.1387$

10. Refer to the data from Question 9. A 90% confidence interval for the value of the stock after 9 months is:
$3.956 \pm 1.645(.1299)$
$3.956 \pm .2137$
$3.7423 \le \ln S_T \le 4.1697$
$42.19 \le S_T \le 64.70$

11. Consider the following statements regarding simple linear regression.
I. The model is appropriate only if the assumption of heteroscedasticity is satisfied.
II. Parameter estimates are reliable only if the value of the dependent variable lies within the range of
observed data.
III. Prediction intervals for an individual value of the Y variable are wider when the given X value is
further from the mean X value.
Which of the above statements are true?
A. I only
B. II and III only
C. II only
D. III only
E. None of the above statements are true.
Answer: [illegible]

12. For a multiple regression model, consider the following statements regarding multicollinearity:
I. If two independent variables are highly correlated with the dependent variable with correlations in
excess of 0.90, the model will contain multicollinearity.
II. Regression models containing multicollinearity will give reliable confidence interval estimates of
the mean value of the dependent variable for given values of the independent variables.
III. In models involving significant interaction between two variables, neither of the two variables involved in the interaction can be dropped from the model, even if the associated p-value is greater
than 0.05.
Which of the above statements are true:
A. I and II only
B. II and III only
C. II only
D. III only
E. All of the above statements are true.
Answer: B

The following information pertains to the next three questions:

A contractor wishes to determine a relationship between house size (Y) and the variables family income
(X1), family size (X2), and education of the head of household (X3). House size is measured in hundreds of
square feet, income is measured in thousands of dollars, and education is measured in years. A partial
computer output is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R
R Square
Adjusted R Square
Standard Error
Observations

ANOVA
              df    SS        MS       Signif F
Regression     4    3605.77   901.44   0.0001
Residual                      26.98
Total         49    4820.00

                 Coeff.   St. Error   t Stat
Intercept        -1.63    5.81        -0.281
Family Income     0.45    0.11         3.955
Family Size       4.26    0.81         5.286
Education        -0.65    0.43        -1.509

13. Ans: [illegible]

14. What minimum annual income would an individual with a family size of 4 and 16 years of education
need to attain a predicted 10,000 square foot home?
Ans: $211,089

15. One individual in the sample had an annual income of $100,000, a family size of 10, and an education of 16 years. This individual owned a home with an area of 7,000 square feet. What is the
residual (in hundreds of square feet) for this data point?
Ans: -5.40

PART 2 (70 marks)

QUESTION 1 (18 marks)

A human resources manager wants to investigate factors affecting salary. She has conducted a study of 100
public service employees and has collected the following data:

Salary (Y): in Canadian dollars
Education (X1): in years of education
Experience (X2): in years of experience
Gender (X3): = 0 for female employees, 1 for male employees

An extract of the data is shown below:

[Data extract not recoverable]

Consider the following regression models:
[Regression Analysis: Salary versus Gender: output not recoverable]

Regression Analysis: Salary versus Education, Experience, Gender

The regression equation is
Salary = - 5338 + 2113 Education + [illegible] Experience + 1851 Gender

Predictor    Coef          SE Coef
Constant     [illegible]   [illegible]
Education    2113          [illegible]
Experience   [illegible]   [illegible]
Gender       1851          [illegible]

Regression Analysis: Salary versus Experience

Predictor    Coef          SE Coef       T             P
Constant     [illegible]   [illegible]   5.83          [illegible]
Experience   4153.1        [illegible]   [illegible]   [illegible]

Regression Analysis: Experience versus Gender

The regression equation is
Experience = 12.1 + [illegible] Gender

Predictor    Coef          SE Coef       T
Constant     [illegible]   0.3422        34.34
Gender       [illegible]   [illegible]   [illegible]

Interpret the coefficient of Gender in the first model above, Salary vs Gender. Based on this model would
you conclude that there is a significant difference in salaries of female and male employees? Show your
work. [5 points]

Interpret the coefficient of Gender in the second model above, Salary vs Education, Experience and Gender.
Based on this model would you conclude that there is a significant difference in salaries of female and male
employees? Show your work. [5 points]

A union official recently made the statement “From these data we can see that women, on average, earn less
than men - this proves that there is discrimination against women in the public service sector.”
Do you agree with this statement? Circle one: YES NO [1 point]
Use any information from the printout above to support your conclusion. [4 points]

QUESTION 2 (12 marks)

Refer to the data set and Minitab output in Question 1 above. The regression models of Salary on Education and Gender without and with interaction are shown below:

Regression Analysis: Salary versus Education, Gender

The regression equation is
Salary = 31612 + [illegible] Education + [illegible] Gender

Predictor    Coef          SE Coef       T             P
Constant     [illegible]   [illegible]   1.21          [illegible]
Education    [illegible]   [illegible]   1.74          [illegible]
Gender       [illegible]   5827          [illegible]   [illegible]

Regression Analysis: Salary versus Education, Gender, Education*Gender

The regression equation is
Salary = 98721 - [illegible] Education - [illegible] Gender + [illegible] Education*Gender

Predictor          Coef          SE Coef       T             P
Constant           98721         37733         2.62          [illegible]
Education          [illegible]   [illegible]   [illegible]   [illegible]
Gender             [illegible]   [illegible]   [illegible]   [illegible]
Education*Gender   3311          [illegible]   2.42          [illegible]

Based on the available information, which of the two models would you recommend? Explain your choice,
being as specific as possible. [4 points]

Referring to the model without interaction, predict salary for a male employee with 15 years of education. [2 points]

Referring to the model with interaction, predict salary for a male employee with 15 years of education.
[2 points]

Specify the percentage error associated with selecting the wrong model. [4 points]
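The three calculations asked for above (two predictions and a percentage error) can be sketched as follows. Several coefficients in the printout are not legible, so the numbers passed in below are placeholders chosen for easy arithmetic, not the exam's figures, and the function names are illustrative. "Percentage error" is taken here as the difference relative to the preferred model's prediction, which is one plausible reading of the question.

```python
def predict_additive(b0, b_edu, b_gender, education, gender):
    """Salary = b0 + b_edu*Education + b_gender*Gender (no interaction term)."""
    return b0 + b_edu * education + b_gender * gender


def predict_interaction(b0, b_edu, b_gender, b_inter, education, gender):
    """Same as the additive model plus b_inter * Education * Gender."""
    return (b0 + b_edu * education + b_gender * gender
            + b_inter * education * gender)


def percentage_error(wrong, right):
    """Signed error of the rejected model's prediction, as a % of the preferred one."""
    return 100.0 * (wrong - right) / right


# Placeholder coefficients (hypothetical, NOT read from the printout).
# Male employee (Gender = 1) with 15 years of education.
additive = predict_additive(30000.0, 2000.0, 20000.0, education=15, gender=1)
interaction = predict_interaction(100000.0, -1000.0, -40000.0, 3000.0,
                                  education=15, gender=1)
error_pct = percentage_error(additive, interaction)
print(additive, interaction, round(error_pct, 2))
```

With the real coefficients substituted, the sign of `error_pct` shows whether the model without interaction under- or over-predicts relative to the model with interaction.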
QUESTION 3 (25 marks)

A home economist is conducting a preliminary study on the factors influencing household expenses
(Expenses in $1,000). The explanatory variables being considered for inclusion in the study are family
income (Income in $1,000), family size (Famsize), annual expenditure on gasoline (Gas, in $1,000), average
age of the parents (ParentAge), and whether or not the parents are divorced (1 = divorced, 0 = not divorced).
The preliminary study involves a random sample of 10 families as shown below.

[Data table not recoverable]

The following simple correlation matrix has been obtained using Minitab:

Correlations: Expenses, Income, Famsize, Gas, ParentAge, Divorced

            Expenses      Income         Famsize   Gas            ParentAge
Income      [illegible]
Famsize     [illegible]   0.315
Gas         0.357         -[illegible]   0.336
ParentAge   [illegible]   0.322          0.815     -[illegible]
Divorced    [row only partly legible]

Based on the simple correlation matrix, discuss the problem of multicollinearity in a model with Expenses as
the dependent variable and containing all of the independent variables. Specify two different pairs of
variables that should not be included in the model if one wishes to avoid multicollinearity. [3 points]

Consider the following regression models:

Regression Analysis: Expenses versus Income

The regression equation is
Expenses = 5.27 + 0.0233 Income

Predictor   Coef          SE Coef       T             P
Constant    [illegible]   [illegible]   [illegible]   [illegible]
Income      [illegible]   [illegible]   0.48          0.647

Analysis of Variance
Source           DF   SS       MS       F      P
Regression        1   0.1871   0.1871   0.23   0.647
Residual Error    8   6.6129   0.8266
Total             9   6.8

Regression Analysis: Expenses versus Famsize

The regression equation is
Expenses = 4.87 + 0.357 Famsize

Predictor   Coef          SE Coef
Constant    [illegible]   [illegible]
Famsize     [illegible]   [illegible]

Regression Analysis: Expenses versus Income, Famsize

The regression equation is
Expenses = [illegible] - 0.0074 Income + 0.355 Famsize

Predictor   Coef          SE Coef
Constant    [illegible]   [illegible]
Income      [illegible]   0.04131
Famsize     [illegible]   0.1547

S = 0.723184   R-Sq = [illegible]

Analysis of Variance
Source           DF   SS            MS
Regression        2   [illegible]   [illegible]
Residual Error    7   [illegible]   [illegible]
Total             9

Calculate the coefficient of partial determination when Famsize is added to the model containing Income. Interpret this value. [4 points]
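As a sketch of the calculation this question asks for: the coefficient of partial determination compares the error sum of squares of the reduced model (Income only) with that of the fuller model (Income and Famsize). The SSE values below are placeholders, since the second ANOVA table is not fully legible in this copy, and the function name is illustrative.

```python
def partial_determination(sse_reduced, sse_full):
    """Share of the reduced model's unexplained variation that the added
    variable removes: (SSE_reduced - SSE_full) / SSE_reduced."""
    if sse_reduced <= 0:
        raise ValueError("sse_reduced must be positive")
    if sse_full < 0 or sse_full > sse_reduced:
        raise ValueError("sse_full must lie between 0 and sse_reduced")
    return (sse_reduced - sse_full) / sse_reduced


# Placeholder sums of squares (hypothetical, NOT the exam's values).
sse_income = 6.0          # SSE of Expenses on Income
sse_income_famsize = 4.5  # SSE of Expenses on Income and Famsize
r2_partial = partial_determination(sse_income, sse_income_famsize)
print(r2_partial)
```

With these placeholder inputs the result would be read as: adding Famsize removes 25% of the variation in Expenses left unexplained by Income alone.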
Perform a test of hypothesis at the 10% level of significance to determine whether it is worth adding Famsize to the model already containing Income. [5 points]

The home economist now decides to investigate an alternative model by regressing Expenses on ParentAge
and Divorced. The model is shown below:

Regression Analysis: Expenses versus ParentAge, Divorced

The regression equation is
Expenses = [illegible] + 0.0752 ParentAge - 0.232 Divorced

Predictor   Coef          SE Coef       T             P
Constant    [illegible]   [illegible]   [illegible]   [illegible]
ParentAge   [illegible]   0.02144       [illegible]   [illegible]
Divorced    -0.2323       [illegible]   [illegible]   [illegible]

S = [illegible]   R-Sq = [illegible]   R-Sq(adj) = [illegible]

Analysis of Variance
Source           DF   SS            MS            F             P
Regression        2   [illegible]   [illegible]   [illegible]   [illegible]
Residual Error    7   [illegible]   [illegible]
Total             9   6.8

Based on the above output, and assuming a 5% level of significance, should both variables be retained in the
model? Explain your answer. [4 points]
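The retention decision above comes down to a t-test on each coefficient with n - k - 1 = 10 - 2 - 1 = 7 degrees of freedom. The sketch below assumes the two-sided 5% critical value of 2.365 for 7 degrees of freedom (a standard table value) and uses placeholder coefficients and standard errors, since the printed ones are only partly legible. The helper names are illustrative.

```python
T_CRIT_DF7_5PCT = 2.365  # two-sided 5% critical value of Student's t with 7 df


def t_statistic(coef, se):
    """t = estimated coefficient divided by its standard error."""
    return coef / se


def retain(coef, se, t_critical):
    """True when |t| exceeds the critical value, i.e. the coefficient is
    significantly different from zero at the chosen level."""
    return abs(t_statistic(coef, se)) > t_critical


# Placeholder (coef, SE) pairs (hypothetical, NOT the exam's exact values).
decisions = {
    "ParentAge": retain(0.08, 0.02, T_CRIT_DF7_5PCT),  # t is about 4.0
    "Divorced": retain(-0.2, 0.4, T_CRIT_DF7_5PCT),    # t is about -0.5
}
print(decisions)
```

A variable whose entry is False would be a candidate for removal, after which the model should be re-estimated before drawing further conclusions.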
The complete model with all five independent variables is as follows:

Regression Analysis: Expenses versus Income, Famsize, Gas, ParentAge, Divorced

The regression equation is
[equation only partly legible]

Predictor   Coef           SE Coef       T             P
Constant    [illegible]    4.733         0.41          [illegible]
Income      -[illegible]   [illegible]   -0.15         [illegible]
Famsize     [illegible]    [illegible]   [illegible]   [illegible]
Gas         1.225          1.231         [illegible]   [illegible]
ParentAge   [illegible]    [illegible]   1.05          0.349
Divorced    -[illegible]   [illegible]   [illegible]   0.222

Interpret the coefficients of ParentAge and Divorced in the above model. [4 points]

Construct a 95% confidence interval for the change in expenses (Expenses) when a family has another child.
Interpret the interval. [5 points]

QUESTION 4 (15 marks)

Refer to the scenario and data of Question 1. The Best Subsets regression of Expenses on all of the independent variables is shown below:

Best Subsets Regression: Expenses versus Income, Famsize, Gas, ParentAge, Divorced

[Marks showing which variables enter each model are not recoverable]

Vars   R-Sq          R-Sq(adj)     Mallows Cp    S
1      62.5          57.8          [illegible]   [illegible]
1      45.6          38.8          3.5           [illegible]
2      65.2          55.3          2.1           [illegible]
2      64.4          54.2          2.2           [illegible]
3      71.4          57.1          [illegible]   [illegible]
3      70.7          [illegible]   3.1           [illegible]
4      [illegible]   [illegible]   [illegible]   [blacked out]
4      [illegible]   [illegible]   [illegible]   [illegible]
5      77.2          48.7          6.0           0.62285

Explain why Mallows' Cp has its smallest value for the model containing only the variable ParentAge
whereas the maximum adjusted R-squared is achieved for the model that also includes Famsize, Gas, and Divorced.
[2 points]

Notice that the value of the standard error of the estimate, s, for the model with independent variables
Famsize, Gas, ParentAge, and Divorced is blacked out in the Best Subsets printout above. Given that SSyy = 6.8, compute the missing value of s. [5 points]

The home economist wants to explore the relationship between expenses, gas and parent age. She has
printed out the following data using Minitab:

Descriptive Statistics: Expenses, Gas, ParentAge

Variable    Mean          SE Mean       Variance      Minimum       Maximum
Expenses    [illegible]   0.275         0.756         [illegible]   [illegible]
Gas         [illegible]   0.0821        [illegible]   [illegible]   [illegible]
ParentAge   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]

Regression Analysis: Expenses versus Gas, ParentAge

The regression equation is
Expenses = [illegible] + [illegible] Gas + [illegible] ParentAge

Predictor   Coef          SE Coef       T             P
Constant    [illegible]   [illegible]
Gas         [illegible]   [illegible]
ParentAge   [illegible]   [illegible]

Source           DF   SS
Regression        2   [illegible]
Residual Error    7   [illegible]
Total             9   6.8

Determine the standardized regression coefficient (beta coefficient) of Gas in this multiple regression
model. [3 points]

Computer output for Expenses and ParentAge is as follows:

Descriptive Statistics: Expenses, ParentAge

Variable    Mean          SE Mean       Variance
Expenses    [illegible]   0.275         [illegible]
ParentAge   [illegible]   [illegible]   [illegible]

Regression Analysis: Expenses versus ParentAge

The regression equation is
Expenses = 2.95 + [illegible] ParentAge

Predictor   Coef          SE Coef       T
Constant    2.949         [illegible]   [illegible]
ParentAge   [illegible]   [illegible]   [illegible]

S = [illegible]   R-Sq = [illegible]

Obtain a 98% confidence interval for the mean expenses for a family when ParentAge is equal to 40.
[5 points]
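A minimal sketch of the interval asked for here, using the standard simple-regression formula for the mean response: the fitted value plus or minus t times s times sqrt(1/n + (x0 - xbar)^2 / Sxx), with Sxx = (n - 1) times the sample variance of ParentAge. The critical value 2.896 is the table value of Student's t for a 98% two-sided interval with n - 2 = 8 degrees of freedom. All other inputs below are placeholders, because the printed coefficients, s, mean and variance are not fully legible in this copy. The function name is illustrative.

```python
import math

T_CRIT_DF8_98PCT = 2.896  # t value leaving 1% in each tail, 8 df


def mean_response_ci(b0, b1, s, n, x_mean, x_var, x0, t_critical):
    """Confidence interval for E(Y | x0) in simple linear regression."""
    y_hat = b0 + b1 * x0                  # point estimate at x0
    sxx = (n - 1) * x_var                 # sum of squared deviations of x
    se_mean = s * math.sqrt(1.0 / n + (x0 - x_mean) ** 2 / sxx)
    half_width = t_critical * se_mean
    return y_hat - half_width, y_hat + half_width


# Placeholder inputs (hypothetical, NOT the exam's values).
low, high = mean_response_ci(b0=3.0, b1=0.07, s=0.5, n=10,
                             x_mean=50.0, x_var=80.0, x0=40.0,
                             t_critical=T_CRIT_DF8_98PCT)
print(round(low, 3), round(high, 3))
```

The interval widens as `x0` moves away from `x_mean`, which is why a ParentAge of 40 gives a wider interval than a value close to the sample mean would.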