Student Name (please print):
Student Number:

Section:     001              002              003
Instructor:  M. Yalovsky      P. D’Antoni      D. Hart
             M/W 1:05-2:25    T/Th 2:35-3:55   M/W 10:05-11:25

Section:     004              005              006
Instructor:  D. Hart          D. Zhang         D. Zhang
             M/W 2:35-3:55    T/Th 8:35-9:55   T/Th 10:05-11:25

McGill University
Desautels Faculty of Management
FINAL EXAMINATION
December 9th, 2009
2:00 PM - 5:00 PM
Business Statistics
MGCR 271
Examiners: P. D’Antoni, D. Hart, M. Yalovsky, D. Zhang

INSTRUCTIONS:

1. Please write your Name and Student Number on the exam paper.
2. Circle the section number in which you are registered.
3. This is a closed book examination.
4. Calculators that store text will not be permitted.
5. Translation dictionaries are permitted.
6. Note: Each student is permitted to bring with them a crib sheet (8 1/2 x 11, two sided, either written or printed).
7. Statistical tables are provided.
8. The allotted marks for each question are indicated to the left of the question.
9. All examinations, tables and crib sheets must be handed in upon completion of the exam.
10. Total pages for this examination: 17 (including the cover sheet).
11. Please answer each question in the space allotted following the question. Use the back of the pages for rough work.

Best of Success

(7) Question 1

The new Director of the Human Resources Department of a large firm wishes to estimate the average number of working days lost to absenteeism per year for its group of professional employees. She has been told that in similar studies carried on in the past the standard deviations [...] this firm follows a Normal distribution.

a) Determine the number of professional employees' files that should be sampled in order to obtain a 96% confidence interval for the mean which will have a margin of error of no more than two days.

b) Suppose the sample size determined in part a) above had a mean of 6.30 days and a standard deviation of 4.57 days. Determine the 95% confidence interval for the mean.

(8) Question 2

Jupiter Media used a survey to determine how people spend their free time. Watching television
was the most popular activity selected by both men and women. In a sample of 800 men, 248
selected watching television as their most popular leisure time activity. In a sample of 600
women, 156 selected watching television as their most popular leisure time activity.

a) Establish a hypothesis that can be used to test for a difference between the proportion of men and the proportion of women who selected watching television as their most popular leisure time activity. Conduct the hypothesis test and compute the p-value. State your conclusion at the 5% level of significance.

b) Determine a 95% confidence interval for the difference between the population proportions.

(3) Question 3

Containers of coffee on a production line are to be filled with 16 oz of coffee. The quality control staff of the company has set up a system to monitor the mean content of all cans that are filled on this production line. Based on past experience it is known that the amount of coffee per can filled on the production line follows a Normal distribution with standard deviation 0.1 oz. Every hour a sample of 9 cans is selected and their mean content is measured. Overfill and underfill of cans is problematic.

a) You have been asked to establish a hypothesis testing procedure by which you can perform automated hourly tests to determine if the machine performance is satisfactory. Your test will be based on the mean of these 9 hourly observations and you are asked to employ a 5% level of significance for each test. Clearly outline the procedure that you would develop based on the averages of each hourly sample of 9 independent cans.

b) If the true mean content during a particular period has shifted to 16.1 oz, determine the probability that the test developed in a) will correctly detect this deviation from the target value of 16 oz.

(7) Question 4

Consider a random variable X that takes on one of two values 0 and 1 such that
P(X = 1) = p and P(X = 0) = 1 - p.
Suppose we wish to test the hypothesis

H0: p = [...]
Ha: p = [...]

We take two independent observations X1 and X2 of this random variable and set the following decision rule: do not reject H0 if either X1 + X2 = 0 or X1 + X2 = 1; otherwise reject H0.

a) Determine the probability of a Type I error.
b) Determine the power of the test.

(6) Question 5

People at risk of sudden cardiac death can often be identified through the change in a signal
averaged electrocardiogram before and after having undertaken prescribed activities. The current method employed is about 80% accurate. Hoping to improve the accuracy of prediction, a new testing procedure has recently been developed. The new method is tested on 50 people and correct results were obtained for 46 of the 50 patients. You have been asked to test whether these results provide convincing evidence that the new method is more accurate.

Establish the appropriate hypothesis and carry out the statistical test. Your conclusion should provide information on the P-value of the test.

(10) Question 6

An undergraduate student wishes to compare the cost of one and two bedroom apartments in the area of the McGill campus. She collects data for a simple random sample of 10 advertisements of each type and records the advertised rent. She knows that given the tight rental market in the area, renters typically pay the advertised price to secure the apartments. Historically, rents for one and two bedroom apartments in the area have been approximately normally distributed. Listed below are the rents advertised for one and two bedroom apartments (in $ per month).

Some summary statistics are given below:

Mean                 611.50   531.50
Standard deviation   88.26    83.33

a) Determine a 95% confidence interval for the additional cost of a second bedroom.

b) Having taken a course in Statistics, the student wishes to determine if two bedroom apartments rent for significantly more than one bedroom apartments. Carry out the appropriate test of hypothesis; provide your conclusion and the p-value associated with the test.

c) Can you conclude that every one bedroom apartment costs less than every two bedroom apartment?

d) Which do you think is the more useful method of analysis for a student planning to rent an apartment in the University area: the confidence interval or a test of significance? Justify your answer.

(Additional working space is also provided on the next page.)

(7) Question 7

a) Is there evidence of a significant difference among the age groups with respect to major shopping day? Conduct an appropriate hypothesis test to answer the question. Use α = 5%.

b) What factors not given might be relevant to this study?

(13) Question 8

The McGill Health Club has decided to initiate a drive to recruit new members. The managers believe that there might exist a relationship between the amount of advertising spent during a one
month period and the number of new members recruited during that one month period. The data
below presents a partial history of data gathered over the last 20 months.

[Table: Advertising Amount in $ by month; data values not reproduced in this copy]

To what extent can the number of newly recruited members be predicted by the amount of dollars spent on advertising? We have carried out some statistical analysis of the data and the partial results are as follows:

[Figure: scatterplot, horizontal axis Advertising $]

Regression Analysis: New Members versus Advertising $

The regression equation is
New Members =

Predictor        Coef       SE Coef    T   P
Constant         44.597     3.064
Advertising $    0.010580   0.003607

S =        R-Sq = 32.3%    R-Sq(adj) = 28.6%

Analysis of Variance

Source           DF   SS        MS       F      P
Regression        1    647.67   647.67   8.60
Residual Error   18   1355.28
Total            19   2002.95

[Figure: Residual Plots for New Members. Panels: Normal Probability Plot (Residual), Versus Fits (Fitted Value), Versus Order (Observation Order)]

a) Summarize the relationships. Might there be some problems with the data?

b) Determine the equation of the least squares regression line of new members on dollars expended on advertising. Insert the estimated regression line on the plot.

c) Would you conclude that there is a significant linear relationship between the number of new members and the amount spent on advertising? Determine a 95% CI for the slope.

d) If your advertising budget had $500 more, how many additional new members would you expect that the Health Club could recruit? Provide a 95% confidence interval for your estimate.

e) Determine the coefficient of determination and explain its significance.

(12) Question 9

Alumni donations are an important source of revenue for colleges and universities. If
administrators could determine the factors that could lead to increases in the percentage of alumni who make donations, they might be able to implement policies that could lead to increased revenues. Research shows that students who are more satisfied with their contact with teachers are more likely to graduate. As a result, one might suspect that smaller class sizes and lower student-faculty ratios (SFRatio) might lead to a higher percentage of satisfied graduates, which in turn might lead to increases in the percentage of alumni who make a donation. A regression analysis is done to investigate this issue. The following multiple regression model is used:

GivingRate = β0 + β1(Public) + β2(GradRate) + β3(ClassSize) + β4(SFRatio) + ε,

with the usual assumptions on the error term ε. In the above model, the “GivingRate” variable represents the percentage of alumni who made a donation to the university; the “Public” variable takes on the value 1 for public universities and 0 for private universities; the “GradRate” variable indicates the percentage of students who initially enrolled at the university and graduated; the “ClassSize” variable shows the percentage of classes offered with fewer than 20 students; the “SFRatio” variable denotes the student/faculty ratio, or the number of students enrolled divided by the total number of faculty in the University. A partial statistical output is given below:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.816431673
R Square             0.666560677
Adjusted R Square    0.610987457
Standard Error       8.392626673
Observations         29

ANOVA
             df   SS   MS   F   Significance F
Regression
Residual
Total

             Coefficients    Standard Error   t Stat   P-value   Lower 95%       Upper 95%
Intercept    -51.27966052    27.74872385                         -108.5502114    5.99089032
Public       1.284410374     3.448307546                         -5.83254656     8.40136731
GradRate     0.985063232     0.236611042                         0.496722047     1.47340442
ClassSize    0.067798364     0.21310043                          -0.372019304    0.50761603
SFRatio      -0.42158072     0.686273191                         -1.837978961    0.99481752

(a) Write down the equation of the estimated regression model.

(b) A private university has the following data values: “GradRate” = 92, “ClassSize” = 69,
“SFRatio” = 7. Predict the “GivingRate” based on the regression model.

(c) Should “SFRatio” be included in the model given the other variables are included in the model? Provide a statistical justification for your answer. Use α = 5%.

(d) Test for overall significance of the regression model. Clearly state the hypothesis and conclusion.

(e) A second analysis using “SFRatio” as the only explanatory variable is conducted. A partial output from the statistical analysis is reported below. For the sake of parsimony you wish your model to have as few explanatory variables as possible. Can the model containing the “SFRatio” as the only explanatory variable predict the “GivingRate”? Fully explain.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.644029506
R Square             0.414774004
Adjusted R Square    0.393098967
Standard Error       10.48274547
Observations         29

ANOVA
             df   SS            MS   F   Significance F
Regression
Residual
Total             5069.793103

             Coefficients   Standard Error   t Stat      P-value     Lower 95%       Upper 95%
Intercept    51.18915192    5.161866032      9.916792    1.698E-10   40.59787779     61.78042604
SFRatio      -1.87186854    0.42790711       -4.374474   0.0001633   -2.749861392    -0.99387568

(10) Question 10

For each of the following situations, indicate whether the statement is correct or incorrect. Justify your response.

a) A researcher wishing to test equality of two populations' means formulates the hypothesis as H0: Y1 = Y2 versus the two-sided alternative hypothesis Ha: Y1 ≠ Y2.

b) A two-sample z test statistic for testing the difference in proportions gave a P-value of 0.96. From this you can reject the null hypothesis of equality of proportions with 95% confidence.

c) If the P-value for a test of significance is 0.35, we can conclude that the null hypothesis has a 35% chance of being true.

d) In a multiple regression with a sample size of 50 and 4 explanatory variables, the test of the null hypothesis H0: β2 = 0 uses a t test statistic that follows a t distribution with 45 degrees of freedom under the assumption that the null hypothesis is true.

e) One of the assumptions for multiple regressions is that each of the explanatory variables follows a normal distribution.

(12) Question 11

Iron-deficiency anemia is the most common form of malnutrition in developing countries,
affecting about 50% of children and women and 25% of men. Iron pots for cooking foods had
traditionally been used in many of these countries, but they have been largely replaced by
aluminum pots, which are cheaper and lighter. Some research has suggested that food cooked in
iron pots will contain more iron than food cooked in other types of pots. One study designed to
investigate this issue compared the iron content of some Ethiopian foods cooked in aluminum,
clay, and iron pots. One of the foods was yesiga wet’, beef cut into small pieces and prepared with
several Ethiopian spices. The iron content of four samples of yesiga wet’ cooked in each of the
three types of pots is given below. The units are milligrams of iron per 100 grams of cooked food.

A partial computer output is given below:

ANOVA: Single Factor

SUMMARY
Groups     Count   Sum     Average   Variance
Aluminum   4       8.23    2.0575    0.241492
Clay       4       8.71    2.1775    0.386025
Iron       4       18.72   4.68      0.394733

ANOVA
Source of Variation   SS       df   MS   F   P-value   F crit
Between Groups
Within Groups
Total                 20.606

(a) Based on the above data, test whether the average iron content is the same for all three types of pots at the 5% level of significance. Carry out the analysis of variance. Report the F statistic with its degrees of freedom and P-value. State your conclusion.

(b) What assumptions are made in order to carry out the analysis of variance?

(c) Conduct a multiple comparisons procedure to determine which pairs of means differ significantly. Summarize your results.
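
As a rough numerical check on Question 11, the blank ANOVA entries can be rebuilt from the SUMMARY table alone. The sketch below is one possible way to do this and is not an official solution. The variable names are illustrative. The closed-form P-value is valid only because the numerator degrees of freedom equal 2. Fisher's LSD is an assumed choice of multiple comparisons procedure, since the exam does not name one, and the value 2.262 is the t-table entry for an upper-tail area of 0.025 with 9 degrees of freedom.

```python
import math
from itertools import combinations

# Summary statistics copied from the SUMMARY table (4 samples per pot type).
groups = {
    "Aluminum": {"n": 4, "mean": 2.0575, "var": 0.241492},
    "Clay": {"n": 4, "mean": 2.1775, "var": 0.386025},
    "Iron": {"n": 4, "mean": 4.68, "var": 0.394733},
}

n_total = sum(g["n"] for g in groups.values())
k = len(groups)
grand_mean = sum(g["n"] * g["mean"] for g in groups.values()) / n_total

# Between-groups and within-groups sums of squares.
ss_between = sum(g["n"] * (g["mean"] - grand_mean) ** 2 for g in groups.values())
ss_within = sum((g["n"] - 1) * g["var"] for g in groups.values())
ss_total = ss_between + ss_within  # compare with the Total SS in the output

df_between = k - 1
df_within = n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# Upper-tail probability of an F(2, df_within) variable.
# This closed form applies only when the numerator df is exactly 2.
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

# Fisher's LSD (assumed procedure): a pair differs if the absolute
# difference in means exceeds t_crit * sqrt(MSW * (1/n_a + 1/n_b)).
T_CRIT = 2.262  # t table: upper-tail area 0.025, 9 degrees of freedom
pairs = {}
for a, b in combinations(groups, 2):
    diff = abs(groups[a]["mean"] - groups[b]["mean"])
    lsd = T_CRIT * math.sqrt(ms_within * (1 / groups[a]["n"] + 1 / groups[b]["n"]))
    pairs[(a, b)] = (diff, diff > lsd)

print(f"SS between = {ss_between:.3f} (df {df_between})")
print(f"SS within  = {ss_within:.3f} (df {df_within})")
print(f"SS total   = {ss_total:.3f}")
print(f"F = {f_stat:.2f}, P-value = {p_value:.5f}")
for (a, b), (diff, differs) in pairs.items():
    print(f"{a} vs {b}: |difference| = {diff:.4f}, significant at 5%: {differs}")
```

With these inputs the two sums of squares add up to about 20.606, which agrees with the Total printed in the partial output. A different multiple comparisons procedure, such as Tukey's or Bonferroni's, would use a different critical value and may be the one expected in the course.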