COMM 291 – Application of Statistics in Business (Jan–Apr 2010)
ASSIGNMENT 6 – SOLUTIONS
Question 1
a) Scatterplot and assumptions.
All four assumptions are reasonable; the scatterplot shows an oval shape.
• Linearity – clear clustering around a straight line
• Constant variance – similar scatter around the regression line all along the x-axis
• Normality – no apparent skewness above or below the regression line
• Independence – the 25 websites were selected randomly!
Students were asked to provide hand calculations and show their work in carrying out the regression. I
have provided the Excel output to make it easy to check their results.
SUMMARY OUTPUT
Y = Design Cost; X = # of Pages

Regression Statistics
Multiple R           0.599
R Square             0.359
Adjusted R Sq        0.331
Standard Error    7705.626
Observations            25

ANOVA
              df               SS              MS       F   Significance F
Regression     1     764851722.66    764851722.66   12.88            0.002
Residual      23    1365663621.90     59376679.21
Total         24    2130515344.56

               Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept           9701.33          6055.23     1.60     0.123    -2824.86    22227.52
No. of Pages        1213.45           338.10     3.59     0.002      514.04     1912.85
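
The summary statistics in the Excel output can be cross-checked against the ANOVA sums of squares. The following is a minimal sketch of that check in Python, using only the SS values and n = 25 reported above; the variable names are illustrative and are not part of the Excel output.

```python
import math

# Values copied from the ANOVA table in the Excel output
n = 25
ss_regression = 764851722.66
ss_residual = 1365663621.90
ss_total = 2130515344.56

# Degrees of freedom for simple linear regression (one predictor)
df_regression = 1
df_residual = n - 2

# Mean squares: each sum of squares divided by its degrees of freedom
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Regression statistics derived from the sums of squares
r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)  # positive root, since the slope is positive
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
standard_error = math.sqrt(ms_residual)
f_stat = ms_regression / ms_residual

print(f"R Square       = {r_square:.3f}")
print(f"Multiple R     = {multiple_r:.3f}")
print(f"Adjusted R Sq  = {adjusted_r_square:.3f}")
print(f"Standard Error = {standard_error:.3f}")
print(f"F              = {f_stat:.2f}")
```

Each printed value should agree with the corresponding Excel entry up to rounding.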
b) Estimated regression equation:
ŷ = 9701.33 + 1213.45x
The line is already plotted on the scatterplot above.
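
As an illustration of how the fitted equation is used, the sketch below evaluates ŷ for a given number of pages. The 10-page site is a hypothetical example, not a value from the assignment, and the function name is made up for this sketch. A prediction is only meaningful when x lies within the range of page counts in the sample, which is not reproduced here.

```python
# Coefficients from the estimated regression equation
B0 = 9701.33   # intercept
B1 = 1213.45   # slope: estimated cost per additional page

def predicted_design_cost(pages):
    """Return the fitted design cost for a site with the given number of pages."""
    return B0 + B1 * pages

# Hypothetical example: a 10-page site (not taken from the data set)
example_pages = 10
print(f"Predicted cost for {example_pages} pages: "
      f"{predicted_design_cost(example_pages):.2f}")
```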
c) H₀: β₁ = 0
Hₐ: β₁ ≠ 0
F-stat = 12.88 or t-stat = 3.59; either one gives a P-value of 0.002. (In simple linear regression the two tests are equivalent, since t² = F: 3.59² ≈ 12.88.)
There is strong evidence of a linear relationship between design cost (y) and number of pages (x).
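
The test statistic can be reproduced by hand from the slope and its standard error. The sketch below does this in Python and compares |t| with the two-sided 5% critical value t*(23 df) = 2.069 quoted in part d); it does not compute the P-value itself, which would need a t-distribution table or a statistics library.

```python
# Slope estimate and its standard error, from the coefficients table
slope = 1213.45
se_slope = 338.10

# Two-sided 5% critical value for 23 df, as quoted in part d)
t_critical = 2.069

# t statistic for H0: beta1 = 0
t_stat = slope / se_slope

# In simple linear regression, t squared equals the ANOVA F statistic
t_squared = t_stat ** 2

# Reject H0 at the 5% level when |t| exceeds the critical value
reject_null = abs(t_stat) > t_critical

print(f"t = {t_stat:.2f}, t^2 = {t_squared:.2f}, reject H0: {reject_null}")
```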
d) 95% confidence interval for the marginal cost of an additional page on a new site.
This question asks for a CI for the slope:
From Excel output: (514.04 , 1912.85)
By hand: t*(23 df) = 2.069 from the text and 2.068657 from Excel
If you use 2.069 the CI is: 1213.45 ± 2.069(338.10) = 1213.45 ± 699.53 or (513.92 , 1912.98)
The difference between the hand calculation and Excel is just due to roundoff error.
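
The two versions of the interval can be compared directly. This is a small sketch of the hand calculation in Python, using the rounded slope and standard error from the output; the function name is illustrative only.

```python
# Slope estimate and its standard error, as rounded in the Excel output
slope = 1213.45
se_slope = 338.10

def slope_ci(t_star):
    """Return (lower, upper) for slope ± t_star * SE(slope)."""
    margin = t_star * se_slope
    return slope - margin, slope + margin

# t* for 23 df: rounded table value versus the more precise Excel value
for label, t_star in [("table t* = 2.069", 2.069),
                      ("Excel t* = 2.068657", 2.068657)]:
    lower, upper = slope_ci(t_star)
    print(f"{label}: ({lower:.2f}, {upper:.2f})")
```

Because the inputs here are themselves rounded to two decimals, the second interval may still differ from Excel's in the last digit.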
