Joel Avery
December 2, 2008
BUAD 310 T-Th 10am
Case Study 1 Minitab Analysis

Question 1

PART A)

Histogram of pagecost
Pagecost is skewed to the right. Lower page costs (around 2000-3000) occur more frequently than higher page costs.

Histogram of circ
Circ is skewed to the right. Lower circulation values (around 2000) occur more frequently than higher values.

Histogram of percmale
There is no overall skewness in this histogram. The frequencies are more or less consistent across the range.

Histogram of medianincome
There is no overall skewness in this histogram. The frequencies are more or less consistent across the range.

PART B)

Scatterplot of pagecost vs circ
The relationship is somewhat linear, though it does not look extremely strong. The association is positive, with a couple of outliers.
Correlations: pagecost, circ
Pearson correlation of pagecost and circ = 0.806

Scatterplot of pagecost vs percmale
There is no apparent relationship. The points are scattered randomly with no linear pattern. On its own, this variable does not appear useful for predicting page cost.
Correlations: pagecost, percmale
Pearson correlation of pagecost and percmale = -0.066

Scatterplot of pagecost vs medianincome
There is no apparent relationship here either. There is no linear pattern, and the correlation is essentially zero.
Correlations: pagecost, medianincome
Pearson correlation of pagecost and medianincome = -0.011
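The Pearson correlations in Question 1 Part B come straight from Minitab. As a rough illustration of what that statistic measures, here is a minimal Python sketch of the calculation. The function name pearson_r and the toy numbers are assumptions made for illustration only; they are not the case-study data and will not reproduce the values reported above.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values, NOT the magazine data from the case study.
circ_toy = [1.0, 2.0, 3.0, 4.0, 5.0]
pagecost_toy = [12.0, 19.0, 33.0, 38.0, 52.0]

r = pearson_r(circ_toy, pagecost_toy)
print(round(r, 3))
```

A value near +1 or -1 indicates a strong linear association, and a value near 0 (as with percmale and medianincome above) indicates little or no linear association.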

Question 2

Regression Analysis: pagecost versus circ, percmale, medianincome

The regression equation is
pagecost = - 8643 + 5.28 circ - 11.0 percmale + 1.22 medianincome

Predictor        Coef  SE Coef      T      P
Constant        -8643    12291  -0.70  0.486
circ           5.2815   0.5304   9.96  0.000
percmale       -11.00    77.20  -0.14  0.887
medianincome   1.2226   0.5355   2.28  0.027

S = 13129.3   R-Sq = 69.4%   R-Sq(adj) = 67.3%

Analysis of Variance

Source          DF           SS          MS      F      P
Regression       3  17219203510  5739734503  33.30  0.000
Residual Error  44   7584600772   172377290
Total           47  24803804282

Residuals vs Fits for pagecost

PART A)
The residuals vs fits plot for pagecost shows a random scatter of points, which suggests the regression model is a good fit. The F-statistic (33.30) and its p-value (0.000) show that the multiple regression model is useful at the 5% significance level. The adjusted R-sq (67.3%) also shows a somewhat strong relationship between the explanatory variables and the response variable.

PART B)
pagecost = - 8643 + 5.28 circ - 11.0 percmale + 1.22 medianincome

PART C)
Based on the p-values, I would keep circ (p = 0.000) and medianincome (p = 0.027) because of their small p-values. Both are significant at the 5% level. Percmale (p = 0.887) is not significant.
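The summary statistics in the Question 2 Minitab output can be cross-checked from the Analysis of Variance table and the coefficient table. The short Python sketch below recomputes F, S, R-Sq, R-Sq(adj), and the T ratios using the standard textbook formulas. The variable names are my own, and the inputs are copied from the output above, so small rounding differences from Minitab's internal values are possible.

```python
import math

# Values copied from the Minitab Analysis of Variance table in Question 2.
ss_regression = 17219203510
ss_error = 7584600772
ss_total = 24803804282
df_regression, df_error, df_total = 3, 44, 47

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f_stat = ms_regression / ms_error                # overall F-statistic
s = math.sqrt(ms_error)                          # standard error of regression (S)
r_sq = ss_regression / ss_total                  # R-Sq
r_sq_adj = 1 - ms_error / (ss_total / df_total)  # R-Sq(adj)

# (Coef, SE Coef) pairs copied from the coefficient table.
coefficients = {
    "Constant": (-8643, 12291),
    "circ": (5.2815, 0.5304),
    "percmale": (-11.00, 77.20),
    "medianincome": (1.2226, 0.5355),
}
# Each T ratio is the coefficient divided by its standard error.
t_ratios = {name: coef / se for name, (coef, se) in coefficients.items()}

print(f"F = {f_stat:.2f}, S = {s:.1f}")
print(f"R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}")
for name, t in t_ratios.items():
    print(f"{name}: T = {t:.2f}")
```

The recomputed values agree with the reported F = 33.30, S = 13129.3, R-Sq = 69.4%, and R-Sq(adj) = 67.3%, which confirms that 67.3% is the adjusted figure and 69.4% is the unadjusted one.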
