SMART FIELD HOMEWORK FOR CHP 7

Course: BUS 603, Spring 2012
School: Rutgers
Word Count: 1690

Task 1: A fashion student was interested in factors that predicted the salaries of catwalk models. She collected data from 231 models. For each model she asked them their salary per day on days when they were working (salary), their age (age), and how many years they had worked as a model (years), and then got a panel of experts from modeling agencies to rate the attractiveness of each model as a percentage, with 100% being perfectly attractive (beauty). The data are in the file Supermodel.sav. Unfortunately, this fashion student bought some substandard statistics text and so doesn't know how to analyze her data. Can you help her out by conducting a multiple regression to see which variables predict a model's salary? How valid is the regression model?

Descriptive Statistics

| Variable | Mean | Std. Deviation | N |
|---|---|---|---|
| Salary per Day (£) | 11.3385 | 16.02644 | 231 |
| Attractiveness (%) | 75.9447 | 6.77303 | 231 |
| Number of Years as a Model | 4.5854 | 1.57865 | 231 |
| Age (Years) | 18.0679 | 2.42190 | 231 |

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change | Durbin-Watson |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .429 | .184 | .173 | 14.57213 | .184 | 17.066 | 3 | 227 | .000 | 2.057 |

Predictors: (Constant), beauty, years, age. Dependent Variable: salary.

ANOVA

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 10871.964 | 3 | 3623.988 | 17.066 | .000 |
| Residual | 48202.790 | 227 | 212.347 | | |
| Total | 59074.754 | 230 | | | |

Dependent Variable: salary. Predictors: (Constant), beauty, years, age.

The adjusted R² (.173) shows some shrinkage from the unadjusted value (.184), which suggests that the model may not generalize well. We could also use Stein's formula:

\[
\text{adjusted } R^2 = 1 - \left[\frac{231-1}{231-3-1}\right]\left[\frac{231-2}{231-3-2}\right]\left[\frac{231+1}{231}\right](1 - 0.184) = 0.159
\]

This value also indicates that the model may not cross-generalize well. The sample comprised 231 models and three predictors, which is adequate for detecting medium to large effects. The model accounts for 18.4% of the variance in salary per day and is a significant fit to the data overall, F(3, 227) = 17.07, p < .001.
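As a rough check on the shrinkage figures quoted above, both adjusted values can be recomputed from R² = .184, n = 231 and k = 3. This is a minimal sketch: the function names are illustrative, and treating the .173 in the Model Summary as the standard adjusted R² is an assumption that the arithmetic happens to support.

```python
def adjusted_r2(r2, n, k):
    """Standard adjusted R^2 for n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def stein_r2(r2, n, k):
    """Stein's formula, an estimate of how well R^2 cross-validates."""
    shrink = ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n)
    return 1 - shrink * (1 - r2)


# Values taken from the Model Summary table for Task 1
r2, n, k = 0.184, 231, 3
print(round(adjusted_r2(r2, n, k), 3))  # 0.173
print(round(stein_r2(r2, n, k), 3))     # 0.159
```

The gap between .184 and .159 is the shrinkage the text refers to: the smaller the gap, the better the model would be expected to hold up in a new sample.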
The tolerance values for age (.079) and years (.082) are below 0.2, and their VIF values are above 10, which indicates a serious collinearity problem in the model. Age and years of experience are almost interchangeable, meaning they measure almost the same thing: older models have generally been modelling for longer. The author's further reasoning is that skin wrinkles with age, which makes it harder to impress in photo shoots even with large amounts of makeup. The assumption of no multicollinearity therefore appears to be violated, so the coefficient estimates for this model may be unreliable.

Coefficients

| Model 1 | B | Std. Error | Beta | t | Sig. | 95% CI Lower | 95% CI Upper | Zero-order | Partial | Part | Tolerance | VIF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (Constant) | -60.890 | 16.497 | | -3.691 | .000 | -93.396 | -28.384 | | | | | |
| age | 6.234 | 1.411 | .942 | 4.418 | .000 | 3.454 | 9.015 | .397 | .281 | .265 | .079 | 12.653 |
| years | -5.561 | 2.122 | -.548 | -2.621 | .009 | -9.743 | -1.380 | .337 | -.171 | -.157 | .082 | 12.157 |
| beauty | -.196 | .152 | -.083 | -1.289 | .199 | -.497 | .104 | .068 | -.085 | -.077 | .867 | 1.153 |

Dependent Variable: salary.

Collinearity Diagnostics

| Dimension | Eigenvalue | Condition Index | (Constant) | age | years | beauty |
|---|---|---|---|---|---|---|
| 1 | 3.925 | 1.000 | .00 | .00 | .00 | .00 |
| 2 | .070 | 7.479 | .01 | .00 | .08 | .02 |
| 3 | .004 | 30.758 | .30 | .02 | .01 | .94 |
| 4 | .001 | 63.344 | .69 | .98 | .91 | .04 |

The last four columns are variance proportions. Dependent Variable: salary.

Casewise Diagnostics

| Case Number | Std. Residual | Salary per Day (£) | Predicted Value | Residual |
|---|---|---|---|---|
| 5 | 4.603 | 95.34 | 28.2647 | 67.07340 |
| 116 | 3.422 | 64.79 | 14.9259 | 49.86537 |
| 135 | 4.672 | 89.98 | 21.8946 | 68.08541 |
| 155 | 3.257 | 74.86 | 27.4025 | 47.45824 |
| 191 | 3.153 | 50.66 | 4.7164 | 45.93938 |
| 198 | 3.510 | 71.32 | 20.1729 | 51.14779 |

Dependent Variable: Salary per Day (£).

Salary per day is significantly predicted by the age of the model, and the relationship is positive (b = 6.23, p < .001). There is a significant negative relationship between the number of years spent modelling and salary (b = -5.56, p = .009). Attractiveness does not significantly predict salary (b = -0.20, p = .199).
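To make the direction of these effects concrete, the unstandardized B values from the coefficients table can be turned into a small prediction function. This is only a sketch: `predict_salary` is a hypothetical helper, not part of the SPSS output, and the coefficients are the rounded values from the table. Because a least-squares model passes through the sample means, the prediction at the mean age, years and attractiveness should land close to the mean salary of 11.34, with a small gap caused by rounding.

```python
# Unstandardized coefficients (B) from the Task 1 coefficients table
B = {"constant": -60.890, "age": 6.234, "years": -5.561, "beauty": -0.196}


def predict_salary(age, years, beauty):
    """Predicted salary per day from the fitted regression equation."""
    return (B["constant"] + B["age"] * age
            + B["years"] * years + B["beauty"] * beauty)


# Sample means from the descriptive statistics table
mean_pred = predict_salary(age=18.0679, years=4.5854, beauty=75.9447)
print(round(mean_pred, 2))  # close to the mean salary per day

# Holding years and attractiveness fixed, one extra year of age
# changes the prediction by the age coefficient.
older = predict_salary(age=19.0679, years=4.5854, beauty=75.9447)
print(round(older - mean_pred, 3))
```

Given the collinearity between age and years, the "holding the other predictors fixed" comparison is largely hypothetical: in this sample the two variables rarely change independently.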
The fitted model is:

\[
\text{Salary} = b_0 + b_1\,\text{age} + b_2\,\text{experience} + b_3\,\text{attractiveness} = -60.89 + (6.23 \times \text{age}) - (5.56 \times \text{experience}) - (0.20 \times \text{attractiveness})
\]

Task 2: Using the Glastonbury data from this chapter (with the dummy coding in GlastonburyDummy.sav), which you should've already analyzed, comment on whether you think the model is reliable and generalizable.

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change | Durbin-Watson |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .276 | .076 | .053 | .68818 | .076 | 3.270 | 3 | 119 | .024 | 1.893 |

Predictors: (Constant), Indie_Kid, Crusty, Metaller. Dependent Variable: change.

ANOVA

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 4.646 | 3 | 1.549 | 3.270 | .024 |
| Residual | 56.358 | 119 | .474 | | |
| Total | 61.004 | 122 | | | |

Dependent Variable: change. Predictors: (Constant), Indie_Kid, Crusty, Metaller.

Coefficients

| Model 1 | B | Std. Error | Beta | t | Sig. | 95% CI Lower | 95% CI Upper | Zero-order | Partial | Part | Tolerance | VIF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (Constant) | -.554 | .090 | | -6.134 | .000 | -.733 | -.375 | | | | | |
| Crusty | -.412 | .167 | -.232 | -2.464 | .015 | -.742 | -.081 | -.203 | -.220 | -.217 | .879 | 1.138 |
| Metaller | .028 | .160 | .017 | .177 | .860 | -.289 | .346 | .112 | .016 | .016 | .874 | 1.144 |
| Indie_Kid | -.410 | .205 | -.185 | -2.001 | .048 | -.816 | -.004 | -.147 | -.180 | -.176 | .909 | 1.100 |

Dependent Variable: change.

Collinearity Diagnostics

| Dimension | Eigenvalue | Condition Index | (Constant) | Crusty | Metaller | Indie_Kid |
|---|---|---|---|---|---|---|
| 1 | 1.727 | 1.000 | .14 | .08 | .08 | .05 |
| 2 | 1.000 | 1.314 | .00 | .07 | .36 | .30 |
| 3 | 1.000 | 1.314 | .00 | .38 | .04 | .33 |
| 4 | .273 | 2.515 | .86 | .48 | .52 | .32 |

The last four columns are variance proportions. Dependent Variable: change.

Casewise Diagnostics

| Case Number | Std. Residual | change | Predicted Value | Residual |
|---|---|---|---|---|
| 31 | -2.302 | -2.55 | -.9658 | -1.58417 |
| 153 | 2.317 | 1.04 | -.5543 | 1.59431 |
| 202 | -2.653 | -2.38 | -.5543 | -1.82569 |
| 346 | -2.479 | -2.26 | -.5543 | -1.70569 |
| 479 | 2.215 | .97 | -.5543 | 1.52431 |

Dependent Variable: change.

Normality: the histogram and the normal P-P plot of the residuals both indicate a normal distribution. The histogram is approximately normal in shape, and in the P-P plot the dashed line does not deviate from the straight line, so the assumption of normality has been met.

Homoscedasticity means that the residuals at each level of the predictor(s) should have the same variance. When the variances are unequal, there is heteroscedasticity. In the scatterplot of ZPRED against ZRESID, the vertical spread of the residuals is the same at each level of the predictors, indicating that they have the same variance. This supports the assumption of homoscedasticity and suggests that the model is reliable.

Task 3: A study was carried out to explore the relationship between Aggression and several potential predicting factors in 666 children that had an older sibling.
Variables measured were Parenting Style (high score = bad parenting practices), Computer Games (high score = more time spent playing computer games), Television (high score = more time spent watching television), Diet (high score = the child has a good diet low in E-numbers), and Sibling Aggression (high score = more aggression seen in their older sibling). Past research indicated that parenting style and sibling aggression were good predictors of the level of aggression in the younger child. All other variables were treated in an exploratory fashion. The data are in the file Child Aggression.sav. Analyse them with multiple regression.

Past research tells us that parenting style and sibling aggression are good predictors of the level of aggression in the younger child, so these two variables are entered separately from the others. Using Analyze > Regression > Linear, I entered parenting style and sibling aggression in the first step (forced entry) and the remaining variables in a second step (stepwise), which gives the output shown below:

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change | Durbin-Watson |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .231 | .053 | .050 | .31125 | .053 | 18.644 | 2 | 663 | .000 | |
| 2 | .264 | .070 | .066 | .30875 | .017 | 11.787 | 1 | 662 | .001 | |
| 3 | .286 | .082 | .076 | .30697 | .012 | 8.682 | 1 | 661 | .003 | 1.911 |

Model 1 predictors: (Constant), Sibling Aggression, Parenting Style.
Model 2 predictors: (Constant), Sibling Aggression, Parenting Style, Use of Computer Games.
Model 3 predictors: (Constant), Sibling Aggression, Parenting Style, Use of Computer Games, Good Diet.
Dependent Variable: Aggression.

Coefficients

| Model | Predictor | B | Std. Error | Beta | t | Sig. | 95% CI Lower | 95% CI Upper | Tolerance |
|---|---|---|---|---|---|---|---|---|---|
| 1 | (Constant) | -.006 | .012 | | -.479 | .632 | -.029 | .018 | |
| 1 | Parenting Style | .062 | .012 | .194 | 5.057 | .000 | .038 | .086 | .970 |
| 1 | Sibling Aggression | .093 | .038 | .096 | 2.491 | .013 | .020 | .167 | .970 |
| 2 | (Constant) | -.007 | .012 | | -.574 | .566 | -.030 | .017 | |
| 2 | Parenting Style | .054 | .012 | .170 | 4.385 | .000 | .030 | .079 | .937 |
| 2 | Sibling Aggression | .068 | .038 | .070 | 1.793 | .073 | -.006 | .142 | .933 |
| 2 | Use of Computer Games | .126 | .037 | .134 | 3.433 | .001 | .054 | .197 | .918 |
| 3 | (Constant) | -.006 | .012 | | -.497 | .619 | -.029 | .017 | |
| 3 | Parenting Style | .062 | .013 | .194 | 4.925 | .000 | .037 | .087 | .897 |
| 3 | Sibling Aggression | .086 | .038 | .088 | 2.258 | .024 | .011 | .161 | .908 |
| 3 | Use of Computer Games | .143 | .037 | .153 | 3.891 | .000 | .071 | .216 | .893 |
| 3 | Good Diet | -.112 | .038 | -.118 | -2.947 | .003 | -.186 | -.037 | .870 |

Dependent Variable: Aggression.

Excluded Variables

| Model | Variable | Beta In | t | Sig. | Partial Correlation | Tolerance | VIF | Minimum Tolerance |
|---|---|---|---|---|---|---|---|---|
| 1 | Time spent watching television | .049 | 1.091 | .276 | .042 | .704 | 1.421 | .704 |
| 1 | Use of Computer Games | .134 | 3.433 | .001 | .132 | .918 | 1.090 | .918 |
| 1 | Good Diet | -.092 | -2.313 | .021 | -.090 | .894 | 1.119 | .894 |
| 2 | Time spent watching television | .044 | .986 | .324 | .038 | .703 | 1.423 | .703 |
| 2 | Good Diet | -.118 | -2.947 | .003 | -.114 | .870 | 1.150 | .870 |
| 3 | Time spent watching television | .032 | .715 | .475 | .028 | .697 | 1.436 | .669 |

Predictors in the model are as listed under the Model Summary. Dependent Variable: Aggression.

Residuals Statistics

| | Minimum | Maximum | Mean | Std. Deviation | N |
|---|---|---|---|---|---|
| Predicted Value | -.4630 | .3279 | -.0050 | .09139 | 666 |
| Residual | -1.15286 | 1.18037 | .00000 | .30605 | 666 |
| Std. Predicted Value | -5.011 | 3.643 | .000 | 1.000 | 666 |
| Std. Residual | -3.756 | 3.845 | .000 | .997 | 666 |

Dependent Variable: Aggression.

Casewise Diagnostics

| Case Number | Std. Residual | Aggression | Predicted Value | Residual |
|---|---|---|---|---|
| 45 | -3.067 | -.93 | .0106 | -.94162 |
| 157 | 3.845 | 1.13 | -.0529 | 1.18037 |
| 169 | 3.182 | .85 | -.1251 | .97673 |
| 200 | 3.026 | .75 | -.1805 | .92899 |
| 221 | 3.205 | 1.14 | .1543 | .98372 |
| 270 | -3.018 | -.73 | .1936 | -.92649 |
| 439 | -3.092 | -.85 | .1041 | -.94922 |
| 440 | -3.290 | -.95 | .0624 | -1.00982 |
| 463 | -3.756 | -1.15 | .0055 | -1.15286 |
| 482 | 3.476 | 1.07 | .0025 | 1.06707 |
| 505 | -3.223 | -1.12 | -.1284 | -.98938 |
| 539 | 3.416 | 1.18 | .1300 | 1.04877 |

Dependent Variable: Aggression.

The output above can be interpreted as follows, using the final model (Model 3).

Sibling aggression (b = 0.086, β = 0.088, t = 2.26, p < .05) significantly predicted aggression. As aggression in the older sibling increases, aggression in the younger child increases too, so the relationship is positive.

Parenting style (b = 0.062, β = 0.194, t = 4.93, p < .001) significantly predicted aggression. As bad parenting practices increase, aggression increases too, so the relationship is positive.

Computer games (b = 0.143, β = 0.153, t = 3.89, p < .001) significantly predicted aggression. As the time spent playing computer games increases, aggression increases too, so the relationship is positive.

Good diet, low in E-numbers (b = -0.112, β = -0.118, t = -2.95, p < .01), significantly predicted aggression. As the diet improves, aggression decreases, so the relationship is negative.

Only one factor did not predict aggression: television (β = 0.032, t = 0.72, p > .05) was not a significant predictor and was excluded from the model.

From this analysis, parenting style is the most substantive predictor of aggression (it has the largest standardized beta), followed by computer games, good diet and sibling aggression.

The scatterplot of ZPRED against ZRESID shows a random pattern, which indicates no violation of the independence-of-errors assumption, so the residuals can be treated as uncorrelated in this model.
The Durbin-Watson statistic (1.911) also indicates that the errors are reasonably independent. In conclusion, the residual diagnostics show no violations of the assumptions.
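The casewise diagnostics for Task 3 can be sanity-checked by hand, since a standardized residual is the raw residual divided by the standard error of the estimate (.30697 for the final model). The sketch below uses values copied from the casewise table for three cases; the variable names are illustrative, and the count of 12 is simply the number of cases that table lists, all with standardized residuals beyond ±3.

```python
SE_ESTIMATE = 0.30697  # std. error of the estimate for model 3
N_CASES = 666          # sample size

# case number -> raw residual, copied from the casewise diagnostics table
residuals = {45: -0.94162, 157: 1.18037, 463: -1.15286}

# Standardized residual = raw residual / std. error of the estimate
std_resid = {case: r / SE_ESTIMATE for case, r in residuals.items()}
for case, z in std_resid.items():
    print(case, round(z, 3))

listed = 12  # cases listed in the casewise table (|z| > 3)
pct_listed = 100 * listed / N_CASES
print(f"{pct_listed:.1f}% of cases listed")
```

Whether that share of extreme cases is a concern is a judgment call that the text does not address; the sketch only reproduces the numbers in the table.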
