EXST 7025 Lack of Fit

EXST7015: Marathon Footrace Example
Mean age and time by gender

The MEANS Procedure

gender=F

Variable      N           Mean        Std Dev        Minimum        Maximum
---------------------------------------------------------------------------
Age         751     33.6790945      9.2423418     17.0000000     61.0000000
TIME        753    271.7157769     45.6046112    151.7500000    437.2700000
---------------------------------------------------------------------------

gender=M

Variable      N           Mean        Std Dev        Minimum        Maximum
---------------------------------------------------------------------------
Age        1805     38.9800554     10.6100317     11.0000000     77.0000000
TIME       1808    246.5106250     42.4283145    134.8800000    428.7800000
---------------------------------------------------------------------------

Scatter plot

gender=F
Plot of TIME*Age.  Legend: A = 1 obs, B = 2 obs, etc.
[Scatter plot: vertical axis TIME (150 to 400), horizontal axis Age (15 to 65).]
NOTE: 2 obs had missing values.

gender=M
Plot of TIME*Age.  Legend: A = 1 obs, B = 2 obs, etc.
[Scatter plot: vertical axis TIME (100 to 450), horizontal axis Age (10 to 80).]
NOTE: 3 obs had missing values.

Quartic model - separate by gender
Quadratic model appears to be best

gender=F

The GLM Procedure
Number of Observations Read     753
Number of Observations Used     751

Dependent Variable: TIME

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                 4         62075.714      15518.928       7.73    <.0001
Error               746       1497806.037       2007.783
Corrected Total     750       1559881.751

R-Square    Coeff Var    Root MSE    TIME Mean
0.039795     16.49141    44.80829     271.7068

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     26569.82118    26569.82118      13.23    0.0003
Age*Age               1     34688.08613    34688.08613      17.28    <.0001
Age*Age*Age           1       769.49722      769.49722       0.38    0.5361
Age*Age*Age*Age       1        48.30901       48.30901       0.02    0.8768

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   1     29.70792350    29.70792350       0.01    0.9032
Age*Age               1     66.75100926    66.75100926       0.03    0.8554
Age*Age*Age           1     80.72370271    80.72370271       0.04    0.8411
Age*Age*Age*Age       1     48.30901287    48.30901287       0.02    0.8768

gender=M

The GLM Procedure
Number of Observations Read    1808
Number of Observations Used    1805

Dependent Variable: TIME

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                 4        177960.120      44490.030      26.09    <.0001
Error              1800       3069748.796       1705.416
Corrected Total    1804       3247708.916

R-Square    Coeff Var    Root MSE    TIME Mean
0.054796     16.75356    41.29668     246.4949

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     110167.2110    110167.2110      64.60    <.0001
Age*Age               1      63624.7152     63624.7152      37.31    <.0001
Age*Age*Age           1       2714.0417      2714.0417       1.59    0.2073
Age*Age*Age*Age       1       1454.1520      1454.1520       0.85    0.3559

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   1     1356.226985    1356.226985       0.80    0.3726
Age*Age               1     1717.023534    1717.023534       1.01    0.3158
Age*Age*Age           1     1855.193170    1855.193170       1.09    0.2971
Age*Age*Age*Age       1     1454.151998    1454.151998       0.85    0.3559

Quadratic model - separate by gender

gender=F

The GLM Procedure
Number of Observations Read     753
Number of Observations Used     751

Dependent Variable: TIME

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                 2         61257.907      30628.954      15.29    <.0001
Error               748       1498623.844       2003.508
Corrected Total     750       1559881.751

R-Square    Coeff Var    Root MSE    TIME Mean
0.039271     16.47384    44.76056     271.7068

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     26569.82118    26569.82118      13.26    0.0003
Age*Age               1     34688.08613    34688.08613      17.31    <.0001

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   1     25598.41747    25598.41747      12.78    0.0004
Age*Age               1     34688.08613    34688.08613      17.31    <.0001

Parameter        Estimate    Standard Error    t Value    Pr > |t|
Intercept     331.9717727       20.64142442      16.08      <.0001
Age            -4.2594366        1.19162916      -3.57      0.0004
Age*Age         0.0682107        0.01639299       4.16      <.0001

gender=M

The GLM Procedure
Number of Observations Read    1808
Number of Observations Used    1805

Dependent Variable: TIME

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                 2        173791.926      86895.963      50.94    <.0001
Error              1802       3073916.990       1705.836
Corrected Total    1804       3247708.916

R-Square    Coeff Var    Root MSE    TIME Mean
0.053512     16.75563    41.30177     246.4949

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     110167.2110    110167.2110      64.58    <.0001
Age*Age               1      63624.7152     63624.7152      37.30    <.0001

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   1     37836.81687    37836.81687      22.18    <.0001
Age*Age               1     63624.71523    63624.71523      37.30    <.0001

Parameter        Estimate    Standard Error    t Value    Pr > |t|
Intercept     280.7072514       10.94798326      25.64      <.0001
Age            -2.6374056        0.56000035      -4.71      <.0001
Age*Age         0.0420317        0.00688229       6.11      <.0001

Quadratic model - separate by gender
Analysis of residuals

gender=F

The UNIVARIATE Procedure
Variable: resid

Moments
N                          751    Sum Weights               751
Mean                         0    Sum Observations            0
Std Deviation       44.7008403    Variance           1998.16512
Skewness            0.67389301    Kurtosis           0.86755007
Uncorrected SS      1498623.84    Corrected SS       1498623.84
Coeff Variation              .    Std Error Mean     1.63115683

Tests for Normality
Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W       0.970257    Pr < W      <0.0001
Kolmogorov-Smirnov    D       0.073169    Pr > D      <0.0100
Cramer-von Mises      W-Sq    1.079704    Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    6.537907    Pr > A-Sq   <0.0050

Histogram                                          #
 170+*                                             1
    .*                                             4
    .**                                            7
    .****                                         13
    .******                                       22
    .*******                                      25
    .**************                               53
  30+******************                           69
    .************************************        142
    .***************************************     154
    .************************************        141
    .*******************                          76
    .********                                     29
    .***                                           9
-110+**                                            6
     ----+----+----+----+----+----+----+----
     * may represent up to 4 counts

[Boxplot of resid, gender=F.]
[Normal probability plot of resid, gender=F: vertical axis -110 to 170, horizontal axis -2 to +2.]

gender=M

The UNIVARIATE Procedure
Variable: resid

Moments
N                         1805    Sum Weights              1805
Mean                         0    Sum Observations            0
Std Deviation       41.2788701    Variance           1703.94512
Skewness            0.54376942    Kurtosis           0.57299212
Uncorrected SS      3073916.99    Corrected SS       3073916.99
Coeff Variation              .    Std Error Mean     0.97160378

Tests for Normality
Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W        0.98382    Pr < W      <0.0001
Kolmogorov-Smirnov    D       0.057579    Pr > D      <0.0100
Cramer-von Mises      W-Sq    1.195687    Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    6.850533    Pr > A-Sq   <0.0050

Histogram                                                  #
 190+*                                                     1
    .*                                                     1
    .*                                                     4
 130+*                                                     7
    .**                                                   16
    .******                                               46
  70+*********                                            74
    .*****************                                   148
    .************************                            209
  10+**********************************                  298
    .***********************************************     419
    .*********************************                   294
 -50+*********************                               183
    .**********                                           87
    .**                                                   12
-110+*                                                     6
     ----+----+----+----+----+----+----+----+----+--
     * may represent up to 9 counts

[Boxplot of resid, gender=M.]
[Normal probability plot of resid, gender=M: vertical axis -110 to 190, horizontal axis -2 to +2.]

Quadratic model fitted to means - separated by gender

gender=F

The GLM Procedure
Number of Observations Used      45

Dependent Variable: timemean
Weight: n

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                 2        61257.9073     30628.9537      17.97    <.0001
Error                42        71588.0980      1704.4785
Corrected Total      44       132846.0053

R-Square    Coeff Var    Root MSE    timemean Mean
0.461120     15.19481    41.28533         271.7068

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     26569.82118    26569.82118      15.59    0.0003
Age*Age               1     34688.08613    34688.08613      20.35    <.0001

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   1     25598.41747    25598.41747      15.02    0.0004
Age*Age               1     34688.08613    34688.08613      20.35    <.0001

Parameter        Estimate    Standard Error    t Value    Pr > |t|
Intercept     331.9717727       19.03881485      17.44      <.0001
Age            -4.2594366        1.09911053      -3.88      0.0004
Age*Age         0.0682107        0.01512023       4.51      <.0001

gender=M

The GLM Procedure
Number of Observations Used      59

Dependent Variable: timemean
Weight: n

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                 2       173791.9262     86895.9631      44.70    <.0001
Error                56       108870.6920      1944.1195
Corrected Total      58       282662.6182

R-Square    Coeff Var    Root MSE    timemean Mean
0.614839     17.88766    44.09217         246.4949

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     110167.2110    110167.2110      56.67    <.0001
Age*Age               1      63624.7152     63624.7152      32.73    <.0001

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   1     37836.81687    37836.81687      19.46    <.0001
Age*Age               1     63624.71523    63624.71523      32.73    <.0001

Parameter        Estimate    Standard Error    t Value    Pr > |t|
Intercept     280.7072514       11.68764276      24.02      <.0001
Age            -2.6374056        0.59783468      -4.41      <.0001
Age*Age         0.0420317        0.00734727       5.72      <.0001

Estimates of Age at the minimum time and the mean time for that age.

Estimates for Males
Parameter             Full data set    Weighted means    Unweighted means
Intercept               280.7072514       280.7072514         267.6384295
Age                      -2.6374056        -2.6374056          -2.2006212
Age*Age                   0.0420317         0.0420317           0.0393577
Age at the minimum      31.37400581       31.37400581         27.95667938
Time at Age             239.3342621       239.3342621         236.8773988

Estimates for Females
Parameter             Full data set    Weighted means    Unweighted means
Intercept               331.9717727       331.9717727         325.6661219
Age                      -4.2594366        -4.2594366          -3.6279436
Age*Age                   0.0682107         0.0682107           0.0571346
Age at the minimum      31.22264249       31.22264249         31.74909424
Time at Age             265.4763396       265.4763396         268.0741603

Quadratic model fitted to means - separated by gender
Analysis of residuals

gender=F

The UNIVARIATE Procedure
Variable: timemean

Moments
N                           46    Sum Weights                46
Mean                280.592144    Sum Observations   12907.2386
Std Deviation       24.2205729    Variance           586.636152
Skewness            1.37948233    Kurtosis           1.68118152
Uncorrected SS      3648068.39    Corrected SS       26398.6268
Coeff Variation     8.63194976    Std Error Mean     3.57112865

Tests for Normality
Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W       0.851385    Pr < W      <0.0001
Kolmogorov-Smirnov    D       0.209628    Pr > D      <0.0100
Cramer-von Mises      W-Sq     0.48012    Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    2.671507    Pr > A-Sq   <0.0050

[Stem-and-leaf plot and boxplot of timemean, gender=F: stems 23 to 35. Multiply Stem.Leaf by 10**+1.]
[Normal probability plot of timemean, gender=F: vertical axis 235 to 355, horizontal axis -2 to +2.]

gender=M

The UNIVARIATE Procedure
Variable: timemean

Moments
N                           60    Sum Weights                60
Mean                257.936116    Sum Observations    15476.167
Std Deviation       29.3291978    Variance           860.201845
Skewness            1.37205209    Kurtosis           2.76842107
Uncorrected SS      4042614.31    Corrected SS       50751.9089
Coeff Variation     11.3707217    Std Error Mean     3.78638316

Tests for Normality
Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W       0.892458    Pr < W      <0.0001
Kolmogorov-Smirnov    D       0.153812    Pr > D      <0.0100
Cramer-von Mises      W-Sq    0.416692    Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    2.205646    Pr > A-Sq   <0.0050

[Stem-and-leaf plot and boxplot of timemean, gender=M: stems 20 to 36. Multiply Stem.Leaf by 10**+1.]
[Normal probability plot of timemean, gender=M: vertical axis 205 to 365, horizontal axis -2 to +2.]

Quadratic model fitted to means - separated by gender
Analysis of residuals

gender=F
Plot of timemean*Age.  Symbol used is 'x'.
Plot of pred*Age.      Symbol used is 'o'.
[Overlay plot, gender=F: vertical axis timemean (220 to 360), horizontal axis Age (15 to 65).]
NOTE: 2 obs had missing values.  4 obs hidden.

gender=M
Plot of timemean*Age.  Symbol used is 'x'.
Plot of pred*Age.      Symbol used is 'o'.
[Overlay plot, gender=M: vertical axis timemean (200 to 380), horizontal axis Age (10 to 80).]
NOTE: 2 obs had missing values.  8 obs hidden.
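
The "Age at the minimum" and "Time at Age" rows in the estimates tables are consistent with the vertex of the fitted quadratic, Age = -b1 / (2*b2). The Python sketch below recomputes them from the reported full-data coefficients. The function name `quadratic_minimum` is hypothetical and is not part of the SAS program that produced this listing; small differences in the last digits are expected because the printed coefficients are rounded.

```python
def quadratic_minimum(b0, b1, b2):
    """Return (age at the minimum, predicted time at that age) for
    time = b0 + b1*age + b2*age**2, assuming b2 > 0 (curve opens upward)."""
    age_min = -b1 / (2.0 * b2)
    time_min = b0 + b1 * age_min + b2 * age_min ** 2
    return age_min, time_min


# Coefficients copied from the full-data quadratic fits in the listing.
males = quadratic_minimum(280.7072514, -2.6374056, 0.0420317)
females = quadratic_minimum(331.9717727, -4.2594366, 0.0682107)

print("Males:   age %.3f, time %.3f" % males)
print("Females: age %.3f, time %.3f" % females)
```

The same function applied to the unweighted-means coefficients should reproduce the third column of each estimates table to within rounding.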

Lack of Fit calculations

gender=F

The GLM Procedure

Class Level Information
Class       Levels    Values
Also_age        45    17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
                      40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61

Number of Observations Used     751

Dependent Variable: TIME

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                44        132846.009       3019.227       1.49    0.0227
Error               706       1427035.742       2021.297
Corrected Total     750       1559881.751

R-Square    Coeff Var    Root MSE    TIME Mean
0.085164     16.54682    44.95884     271.7068

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     26569.82118    26569.82118      13.14    0.0003
Age*Age               1     34688.08613    34688.08613      17.16    <.0001
Also_age             42     71588.10138     1704.47860       0.84    0.7493

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   0         0.00000              .          .         .
Age*Age               0         0.00000              .          .         .
Also_age             42     71588.10133     1704.47860       0.84    0.7493

gender=M

Class Level Information
Class       Levels    Values
Also_age        59    11 12 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
                      37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
                      60 61 62 63 64 65 66 67 69 70 72 73 77

Number of Observations Used    1805

Dependent Variable: TIME

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                58        282662.620       4873.493       2.87    <.0001
Error              1746       2965046.296       1698.194
Corrected Total    1804       3247708.916

R-Square    Coeff Var    Root MSE    TIME Mean
0.087034     16.71805    41.20915     246.4949

Source               DF       Type I SS    Mean Square    F Value    Pr > F
Age                   1     110167.2110    110167.2110      64.87    <.0001
Age*Age               1      63624.7152     63624.7152      37.47    <.0001
Also_age             56     108870.6938      1944.1195       1.14    0.2180

Source               DF     Type III SS    Mean Square    F Value    Pr > F
Age                   0          0.0000              .          .         .
Age*Age               0          0.0000              .          .         .
Also_age             56     108870.6938      1944.1195       1.14    0.2180
...
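
The Also_age lines above are the lack-of-fit tests: the lack-of-fit sum of squares is the quadratic model's error SS minus the pure-error SS from the model with Age as a class variable, and F is the ratio of the two mean squares. The Python sketch below reproduces that arithmetic from the sums of squares reported in this listing. The function name `lack_of_fit_f` is hypothetical, and the p-values are not recomputed because that would need an F-distribution routine outside the standard library.

```python
def lack_of_fit_f(sse_reduced, df_reduced, sse_pure, df_pure):
    """Lack-of-fit F statistic.

    sse_reduced, df_reduced: error SS and DF of the quadratic model.
    sse_pure, df_pure: error SS and DF of the model that also fits a
    separate mean for every age (pure error).
    Returns (SS lack of fit, DF lack of fit, F).
    """
    ss_lof = sse_reduced - sse_pure
    df_lof = df_reduced - df_pure
    ms_lof = ss_lof / df_lof
    ms_pure = sse_pure / df_pure
    return ss_lof, df_lof, ms_lof / ms_pure


# Values copied from the quadratic and lack-of-fit GLM tables in the listing.
female = lack_of_fit_f(1498623.844, 748, 1427035.742, 706)
male = lack_of_fit_f(3073916.990, 1802, 2965046.296, 1746)

print("Females: SS=%.3f DF=%d F=%.2f" % female)
print("Males:   SS=%.3f DF=%d F=%.2f" % male)
```

The lack-of-fit SS and DF computed this way also match the Error line of the weighted quadratic fits to the age means (71588.0980 on 42 DF for females, 108870.6920 on 56 DF for males).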
