01 LOF and ANCOVA

Biological Population Statistics II

SIMPLE LINEAR REGRESSION (review - unify concepts and notation)

1) MODEL: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ or $Y_{ij} = \beta_0 + \beta_1 X_i + \varepsilon_{ij}$
   a) subscript “i” identifies individual values of X
   b) subscript “j”, when used, identifies individual values of Y within the individual values of $X_i$
   c) $\beta_0$ and $\beta_1$ are regression coefficients and are population parameters
      (1) $\beta_0$ is the population intercept
      (2) $\beta_1$ is the population slope
   d) $\varepsilon_{ij}$ is the error term (deviations from regression, or residuals)
   e) The deviations from regression can be separated into two components
      (1) PURE ERROR - deviations of $Y_{ij}$ within individual $X_i$ values
          $\sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i.})^2$, pooled over the $X_i$ values, gives the pooled within-$X_i$ sum of squares.
          This is the same error which would be obtained if an ANOVA was done using $X_i$ as "treatments" (CRD, possibly unbalanced).
          This error can be calculated only if there is some replication of $Y_{ij}$ values at some of the $X_i$ values.
      (2) LACK OF FIT - the difference between the sum of squares for Error (SSError) for regression and the PURE ERROR.
          This is also the sum of squared deviations from the regression line of the mean Y values at each $X_i$ value (e.g. $\bar{Y}_{i.}$), i.e.
          $\sum_{i=1}^{t} (\hat{Y}_{i.} - \bar{Y}_{i.})^2$
          (with each term weighted by the number of observations at that $X_i$ when the replication is unequal).
          This measures the failure of the means (at each $X_i$ value) to fall on the regression line.
          This may be the only available error if the data analyzed are means, or if the PURE ERROR is not measurable (only 1 observation at each $X_i$ value).
          REPLICATION IS NEEDED to calculate a pure error term apart from the LOF.
          LOF can be tested and is used to evaluate the curvature or adequacy of a particular fitted model.
          Note that no model can explain PE: the $R^2$ can never reach 100% if PE exists, while we could potentially explain 100% of the Lack of Fit.
      (3) PURE ERROR AND LACK OF FIT GRAPHICALLY
          [Figure not recoverable from the extracted text. Labels: Y variable axis, intercept, X variable axis.]
          where:
          ─ represents the mean of $Y_{ij}$ at $X_i$ [i.e. $\bar{Y}_{i.}$]
          | represents the distribution of $Y_{ij}$ at $X_i$
          } indicates the deviations contributing to PURE ERROR
          ↕ indicates the deviations contributing to LACK OF FIT
      (4) PE and LOF in SAS
          DATA ONE; INPUT Y X; CX = X;
          PROC GLM; CLASSES CX; MODEL Y = X CX; RUN;
          X fits the regression (use TYPE I SS, since TYPE III SS do not exist for X once CX is in the model).
          CX fits means (as an ANOVA) and its TYPE I SS gives the LOF.
          The residual error is pure error.

General Linear Hypothesis Test (GLHT)

   Model        d.f.           SSError             MS              F test
   Reduced      dfReduced      SSErrorReduced
   Full         dfFull         SSErrorFull
   Difference   dfDifference   SSErrorDifference   MSEDifference   F = MSDiff/MSError
   Full         dfFull         SSErrorFull         MSEFull

EXST 7025 – Biological Population Statistics II
Analysis of Covariance and Lack of Fit

/*********************************************************************/
/*** Geaghan MS Thesis Flier data                                  ***/
/*********************************************************************/
dm'log;clear;output;clear';
OPTIONS NOCENTER PS=512 LS=111 NODATE NONUMBER nolabel;

LIBNAME SASDATA 'C:\Geaghan\Current\EXST7025\Spring2008\Flier\';
NOTE: Libref SASDATA was successfully assigned as follows:
      Engine:        V9
      Physical Name: C:\Geaghan\Current\EXST7025\Spring2008\Flier
FILENAME INPUT 'C:\Geaghan\Current\EXST7025\Spring2008\Flier\ONEper.csv';
TITLE1 'Growth Curves fitted to Flier sunfish';

DATA Flier; infile input missover delimiter="," firstobs=2;
   input Mo Day Yr Ar St Md sx sex $ Sn Dayno Age Age_Days Lt Wt TSL
         Size1 Size2 Size3 Size4 Size5 Size6 Edge EdgeGrow K FNO;
   if lt gt 14.0 and wt lt 20 then wt = .;
   IF AGE EQ . THEN DELETE;
   IF LT EQ . THEN DELETE;
   IF LT EQ 0 THEN DELETE;
   IF TSL EQ . THEN DELETE;
   LLT= LOG(LT);
   *LTSL= LOG(TSL);
   AGE2 = AGE*AGE;
   AGE3 = AGE*AGE*AGE;
   IF AGE GT 0 THEN LAGE = LOG(AGE); ELSE LAGE=.;
   AGEAGAIN = AGE;
RUN;
NOTE: The infile INPUT is:
      File Name=C:\Geaghan\Current\EXST7025\Spring2008\Flier\ONEper.csv,
      RECFM=V,LRECL=256
NOTE: 664 records were read from the infile INPUT.
      The minimum record length was 66.
      The maximum record length was 114.
NOTE: The data set WORK.FLIER has 664 observations and 30 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

PROC REG DATA=FLIER;
   TITLE2 'Polynomial models fitted to raw data with REG';
   LINEAR:MODEL LT = AGE;
   QUAD:MODEL LT = AGE AGE2;
   CUBIC:MODEL LT = AGE AGE2 AGE3;
RUN;
NOTE: The PROCEDURE REG printed pages 1-3.
NOTE: PROCEDURE REG used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds

Growth Curves fitted to Flier sunfish
Polynomial models fitted to raw data with REG

The REG Procedure
Dependent Variable: Lt
Number of Observations Read    664
Number of Observations Used    664

Model: LINEAR
Analysis of Variance
   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               1       2410.84161    2410.84161    949.89   <.0001
   Error             662       1680.16610       2.53802
   Corrected Total   663       4091.00771

   Root MSE           1.59311    R-Square   0.5893
   Dependent Mean    12.07169    Adj R-Sq   0.5887
   Coeff Var         13.19712

Parameter Estimates
   Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
   Intercept    1              7.66681          0.15572     49.23     <.0001
   Age          1              1.74097          0.05649     30.82     <.0001

Model: QUAD
Analysis of Variance
   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               2       2505.32285    1252.66143    522.18   <.0001
   Error             661       1585.68486       2.39892
   Corrected Total   663       4091.00771

   Root MSE           1.54884    R-Square   0.6124
   Dependent Mean    12.07169    Adj R-Sq   0.6112
   Coeff Var         12.83039

Parameter Estimates
   Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
   Intercept    1              6.52891          0.23621     27.64     <.0001
   Age          1              2.83257          0.18240     15.53     <.0001
   AGE2         1             -0.21370          0.03405     -6.28     <.0001

Model: CUBIC
Analysis of Variance
   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               3       2505.72928     835.24309    347.74   <.0001
   Error             660       1585.27843       2.40194
   Corrected Total   663       4091.00771

   Root MSE           1.54982    R-Square   0.6125
   Dependent Mean    12.07169    Adj R-Sq   0.6107
   Coeff Var         12.83846

Parameter Estimates
   Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
   Intercept    1              6.60679          0.30283     21.82     <.0001
   Age          1              2.68838          0.39519      6.80     <.0001
   AGE2         1             -0.15030          0.15784     -0.95     0.3413
   AGE3         1             -0.00769          0.01869     -0.41     0.6809

PROC GLM DATA=FLIER;
   TITLE2 'Polynomial models fitted to raw data with GLM';
   MODEL LT = AGE AGE*AGE AGE*AGE*AGE;
RUN;
NOTE: The PROCEDURE GLM printed pages 4-5.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Polynomial models fitted to raw data with GLM

The GLM Procedure
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               3      2505.729282    835.243094    347.74   <.0001
   Error             660      1585.278429      2.401937
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.612497    12.83846   1.549818   12.07169

   Source         DF     Type I SS    Mean Square   F Value   Pr > F
   Age             1   2410.841612    2410.841612   1003.71   <.0001
   Age*Age         1     94.481241      94.481241     39.34   <.0001
   Age*Age*Age     1      0.406429       0.406429      0.17   0.6809

   Source         DF   Type III SS    Mean Square   F Value   Pr > F
   Age             1   111.1577870    111.1577870     46.28   <.0001
   Age*Age         1     2.1777521      2.1777521      0.91   0.3413
   Age*Age*Age     1     0.4064290      0.4064290      0.17   0.6809

   Parameter          Estimate   Standard Error   t Value   Pr > |t|
   Intercept       6.606789050       0.30283109     21.82     <.0001
   Age             2.688383688       0.39518651      6.80     <.0001
   Age*Age        -0.150297865       0.15784448     -0.95     0.3413
   Age*Age*Age    -0.007688027       0.01868975     -0.41     0.6809

proc sort data=flier; by sex;
run;
NOTE: There were 664 observations read from the data set WORK.FLIER.
NOTE: The data set WORK.FLIER has 664 observations and 30 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

proc means data=flier; by sex; var lt;
   TITLE2 'Means by sex';
run;
NOTE: There were 664 observations read from the data set WORK.FLIER.
NOTE: The PROCEDURE MEANS printed page 6.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

proc sort data=flier; by age; run;
NOTE: There were 664 observations read from the data set WORK.FLIER.
NOTE: The data set WORK.FLIER has 664 observations and 30 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

proc means data=flier noprint; by age; var lt;
   TITLE2 'Means by age';
   output out=means n=ltn mean=ltmean var=ltvar std=ltstd;
run;
NOTE: There were 664 observations read from the data set WORK.FLIER.
NOTE: The data set WORK.MEANS has 7 observations and 7 variables.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

proc print data=means; run;
NOTE: There were 7 observations read from the data set WORK.MEANS.
NOTE: The PROCEDURE PRINT printed page 7.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Means by sex
Analysis Variable : Lt

   sex     N         Mean      Std Dev     Minimum      Maximum
   F     340   12.0200000    2.2375213   5.3000000   18.0000000
   I      18   10.5111111    3.7463533   3.5000000   16.1000000
   M     306   12.2209150    2.6262522   5.4000000   18.5000000

Means by age

   Obs   Age   _TYPE_   _FREQ_   ltn    ltmean     ltvar     ltstd
     1     0        0       20    20    7.6550   7.55945   2.74944
     2     1        0       94    94    8.4649   1.88768   1.37393
     3     2        0      196   196   11.5704   2.37122   1.53988
     4     3        0      246   246   13.1809   2.29045   1.51342
     5     4        0       89    89   14.1596   1.43835   1.19931
     6     5        0       14    14   15.1571   1.09802   1.04787
     7     6        0        5     5   16.8200   2.13700   1.46185

PROC GLM DATA=means; weight ltn;
   TITLE2 'Polynomial models fitted to means with GLM';
   MODEL LTmean = AGE;
RUN;
NOTE: The PROCEDURE GLM printed pages 8-9.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.23 seconds
      cpu time            0.10 seconds

Linear model fitted to means with GLM

The GLM Procedure
Number of Observations Read    7
Number of Observations Used    7
Dependent Variable: ltmean
Weight: ltn

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               1      2410.841612   2410.841612     64.11   0.0005
   Error               5       188.037212     37.607442
   Corrected Total     6      2598.878824

   R-Square   Coeff Var   Root MSE   ltmean Mean
   0.927647    50.80061   6.132491      12.07169

   Source   DF     Type I SS   Mean Square   F Value   Pr > F
   Age       1   2410.841612   2410.841612     64.11   0.0005

   Source   DF   Type III SS   Mean Square   F Value   Pr > F
   Age       1   2410.841612   2410.841612     64.11   0.0005

   Parameter      Estimate   Standard Error   t Value   Pr > |t|
   Intercept   7.666813596       0.59942430     12.79     <.0001
   Age         1.740973674       0.21744267      8.01     0.0005

PROC glm DATA=FLIER;
   TITLE2 'ANOVA fitted to raw data with GLM';
   class age; MODEL LT = AGE;
RUN;
NOTE: The PROCEDURE GLM printed pages 10-11.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

ANOVA fitted to raw data with GLM

The GLM Procedure
Class Level Information
   Class   Levels   Values
   Age          7   0 1 2 3 4 5 6
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               6      2598.878824    433.146471    190.72   <.0001
   Error             657      1492.128887      2.271125
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.635266    12.48396   1.507025   12.07169

   Source   DF     Type I SS   Mean Square   F Value   Pr > F
   Age       6   2598.878824    433.146471    190.72   <.0001

   Source   DF   Type III SS   Mean Square   F Value   Pr > F
   Age       6   2598.878824    433.146471    190.72   <.0001

PROC glm DATA=FLIER;
   TITLE2 'Linear model with categorical age';
   class AGEAGAIN; MODEL LT = AGE AGEAGAIN;
RUN;
NOTE: The PROCEDURE GLM printed pages 12-13.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Linear model with categorical age

The GLM Procedure
Class Level Information
   Class      Levels   Values
   AGEAGAIN        7   0 1 2 3 4 5 6
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               6      2598.878824    433.146471    190.72   <.0001
   Error             657      1492.128887      2.271125
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.635266    12.48396   1.507025   12.07169

   Source     DF     Type I SS   Mean Square   F Value   Pr > F
   Age         1   2410.841612   2410.841612   1061.52   <.0001
   AGEAGAIN    5    188.037212     37.607442     16.56   <.0001

   Source     DF   Type III SS   Mean Square   F Value   Pr > F
   Age         0     0.0000000             .         .        .
   AGEAGAIN    5   188.0372117    37.6074423     16.56   <.0001

PROC glm DATA=FLIER;
   TITLE2 'Quadratic model with categorical age';
   class AGEAGAIN; MODEL LT = AGE age*age AGEAGAIN;
RUN;
NOTE: The PROCEDURE GLM printed pages 14-15.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Quadratic model with categorical age

The GLM Procedure
Class Level Information
   Class      Levels   Values
   AGEAGAIN        7   0 1 2 3 4 5 6
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               6      2598.878824    433.146471    190.72   <.0001
   Error             657      1492.128887      2.271125
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.635266    12.48396   1.507025   12.07169

   Source     DF     Type I SS   Mean Square   F Value   Pr > F
   Age         1   2410.841612   2410.841612   1061.52   <.0001
   Age*Age     1     94.481241     94.481241     41.60   <.0001
   AGEAGAIN    4     93.555971     23.388993     10.30   <.0001

   Source     DF   Type III SS   Mean Square   F Value   Pr > F
   Age         0    0.00000000             .         .        .
   Age*Age     0    0.00000000             .         .        .
   AGEAGAIN    4   93.55597075   23.38899269     10.30   <.0001

PROC glm DATA=FLIER;
   TITLE2 'Cubic model with categorical age';
   class AGEAGAIN; MODEL LT = AGE age*age age*age*age AGEAGAIN;
RUN;
NOTE: The PROCEDURE GLM printed pages 16-17.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Cubic model with categorical age

The GLM Procedure
Class Level Information
   Class      Levels   Values
   AGEAGAIN        7   0 1 2 3 4 5 6
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               6      2598.878824    433.146471    190.72   <.0001
   Error             657      1492.128887      2.271125
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.635266    12.48396   1.507025   12.07169

   Source        DF     Type I SS   Mean Square   F Value   Pr > F
   Age            1   2410.841612   2410.841612   1061.52   <.0001
   Age*Age        1     94.481241     94.481241     41.60   <.0001
   Age*Age*Age    1      0.406429      0.406429      0.18   0.6724
   AGEAGAIN       3     93.149542     31.049847     13.67   <.0001

   Source        DF   Type III SS   Mean Square   F Value   Pr > F
   Age            0    0.00000000             .         .        .
   Age*Age        0    0.00000000             .         .        .
   Age*Age*Age    0    0.00000000             .         .        .
   AGEAGAIN       3   93.14954178   31.04984726     13.67   <.0001

PROC glm DATA=FLIER;
   TITLE2 'Analysis of Variance model or regression?';
   MODEL LT = AGE age*age age*age*age age*age*age*age
              age*age*age*age*age age*age*age*age*age*age;
RUN;

Analysis of Variance model or regression?

The GLM Procedure
Number of Observations Read    664
Number of Observations Used    664
The GLM Procedure
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               6      2598.878824    433.146471    190.72   <.0001
   Error             657      1492.128887      2.271125
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.635266    12.48396   1.507025   12.07169

   Source                 DF     Type I SS   Mean Square   F Value   Pr > F
   Age                     1   2410.841612   2410.841612   1061.52   <.0001
   Age*Age                 1     94.481241     94.481241     41.60   <.0001
   Age*Age*Age             1      0.406429      0.406429      0.18   0.6724
   Age*Age*Age*Age         1     66.474931     66.474931     29.27   <.0001
   Age*Age*Age*Age*Age     1     23.647706     23.647706     10.41   0.0013
   Ag*Ag*Ag*Age*Age*Age    1      3.026904      3.026904      1.33   0.2487

   Source                 DF   Type III SS   Mean Square   F Value   Pr > F
   Age                     1   14.82670511   14.82670511      6.53   0.0108
   Age*Age                 1   19.52043024   19.52043024      8.60   0.0035
   Age*Age*Age             1   11.40856838   11.40856838      5.02   0.0253
   Age*Age*Age*Age         1    6.78882877    6.78882877      2.99   0.0843
   Age*Age*Age*Age*Age     1    4.34858568    4.34858568      1.91   0.1669
   Ag*Ag*Ag*Age*Age*Age    1    3.02690393    3.02690393      1.33   0.2487

   Parameter                  Estimate   Standard Error   t Value   Pr > |t|
   Intercept               7.655000000       0.33698106     22.72     <.0001
   Age                    -4.583897309       1.79404379     -2.56     0.0108
   Age*Age                 9.134809427       3.11584097      2.93     0.0035
   Age*Age*Age            -4.800279839       2.14176327     -2.24     0.0253
   Age*Age*Age*Age         1.198480149       0.69319257      1.73     0.0843
   Age*Age*Age*Age*Age    -0.146266417       0.10570392     -1.38     0.1669
   Ag*Ag*Ag*Age*Age*Age    0.007047605       0.00610468      1.15     0.2487

Various fitted models

   Model                        d.f. model    SSModel   d.f. Error    SSError     MSE   Model Change       R2
   Corrected total                                             663   4091.008
   Linear model - Raw data               1   2410.842          662   1680.166   2.538       2410.842   58.930
   Quadratic model - Raw data            2   2505.323          661   1585.685   2.399         94.481   61.240
   Cubic model - Raw data                3   2505.729          660   1585.278   2.402          0.406   61.250
   ANOVA on AGE                          6   2598.879          657   1492.129   2.271

Differences in fit compared to ANOVA

   Comparison                d.f.   SSError (difference)   F test of diff     P value
   Diff - Linear vs. ANOVA      5             188.037214      16.55895153   2.041E-15
   Diff - Quad vs. ANOVA        4              93.555974      10.29841917   4.299E-08
   Diff - Cubic vs. ANOVA       3              93.149544      13.67157376   1.166E-08

Linear model fitted to means (weighted)

   Source   d.f. model    SSModel   d.f. Error   SSError      MSE       R2
   Age               1   2410.842            5   188.037   37.607   92.765

Full polynomial model fitted to Raw data

   Source                                   DF   Type I SS   Mean Square    F Value   Pr > F
   Age                                       1    2410.842      2410.842   1061.520   <.0001
   Age2                                      1      94.481        94.481     41.600   <.0001
   Age3                                      1       0.406         0.406      0.180   0.6724
   Age4                                      1      66.475        66.475     29.270   <.0001
   Age5                                      1      23.648        23.648     10.410   0.0013
   Age6                                      1       3.027         3.027      1.330   0.2487
   Model Sum of Squares (sum of Type I)      6    2598.879

OPTIONS PS=52 LS=111;
NOTE: The PROCEDURE GLM printed pages 18-19.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

proc plot data=flier; plot lt*age; run;
NOTE: There were 664 observations read from the data set WORK.FLIER.
NOTE: The PROCEDURE PLOT printed page 20.
NOTE: PROCEDURE PLOT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

proc plot data=flier; plot wt*lt=sex; run;
NOTE: There were 664 observations read from the data set WORK.FLIER.
NOTE: The PROCEDURE PLOT printed page 21.
NOTE: PROCEDURE PLOT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

proc plot data=means; plot ltmean*age; run;
OPTIONS PS=512 LS=111;
NOTE: There were 7 observations read from the data set WORK.MEANS.
NOTE: The PROCEDURE PLOT printed page 22.
NOTE: PROCEDURE PLOT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

Plot of Lt*Age. Legend: A = 1 obs, B = 2 obs, etc.
[Line-printer scatter plot not recoverable from the extracted text. Vertical axis: Lt (2 to 20); horizontal axis: Age (0 to 6).]
NOTE: 25 obs hidden.

Plot of Wt*Lt. Symbol is value of sex.
[Line-printer scatter plot not recoverable from the extracted text. Vertical axis: Wt (0 to 140); horizontal axis: Lt (2 to 20).]
NOTE: 1 obs had missing values. 424 obs hidden.

Plot of ltmean*Age. Legend: A = 1 obs, B = 2 obs, etc.
[Line-printer scatter plot not recoverable from the extracted text. Vertical axis: ltmean (6 to 18); horizontal axis: Age (0 to 6).]

PROC GLM DATA=flier;
   TITLE2 'Analysis of Covariance with GLM - full model';
   TITLE3 'Model for testing hypotheses';
   class sex; MODEL LT = AGE AGE*AGE sex age*sex age*age*sex / solution;
RUN;
NOTE: The PROCEDURE GLM printed pages 23-24.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Analysis of Covariance with GLM - full model
Model for testing hypotheses

The GLM Procedure
Class Level Information
   Class   Levels   Values
   sex          3   F I M
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               8      2548.521903    318.565238    135.28   <.0001
   Error             655      1542.485808      2.354940
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.622957    12.71224   1.534581   12.07169

   Source        DF     Type I SS   Mean Square   F Value   Pr > F
   Age            1   2410.841612   2410.841612   1023.74   <.0001
   Age*Age        1     94.481241     94.481241     40.12   <.0001
   sex            2     35.999098     17.999549      7.64   0.0005
   Age*sex        2      2.129152      1.064576      0.45   0.6365
   Age*Age*sex    2      5.070800      2.535400      1.08   0.3413

   Source        DF   Type III SS   Mean Square   F Value   Pr > F
   Age            1   155.1938473   155.1938473     65.90   <.0001
   Age*Age        1    11.8008211    11.8008211      5.01   0.0255
   sex            2     5.0304150     2.5152075      1.07   0.3443
   Age*sex        2     3.5478316     1.7739158      0.75   0.4712
   Age*Age*sex    2     5.0708000     2.5354000      1.08   0.3413

   Parameter              Estimate       Standard Error   t Value   Pr > |t|
   Intercept           6.780247116 B         0.30337376     22.35     <.0001
   Age                 2.882413458 B         0.24302606     11.86     <.0001
   Age*Age            -0.230214010 B         0.04644858     -4.96     <.0001
   sex F              -0.742414051 B         0.50976581     -1.46     0.1458
   sex I              -0.151576193 B         0.92523435     -0.16     0.8699
   sex M               0.000000000 B          .                .         .
   Age*sex F           0.153920550 B         0.38662502      0.40     0.6907
   Age*sex I          -0.986079544 B         0.91452373     -1.08     0.2813
   Age*sex M           0.000000000 B          .                .         .
   Age*Age*sex F      -0.015313950 B         0.07094939     -0.22     0.8292
   Age*Age*sex I       0.264988628 B         0.18954686      1.40     0.1626
   Age*Age*sex M       0.000000000 B          .                .         .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.
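
The full-model output above can be tied back to the General Linear Hypothesis Test layout given earlier in these notes. The following is a worked sketch, using only numbers printed in this output, of a single test of whether the three sexes need their own linear and quadratic Age coefficients. Pooling the two interaction lines into one 4-d.f. test is an illustrative choice made here, not a step shown in the original notes.

```latex
% Sketch: pooled GLHT for the Age*sex and Age*Age*sex terms (Type I SS, full model).
% SS_Diff and df_Diff follow the "Difference" row of the GLHT table.
\begin{align*}
SS_{Diff} &= SS_{Age \times sex} + SS_{Age^2 \times sex}
           = 2.129152 + 5.070800 = 7.199952 \\
df_{Diff} &= 2 + 2 = 4 \\
MS_{Diff} &= \frac{7.199952}{4} = 1.799988 \\
F &= \frac{MS_{Diff}}{MSE_{Full}} = \frac{1.799988}{2.354940} \approx 0.76
   \quad \text{on } (4,\ 655) \text{ d.f.}
\end{align*}
```

The same 7.199952 is the difference between the Error sum of squares of the reduced model fitted next (1549.685760 on 659 d.f.) and that of this full model (1542.485808 on 655 d.f.). An F of about 0.76 is well below the 5% critical value of roughly 2.4 for (4, 655) d.f., so by this test the sex-specific slope and curvature terms do not improve the fit.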
PROC GLM DATA=flier;
   TITLE2 'Analysis of Covariance with GLM - reduced model';
   TITLE3 'Model for testing hypotheses';
   class sex; MODEL LT = AGE AGE*AGE sex / solution;
RUN;
NOTE: The PROCEDURE GLM printed pages 25-26.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Analysis of Covariance with GLM - reduced model
Model for testing hypotheses

The GLM Procedure
Class Level Information
   Class   Levels   Values
   sex          3   F I M
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               4      2541.321951    635.330488    270.17   <.0001
   Error             659      1549.685760      2.351572
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.621197    12.70314   1.533484   12.07169

   Source    DF     Type I SS   Mean Square   F Value   Pr > F
   Age        1   2410.841612   2410.841612   1025.20   <.0001
   Age*Age    1     94.481241     94.481241     40.18   <.0001
   sex        2     35.999098     17.999549      7.65   0.0005

   Source    DF   Type III SS   Mean Square   F Value   Pr > F
   Age        1   589.0744995   589.0744995    250.50   <.0001
   Age*Age    1   103.5703226   103.5703226     44.04   <.0001
   sex        2    35.9990980    17.9995490      7.65   0.0005

   Parameter       Estimate       Standard Error   t Value   Pr > |t|
   Intercept    6.701903631 B         0.23989494     27.94     <.0001
   Age          2.902228054           0.18336884     15.83     <.0001
   Age*Age     -0.226428768           0.03411875     -6.64     <.0001
   sex F       -0.463248933 B         0.12184297     -3.80     0.0002
   sex I       -0.563496760 B         0.37418652     -1.51     0.1326
   sex M        0.000000000 B          .                .         .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

PROC GLM DATA=flier;
   TITLE2 'Analysis of Covariance with GLM - full model';
   TITLE3 'Model for estimating parameters';
   class sex; MODEL LT = sex age*sex age*age*sex / solution noint;
RUN;
NOTE: Due to the presence of CLASS variables, an intercept is implicitly fitted. R-Square has been corrected for the mean.
NOTE: The PROCEDURE GLM printed pages 27-28.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

Analysis of Covariance with GLM - full model
Model for estimating parameters

The GLM Procedure
Class Level Information
   Class   Levels   Values
   sex          3   F I M
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source               DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model                 9       99310.3342    11034.4816   4685.67   <.0001
   Error               655        1542.4858        2.3549
   Uncorrected Total   664      100852.8200

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.622957    12.71224   1.534581   12.07169

   Source        DF     Type I SS   Mean Square   F Value   Pr > F
   sex            3   96813.37208   32271.12403   13703.6   <.0001
   Age*sex        3    2389.67234     796.55745    338.25   <.0001
   Age*Age*sex    3     107.28977      35.76326     15.19   <.0001

   Source        DF   Type III SS   Mean Square   F Value   Pr > F
   sex            3   1823.270098    607.756699    258.08   <.0001
   Age*sex        3    582.288146    194.096049     82.42   <.0001
   Age*Age*sex    3    107.289769     35.763256     15.19   <.0001

   Parameter            Estimate   Standard Error   t Value   Pr > |t|
   sex F             6.037833064       0.40966516     14.74     <.0001
   sex I             6.628670923       0.87408407      7.58     <.0001
   sex M             6.780247116       0.30337376     22.35     <.0001
   Age*sex F         3.036334008       0.30069460     10.10     <.0001
   Age*sex I         1.896333914       0.88164164      2.15     0.0318
   Age*sex M         2.882413458       0.24302606     11.86     <.0001
   Age*Age*sex F    -0.245527960       0.05363158     -4.58     <.0001
   Age*Age*sex I     0.034774618       0.18376763      0.19     0.8500
   Age*Age*sex M    -0.230214010       0.04644858     -4.96     <.0001

PROC GLM DATA=flier;
   TITLE2 'Analysis of Covariance with GLM - reduced model';
   TITLE3 'Model for estimating parameters';
   class sex; MODEL LT = age age*age sex / solution noint;
RUN;
NOTE: Due to the presence of CLASS variables, an intercept is implicitly fitted. R-Square has been corrected for the mean.
NOTE: The PROCEDURE GLM printed pages 29-30.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Analysis of Covariance with GLM - reduced model
Model for estimating parameters

The GLM Procedure
Class Level Information
   Class   Levels   Values
   sex          3   F I M
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source               DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model                 5       99303.1342    19860.6268   8445.68   <.0001
   Error               659        1549.6858        2.3516
   Uncorrected Total   664      100852.8200

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.621197    12.70314   1.533484   12.07169

   Source    DF     Type I SS   Mean Square   F Value   Pr > F
   Age        1   93020.39061   93020.39061   39556.7   <.0001
   Age*Age    1    4414.02262    4414.02262   1877.05   <.0001
   sex        3    1868.72101     622.90700    264.89   <.0001

   Source    DF   Type III SS   Mean Square   F Value   Pr > F
   Age        1    589.074500    589.074500    250.50   <.0001
   Age*Age    1    103.570323    103.570323     44.04   <.0001
   sex        3   1868.721008    622.907003    264.89   <.0001

   Parameter       Estimate   Standard Error   t Value   Pr > |t|
   Age          2.902228054       0.18336884     15.83     <.0001
   Age*Age     -0.226428768       0.03411875     -6.64     <.0001
   sex F        6.238654699       0.25229307     24.73     <.0001
   sex I        6.138406871       0.40509883     15.15     <.0001
   sex M        6.701903631       0.23989494     27.94     <.0001

PROC GLM DATA=flier;
   TITLE2 'Analysis of Covariance with GLM - modified model';
   TITLE3 'Model for testing hypotheses';
   class sex; MODEL LT = AGE AGE*AGE age*sex age*age*sex sex / solution;
RUN;
NOTE: The PROCEDURE GLM printed pages 31-32.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

Analysis of Covariance with GLM - modified model
Model for testing hypotheses

The GLM Procedure
Class Level Information
   Class   Levels   Values
   sex          3   F I M
Number of Observations Read    664
Number of Observations Used    664
Dependent Variable: Lt

   Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
   Model               8      2548.521903    318.565238    135.28   <.0001
   Error             655      1542.485808      2.354940
   Corrected Total   663      4091.007711

   R-Square   Coeff Var   Root MSE    Lt Mean
   0.622957    12.71224   1.534581   12.07169

   Source        DF     Type I SS   Mean Square   F Value   Pr > F
   Age            1   2410.841612   2410.841612   1023.74   <.0001
   Age*Age        1     94.481241     94.481241     40.12   <.0001
   Age*sex        2     25.139172     12.569586      5.34   0.0050
   Age*Age*sex    2     13.029463      6.514731      2.77   0.0636
   sex            2      5.030415      2.515207      1.07   0.3443

   Source        DF   Type III SS   Mean Square   F Value   Pr > F
   Age            1   155.1938473   155.1938473     65.90   <.0001
   Age*Age        1    11.8008211    11.8008211      5.01   0.0255
   Age*sex        2     3.5478316     1.7739158      0.75   0.4712
   Age*Age*sex    2     5.0708000     2.5354000      1.08   0.3413
   sex            2     5.0304150     2.5152075      1.07   0.3443

   Parameter              Estimate       Standard Error   t Value   Pr > |t|
   Intercept           6.780247116 B         0.30337376     22.35     <.0001
   Age                 2.882413458 B         0.24302606     11.86     <.0001
   Age*Age            -0.230214010 B         0.04644858     -4.96     <.0001
   Age*sex F           0.153920550 B         0.38662502      0.40     0.6907
   Age*sex I          -0.986079544 B         0.91452373     -1.08     0.2813
   Age*sex M           0.000000000 B          .                .         .
   Age*Age*sex F      -0.015313950 B         0.07094939     -0.22     0.8292
   Age*Age*sex I       0.264988628 B         0.18954686      1.40     0.1626
   Age*Age*sex M       0.000000000 B          .                .         .
   sex F              -0.742414051 B         0.50976581     -1.46     0.1458
   sex I              -0.151576193 B         0.92523435     -0.16     0.8699
   sex M               0.000000000 B          .                .         .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations.
Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

PROC GLM DATA=flier;
   TITLE2 'Analysis of Covariance with GLM - reduced modified model';
   TITLE3 'Model for testing hypotheses';
   class sex;
   MODEL LT = AGE AGE*AGE age*sex / solution;
RUN;

NOTE: The PROCEDURE GLM printed pages 33-34.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

Growth Curves fitted to Flier sunfish
Analysis of Covariance with GLM - reduced modified model
Model for testing hypotheses

The GLM Procedure

Class Level Information
Class    Levels    Values
sex           3    F I M

Number of Observations Read    664
Number of Observations Used    664

Dependent Variable: Lt

Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                4       2530.462025     632.615506     267.15    <.0001
Error              659       1560.545686       2.368051
Corrected Total    663       4091.007711

R-Square    Coeff Var    Root MSE     Lt Mean
0.618542     12.74758    1.538847    12.07169

Source           DF       Type I SS    Mean Square    F Value    Pr > F
Age               1     2410.841612    2410.841612    1018.07    <.0001
Age*Age           1       94.481241      94.481241      39.90    <.0001
Age*sex           2       25.139172      12.569586       5.31    0.0052

Source           DF     Type III SS    Mean Square    F Value    Pr > F
Age               1     533.0054424    533.0054424     225.08    <.0001
Age*Age           1     100.0852978    100.0852978      42.26    <.0001
Age*sex           2      25.1391722     12.5695861       5.31    0.0052

Parameter               Estimate      Standard Error    t Value    Pr > |t|
Intercept            6.509740037          0.23495652      27.71      <.0001
Age                  2.938183918 B        0.18425139      15.95      <.0001
Age*Age             -0.220593895          0.03393156      -6.50      <.0001
Age*sex F           -0.142547937 B        0.04395880      -3.24      0.0012
Age*sex I           -0.123281143 B        0.15648853      -0.79      0.4311
Age*sex M            0.000000000 B         .                .         .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations.
Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

PROC GLM DATA=flier;
   TITLE2 'Analysis of Covariance with GLM - reduced modified model';
   TITLE3 'Model for estimating parameters';
   class sex;
   MODEL LT = age*sex AGE*AGE / solution;
RUN;

NOTE: The PROCEDURE GLM printed pages 35-36.
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           0.94 seconds
      cpu time            0.57 seconds

Growth Curves fitted to Flier sunfish
Analysis of Covariance with GLM - reduced modified model
Model for estimating parameters

The GLM Procedure

Class Level Information
Class    Levels    Values
sex           3    F I M

Number of Observations Read    664
Number of Observations Used    664

Dependent Variable: Lt

Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                4       2530.462025     632.615506     267.15    <.0001
Error              659       1560.545686       2.368051
Corrected Total    663       4091.007711

R-Square    Coeff Var    Root MSE     Lt Mean
0.618542     12.74758    1.538847    12.07169

Source           DF       Type I SS    Mean Square    F Value    Pr > F
Age*sex           3     2430.376727     810.125576     342.11    <.0001
Age*Age           1      100.085298     100.085298      42.26    <.0001

Source           DF     Type III SS    Mean Square    F Value    Pr > F
Age*sex           3     603.6544427    201.2181476      84.97    <.0001
Age*Age           1     100.0852978    100.0852978      42.26    <.0001

Parameter               Estimate      Standard Error    t Value    Pr > |t|
Intercept            6.509740037          0.23495652      27.71      <.0001
Age*sex F            2.795635981          0.18160671      15.39      <.0001
Age*sex I            2.814902775          0.24133384      11.66      <.0001
Age*sex M            2.938183918          0.18425139      15.95      <.0001
Age*Age             -0.220593895          0.03393156      -6.50      <.0001

Growth Curves fitted to Flier sunfish
Separate models run in proc reg

sex=F

Analysis of Variance
Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                2        1015.14537      507.57269     250.79    <.0001
Error              337         682.05863        2.02391
Corrected Total    339        1697.20400

Root MSE           1.42264    R-Square    0.5981
Dependent Mean    12.02000    Adj R-Sq    0.5957
Coeff Var         11.83563

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1               6.03783           0.37978      15.90      <.0001
Age           1               3.03633           0.27876      10.89      <.0001
AGE2          1              -0.24553           0.04972      -4.94      <.0001

sex=I

Analysis of Variance
Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                2         139.25178       69.62589      10.51    0.0014
Error               15          99.34599        6.62307
Corrected Total     17         238.59778

Root MSE           2.57353    R-Square    0.5836
Dependent Mean    10.51111    Adj R-Sq    0.5281
Coeff Var         24.48392

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1               6.62867           1.46586       4.52      0.0004
Age           1               1.89633           1.47854       1.28      0.2191
AGE2          1               0.03477           0.30818       0.11      0.9117

sex=M

Analysis of Variance
Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                2        1342.56496      671.28248     267.25    <.0001
Error              303         761.08119        2.51182
Corrected Total    305        2103.64614

Root MSE           1.58487    R-Square    0.6382
Dependent Mean    12.22092    Adj R-Sq    0.6358
Coeff Var         12.96852

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1               6.78025           0.31332      21.64      <.0001
Age           1               2.88241           0.25099      11.48      <.0001
AGE2          1              -0.23021           0.04797      -4.80      <.0001

...
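A quick arithmetic check ties the separate PROC REG fits back to the full "modified model" from PROC GLM. With sex = M as the reference level, the Intercept, Age and Age*Age estimates in the full model should be the M curve, and adding the sex, Age*sex and Age*Age*sex offsets should give the F and I curves. The sketch below is only an illustration: the numbers are copied from the listings above, the dictionary and function names are made up for this example, and it assumes that AGE2 in the PROC REG output is Age squared.

```python
# Estimates copied from the full-model PROC GLM listing. sex = M is the
# reference level, so its offsets are printed as 0.
# All names here are illustrative and do not come from the course notes.
REFERENCE = {"Intercept": 6.780247116, "Age": 2.882413458, "Age*Age": -0.230214010}

# Offsets from the "sex", "Age*sex" and "Age*Age*sex" rows of the same listing.
OFFSETS = {
    "F": {"Intercept": -0.742414051, "Age": 0.153920550, "Age*Age": -0.015313950},
    "I": {"Intercept": -0.151576193, "Age": -0.986079544, "Age*Age": 0.264988628},
    "M": {"Intercept": 0.0, "Age": 0.0, "Age*Age": 0.0},
}

# Estimates copied from the separate PROC REG fits (printed to 5 decimals).
# Assumption: the variable AGE2 is Age squared, so it is labelled "Age*Age".
PROC_REG = {
    "F": {"Intercept": 6.03783, "Age": 3.03633, "Age*Age": -0.24553},
    "I": {"Intercept": 6.62867, "Age": 1.89633, "Age*Age": 0.03477},
    "M": {"Intercept": 6.78025, "Age": 2.88241, "Age*Age": -0.23021},
}


def per_sex_coefficients(sex):
    """Add the sex-specific offsets to the reference-level (M) estimates."""
    return {term: REFERENCE[term] + OFFSETS[sex][term] for term in REFERENCE}


def max_abs_difference(sex):
    """Largest gap between the rebuilt GLM curve and the PROC REG curve."""
    coefs = per_sex_coefficients(sex)
    return max(abs(coefs[term] - PROC_REG[sex][term]) for term in coefs)


for sex in ("F", "I", "M"):
    rebuilt = {term: round(value, 5) for term, value in per_sex_coefficients(sex).items()}
    print(sex, rebuilt, "max difference:", max_abs_difference(sex))
```

Every difference falls within the rounding of the five-decimal PROC REG listing, so the full ANCOVA model and the three separate quadratic regressions describe the same fitted curves. The standard errors do differ, because PROC GLM pools the error over all 664 observations while each PROC REG run uses only its own group's residuals.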