04 Morphometrics

EXST025 - Biological Population Statistics

APPLICATION OF MODELS TO MERISTIC (or MORPHOMETRIC) RELATIONSHIPS

MERISTIC refers to the geometric relation between body parts.

Examples
- archaeologists - calculate the height and weight of dead and extinct animals from bones and partial skeletons (based on meristic relationships from living relatives)
- marine biologists - predict (after the fact) the size of sharks involved in attacks from the curvature of the jaw in bite marks
- taxonomists - categorize and identify species based on the relative size of, and measurements between, anatomical structures
- fisheries - many applications
  - conversions from standard or fork length to total length, width to length, or thoracic length to total length
  - size at previous ages can be evaluated from the relationship between scale size and fish length

FISHERIES APPLICATION - background
1) Assume a relationship exists between fish length and scale length: because the number of scales does not change as the fish grows, the scales must grow to cover the fish.
2) Assume a KEY SCALE will have a consistent relationship to length (PROFFIT, 1950 states that different scales have different relationships). Then a relationship can be fitted for a particular scale.
   [Figure: Sunfish key scale areas]
3) Assume a scale annulus represents a past scale length at the time of annulus formation, and that we know the age at annulus formation. Then, using the fitted scale relationship, we can calculate the length of the fish at the time of annulus formation on the scale.
   -- the focus is an indeterminate area with no circuli, so the intercept may actually depend more on where the person reading the scale starts his/her measurement
   [Figure: Scale showing focus]

MODELS USED TO FIT LENGTH - SCALE RELATIONSHIP
1) DIRECT PROPORTION         $L_t = b_1 \, TSL$
2) SIMPLE LINEAR             $L_t = b_0 + b_1 \, TSL$
3) LOG - LOG                 $L_t = b_0 \, TSL^{b_1}$
4) SECOND ORDER POLYNOMIAL   $L_t = b_0 + b_1 \, TSL + b_2 \, TSL^2$
5) THIRD ORDER POLYNOMIAL    $L_t = b_0 + b_1 \, TSL + b_2 \, TSL^2 + b_3 \, TSL^3$

MODEL DERIVATION
1) DIRECT PROPORTION - this method will work with only 1 fish

   $\frac{L_{tt}}{S_t} = \frac{L_t}{TSL}$

   where
   $L_{tt}$ = length at some previous time (t) -- UNKNOWN
   $S_t$ = scale length at some previous time (t), determined by a scale annulus measurement
   $L_t$ = length at capture
   $TSL$ = total scale length at capture

   For a given fish, $L_t / TSL$ is a constant (call it $b_1$), so let $L_{tt} / S_t = b_1$ represent the constant ratio. Then

   $L_{tt} = b_1 S_t$

   is a direct proportion model. It describes the relationship between the fish's length and the length of the scale at any time (for a particular fish), and it is also the form of the predictive equation.

EXAMPLE
For a fish with $TSL = 10$ mm and $L_t = 200$ mm, $L_t / TSL = 20$. If $S_1 = 4$, $S_2 = 7$ and $S_3 = 9$, then
   $L_{t1} = 4 \times 20 = 80$
   $L_{t2} = 7 \times 20 = 140$
   $L_{t3} = 9 \times 20 = 180$
The constant proportion (ratio) $b_1$ is given by $L_t / TSL = b_1$.

With many observations of $L_{ti}$ and $TSL_i$ for many fish, preferably over a wide range of sizes, we may want an average of $L_{tt} / S_t = b_1$. One way is to fit

   $L_{ti} = b_1 \, TSL_i$

as a regression forced through the origin. This fits the relationship and estimates an average $b_1$ over all fish.
- the relationship may be fitted in any one of a number of ways; we have discussed 4 methods of fitting a ratio
- for the moment assume linear regression forced through the origin will be adequate (assume homogeneous variance, but examine residuals)
- hardest assumption -- TSL measured without error

GRAPHIC EXAMPLE OF BACK CALCULATION
[Figure: Relationship between total scale length and fish length at the time of capture]

TSL is the scale length at capture, but if we have a good range of sizes, and a good description of the relationship, we can use the relationship (HOPEFULLY) to describe the SCALE SIZE
AT SOME PREVIOUS TIME.

POTENTIAL MODELS TO DESCRIBE THE RELATIONSHIP (from literature)

1) DIRECT PROPORTION MODEL -- as a regression
The line described by this relationship (a) passes through the origin, (b) has no curvature.
This is the simplest regression model which may be adequate to describe the relationship.

2) SIMPLE LINEAR REGRESSION -- derivation
a) tiny fish do not have scales
   1) for FLIER SUNFISH, squamation starts at between 16 and 17 mm of length in one area and is completed in the head area at 32 mm length
   2) suppose direct proportion is correct, but that we should add a constant since squamation does not start at size zero
   In fact, the "intercept" may or may not be the size at squamation:
   - when scale growth starts it proceeds rapidly until the fish is covered
   - there are often no fish of very small size (near size at squamation) in the sample
c) this model has an intercept, so it is a simple linear regression model

   $L_t = b_0 + b_1 (TSL)$

e) Model derivation
   From our earlier derivation $L_t = b_1 \, TSL$, but an intercept correction is needed (call it $b_0$, and let's consider that it represents the size at squamation, or the size at which scales first form in the key scale area).

   Starting from $\frac{L_{tt}}{S_t} = \frac{L_t}{TSL}$, subtract $b_0$ from all length measurements:

   $\frac{L_{tt} - b_0}{S_t} = \frac{L_t - b_0}{TSL}$

   $L_{tt} - b_0 = \frac{L_t - b_0}{TSL} \, S_t$

   where the term $\frac{L_t - b_0}{TSL}$ is a constant, $b_1$, for a given fish. The estimated average over all fish, using regression, is

   $L_{tt} - b_0 = b_1 S_t$
   $L_{tt} = b_0 + b_1 S_t$

   where $b_0$
may or may not be the size at squamation.

For FLIER SUNFISH, Conley and Witt (1966) said squamation starts at 16-17 mm in a certain site. I used that site and got 16.45 mm as the intercept. Draw your own conclusions.

OTHER MODELS

3) POWER MODEL - essentially a direct proportion model with an allowance for simple curvature (still passes through the origin)
a) variation of the direct proportion model

   $L_{tt} = b_0 S_t^{b_1}$, so $\frac{L_{tt}}{S_t^{b_1}} = b_0$

   which also represents a constant ratio of sorts, but with an adjustment for curvature (the power term $b_1$). Another representation is

   $\log\left(L_{tt} / S_t^{b_1}\right) = \log(b_0)$, so $\log(L_{tt}) - b_1 \log(S_t) = \log(b_0)$

b) best fit for FLIER sunfish
c) if $b_1 = 1$, the equation reduces to a DIRECT PROPORTION model, and the line is not curved
d) other statistical advantages -- "cures" non-homogeneous variance - THIS MAY BE USEFUL EVEN IF THE LINE IS NOT CURVED!

The line described by this relationship (a) passes through the origin, (b) has curvature.
- there is "no intercept": the algebraic model has no intercept term, but the line does cross the Y axis at the origin
- $b_0$ is the angle of the curve (slope)
- $b_1$ is the rate of curvature, though the line can be straight (no curve) if $b_1 = 1$

The LOG - LOG model is given by
   $L_t = b_0 \, TSL^{b_1}$ when fitted on total measurements
   $L_{tt} = b_0 S_t^{b_1}$ for back-calculation
a) it can be easily linearized by calculating $\log(L_{tt}) = \log(b_0) + b_1 \log(S_t)$

4) SECOND ORDER POLYNOMIAL

   $L_{tt} = b_0 + b_1 S_t + b_2 S_t^2$

a) no good theoretical derivation for this model
b) may fit well, but with extra degrees of freedom
The line described by this relationship (a) does not necessarily pass through the origin, (b) has curvature. The direction of curvature depends on the sign of $b_2$.

5) THIRD ORDER POLYNOMIAL

   $L_{tt} = b_0 + b_1 S_t + b_2 S_t^2 + b_3 S_t^3$

The line described by this relationship (a) does not necessarily pass through the origin, (b) has curvature and can curve in the other direction (has up to 2 curvatures, 1 inflection).

POLYNOMIALS
a) regular regression assumptions apply; since polynomials should not be extended outside their range, small fish are needed in the sample to get an adequate fit of the lower range for prediction
b) the intercept and each inflection (level of curvature) employ a new degree of freedom, so each is testable
c) the model may (hopefully will) reduce to a 2nd order polynomial
d) Polynomials in general
   1) each additional term ALLOWS for a new inflection, but
   2) additional terms may also fit changes in the rate of curvature instead of an inflection (can tell somewhat by the sign on $b_i$)
   3) WORTHLESS outside the range of OBSERVED values (DANGEROUS)
   4) useful curve fitting technique, but often no "biological" interpretation to the regression coefficients
   5) the regular assumptions for multiple regression apply
   6) since only one $X_i$ variable is used, this multiple regression can employ a residual plot of $e_i$ against $X_i$ (instead of $e_i$ against $\hat{Y}_i$, as used for other multiple regressions)

STATISTICAL CONSIDERATIONS -- in finding the "best" model
1) NOTE THAT THE FIRST CONSIDERATIONS SHOULD BE BIOLOGICAL, NOT STATISTICAL - if you have a reason to expect a particular model then that model should be fitted. If it does not provide the "highest $R^2$" this is not necessarily important.
a) does it fit biological theory better than other models?
b) does it meet the statistical assumptions where the other models do not?
c) is the model preferable for any biological reason, and does it provide nearly as good a fit as the other models? If so, USE IT!
   - remember, 1 or 2% off in the $R^2$ is not particularly important
d) examine the residual plots for the model fitted; this is at least as important as the $R^2$

2) DIRECT PROPORTION model - simplest model
a) only model possible for a single fish unless other information is available (e.g. the value of the intercept)
b) cannot test for curvature or intercept, so arrive at this model by simplification of larger models (i.e. the larger models may "reduce" to this model)

3) SIMPLE LINEAR REGRESSION model
a) the intercept value can be tested ($H_0: \beta_0 = 0$), and failure to reject implies the DIRECT PROPORTION model
b) both of these models require the standard set of assumptions about the regression line, including homogeneous variance (additive error)

4) LOG-LOG REGRESSION model
a) the slope can be tested ($H_0: \beta_1 = 1$), and failure to reject implies the DIRECT PROPORTION model - this is a test for curvature
b) non-homogeneous variance in the original data is implied by this model

5) How about SIMPLE LINEAR REGRESSION versus POWER REGRESSION?
a) no good clean test
b) check the $R^2$ value
c) check for homogeneous variance
d) hopefully only 1 will be better than direct proportion

6) If both an intercept and curvature are indicated: POLYNOMIAL model (either quadratic or cubic)
a) the intercept, linear trend and curvature can all be tested individually in this model
b) the curve is not as "pretty" biologically as the simpler curve in the POWER model
c) the curve should not be EXTENDED OUTSIDE the RANGE of the data

7) MODELS (counting POLYNOMIALS as one model)

   MODEL                      CURVATURE   ORIGIN
   DIRECT PROPORTION          no curve    passes through origin
   SIMPLE LINEAR REGRESSION   no curve    does not pass through origin (has intercept)
   LOG - LOG REGRESSION       curve       passes through origin
   POLYNOMIAL REGRESSION      curve       does not pass through origin (has intercept)

These basic models will fit a wide range of experimental situations.
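The comparison among the first three models can be sketched numerically. The following is a minimal illustration in Python rather than SAS, using only the standard library; the function names and the sample TSL and length values are made up for illustration and are not the Flier data. $R^2$ is computed on the original length scale for every model so the values are comparable; the polynomial models are omitted to keep the sketch short.

```python
import math

def fit_through_origin(x, y):
    """Slope b1 of y = b1*x, least squares forced through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def fit_line(x, y):
    """Intercept b0 and slope b1 of y = b0 + b1*x by ordinary least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def r_squared(y, pred):
    """1 - SSE/SST, with SST taken about the mean of y."""
    my = sum(y) / len(y)
    sse = sum((a - p) ** 2 for a, p in zip(y, pred))
    sst = sum((a - my) ** 2 for a in y)
    return 1.0 - sse / sst

def compare_models(tsl, lt):
    """Fit direct proportion, simple linear and log-log models to (TSL, Lt)."""
    b_dp = fit_through_origin(tsl, lt)
    b0, b1 = fit_line(tsl, lt)
    # Log-log model: fit the linearized form, then back-transform the intercept
    log_a, p = fit_line([math.log(v) for v in tsl], [math.log(v) for v in lt])
    a = math.exp(log_a)
    return {
        "direct proportion": ((b_dp,), r_squared(lt, [b_dp * v for v in tsl])),
        "simple linear": ((b0, b1), r_squared(lt, [b0 + b1 * v for v in tsl])),
        "log-log": ((a, p), r_squared(lt, [a * v ** p for v in tsl])),
    }

# Made-up scale lengths and fish lengths, for illustration only
tsl = [60.0, 110.0, 170.0, 240.0, 310.0, 380.0]
lt = [4.1, 6.3, 8.9, 11.6, 14.8, 17.2]
for name, (coef, r2) in compare_models(tsl, lt).items():
    print(name, [round(c, 4) for c in coef], round(r2, 4))
```

As the notes stress, a slightly higher $R^2$ is not by itself a reason to prefer a model; the residual plots and the biological argument come first.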
OTHER CONSIDERATIONS (MERISTIC RELATIONSHIPS)
a) The independent variable ($X_i$) is probably not measured without error
   - if we wish to test, for instance, $b_1 = 1$ for the POWER MODEL, we probably have an UNDERESTIMATE of $b_1$ (Ricker 1973, Linear regression in fishery research, J. Fish. Res. Board Can., discusses situations and solutions)
b) Recent authors (Carlander, 1981, Fisheries, Am. Fish. Soc.) caution against the use of regression, which describes the relationship for the average fish
   - CARLANDER suggests doing the calculations on a fish by fish basis, in the form of a corrected scale length entered into the regression
   - FIRST FIT

     $L_{tt} = \frac{L_T}{TSL} \, S_t = b_1 S_t$

     or

     $L_{tt} = b_0 + \frac{L_T - b_0}{TSL} \, S_t = b_0 + b_1 S_t$

     where $b_0$ may come from regression or prior knowledge
   - use regression to find the best model (DP or SLR); THEN, for the back-calculations, instead of entering $S_t$ into the equation, enter $f \cdot S_t$, where

     $f = \frac{\text{observed } TSL}{\text{estimated } TSL}$

   - this adjusts the TSL up or down in proportion to the amount observed at the time of capture, presuming that fish which are large or small relative to the average have always been so (throughout life), or at least were so at the time of circuli formation (but what about normal seasonal or sexual variation?)
   - this adjustment is proportional to the residual
   - [Figure: schematic of the adjustment suggested by Carlander; axes are Scale Length (marked at Age 1, Age 2, Age 3 and capture) and Fish Length]
   - note that if the adjustment "f" is 1.1 (or 110%), then adjusting each scale by 1.1 assumes that the proportion is, and has always been, 110% for that fish

Consider: when can we average, and when should we maintain individual fish relationships?
1) Does a single fish maintain a set growth rate relative to other fish throughout his life (for genetic reasons or other)?
2) Or does a fish vary his growth rate from year to year (or month to month) due to changing habitat, breeding condition, compensation or other?
The truth is probably somewhere between the two possibilities (a combination of both).

We must assume additive error for the linearized model (and therefore multiplicative error for the original data). Jerrold Zar cautions against automatically accepting this (Bioscience 18: 12, 1968), e.g.

   $LT_i = b_0 \, TSL_i^{b_1} \, e_i$
   $\log(LT_i) = \log(b_0) + b_1 \log(TSL_i) + \log(e_i)$

This is a fix-up for non-homogeneous variance in the original data (desirable or not?).

d) $LT = b_0 \, TSL^{b_1}$; if $b_1 = 1$, then $L_t = b_0 \, TSL$, which is a direct proportion. This is easily tested with a t-test of $H_0: b_1 = 1$:

   $t = \frac{b_1 - 1}{S_{b_1}}$

SAS program for the models discussed
1) under the appropriate DATA statement (as the data are INPUT, or in a later DATA step), to get natural logarithms

       INPUT ... LT TSL ... ;
       LLT = LOG(LT);
       LTSL = LOG(TSL);

   to get power terms for polynomials we can use

       TSL2 = TSL * TSL;
       TSL3 = TSL * TSL2;

   though this is not necessary (see below)
   to get logarithms with base 10, change LOG to LOG10, e.g.

       LLT = LOG10(LT);

2) to run the POWER model use

       PROC GLM; MODEL LLT = LTSL;

3) to obtain polynomial models use

       PROC GLM; MODEL LT = TSL TSL*TSL;

ALLOMETRY - refers to the relative rates of growth of body parts

e.g. $Width = \beta_1 \times Length$

If Width is always 20% of Length, then the equation is Width = 0.20 * Length. But what if the SHAPE of the fish changes over time as it matures? e.g.
   Width = 0.18 * Length for small fish
   Width = 0.20 * Length for medium fish
   Width = 0.22 * Length for larger fish
so there is a gradual change in body proportions.
[Figure: Width plotted against Length, with the ratio rising from 0.18 through 0.20 to 0.22]

LENGTH - WEIGHT RELATIONSHIP
a) always use the model

   $W_i = \beta_0 L_i^{\beta_1} \varepsilon_i$
   $\log(W_i) = \log(\beta_0) + \beta_1 \log(L_i) + \log(\varepsilon_i)$

b) in SAS

       INPUT ... LT WT ...;
       LWT = LOG(WT);
       LLT = LOG(LT);
       PROC GLM; MODEL LWT = LLT;

c) statistically
   1) all assumptions apply except
      (a) multiplicative (non-homogeneous) error is implied for the raw data because of the original model
      (b) Ricker (1973) discusses the fact that length is probably not measured without error

d) INTERPRETATION OF COEFFICIENTS -- statistically we know what $\beta_0$ and $\beta_1$ are; what do they mean biologically?
   1) the relation between length and weight relates mm to gms
      - we need an adjustment of some sort (done by the units of the regression coefficient)
      - except that there is no good biological interpretation of gms per mm
   2) however, we can fit a very nice relationship between grams and centimeters if we know the specific gravity, e.g. for (pure) water 1 cm$^3$ = 1 gm
   3) so we may expect $W = \beta_0 L^3$, where $\beta_0$ is the specific gravity, except that a fish is not shaped like a cube
   4) imagine we are going to carve a fish from a cube such that the side of the cube equals the length of the fish, e.g.
      small cube -> small fish
      large cube -> large fish
   5) if the fish keeps exactly the same shape all his/her life, then he/she will occupy a certain fraction of the cube, e.g. 0.05 or 5%
      - this varies with fish, usually approximately 1% to 3%; less for eel, more for boxfish
   6) so our model is

      Weight = (fraction of cube) * (specific gravity) * Length$^3$

      where (fraction of cube) * (sp. gr.) = $b_0$
   7) Is the slope always "3"?
      a) maybe for fish, but ...
         - grass keeps the same width and cell thickness, so weight depends only on length and the "slope" = 1 (sausage): $W = \beta_0 L^1$
      b) a leaf maintains the same cell thickness but increases in length and width, so we expect the "slope" = 2 (bolts of cloth, where instead of $L^2$ it is length * width): $W = \beta_0 L^2$
   8) Some things do not grow in 3 dimensions
      - most published values of "b" for fish are near 3, but some are as high as 6 (e.g. post larval menhaden). Why?
      - metamorphosis of post larval menhaden: with little or no change in length, a fish becomes more full bodied
      - the fish grows in 1 or 2 dimensions, and we are measuring the WRONG DIMENSION, so any value of $\beta_1$ is possible
      - changes from $\beta_1 = 3$ can also occur with changes in robustness (sex differences, health, etc.), since these often occur without a change in length
   9) the use of the Weight - Length relationship is not restricted to fish
      - any organism may follow a similar pattern, except those which undergo metamorphosis
      - this is called allometric growth
      - little elephants are recognizable because they look like big elephants

CONDITION FACTORS, PONDERAL INDICES
A crude version is used in health studies; the version used in "fisheries" is more refined (and can be used for ANY ORGANISM).

The traditional form is

   $K = \frac{W}{L^3}$

which is adjusted to 1 for the metric system.
- using length in cm and weight in gms will produce a number which is approximately $1 \times 10^{?}$ (the exponent is not recoverable from the source)
- by convention we adjust to 1; if units other than cm and gms are used (e.g. mm and kg) then the adjustment differs

Essentially

   $K \approx b_0$ = (fraction of cube) * (specific gravity)

Assume specific gravity is approximately a constant at 1. Then $b_0$
is approximately the fraction of the cube occupied by the fish. It
- increases for fat fish
- decreases for skinny fish
Therefore, the value may be used as an indicator of
a) robustness, or condition (health)
b) sexual condition (females with eggs)

Traditionally, K is analyzed with ANOVA to test for differences between AREAS, SEASONS, SEXES, AGES, HABITATS etc.

HOWEVER, THERE IS A PROBLEM WITH USING K IN ANOVA
a) given $W_i = \beta_0 L_i^3 \varepsilon_i$, and so $\frac{W_i}{L_i^3} = \beta_0 \varepsilon_i$ (or $K_i \varepsilon_i$), we can see that the error is not additive, and additive error is an assumption in ANOVA
b) frequently (Carlander, 1980) K is correlated (significantly) with length, which implies $W = \beta_0 L^{\beta_1}$ with $\beta_1 \neq 3$, so that

   $\frac{W}{L^3} = \frac{\beta_0 L^{\beta_1}}{L^3} = \beta_0 L^{\beta_1 - 3}$

   so the estimate of $\beta_0$ is not free of LENGTH, as it should be

LE CREN (1951) suggests

   $C_i = \frac{W_i}{\beta_0 L_i^{\beta_1}} = \varepsilon_i$

but this centers on "1", and we know the mean of $\varepsilon_i \neq 1$. Actually,

   $\log(W_i) = \log(\beta_0) + \beta_1 \log(L_i) + \log(\varepsilon_i)$

Assume that $\log(\varepsilon_i)$ is distributed $NID(0, \sigma^2)$:

   $\log(\varepsilon_i) = \log(W_i) - [\log(\beta_0) + \beta_1 \log(L_i)]$

LeCren's $C_i = e^{\log(\varepsilon_i)} = \frac{W_i}{\beta_0 L_i^{\beta_1}}$ has a log normal distribution; $\log(\varepsilon_i)$ centers on 0, and $e^{\log(\varepsilon_i)}$ centers on 1.

LeCren recognized that the best solution is ANACOV instead of an ANOVA such as

   $C_i$ = SEX AREA;

Use

   $\log(W_i) - [\log(b_0) + b_1 \log(L_i)]$ = SEX AREA;

which has a log normal distribution of $\varepsilon_i$ (supposedly), normalized by taking logarithms. The model to use is then

   $\log(W_{ijk}) = \log(b_0) + b_1 \log(L_i) + SEX_j + AREA_k + \log(\varepsilon_{ijk})$

ADVANTAGES
a) $\log(\varepsilon_i)$ is still $NID(0, \sigma^2)$
b) $b_1$ is not fixed when the class variables are considered. This is advantageous because
   - IF $b_1$ is fitted before SEX, THE WRONG SLOPE IS CALCULATED
     [Figure: Y against X, single line fitted across both sexes]
   - BUT IF $b_1$ is fitted simultaneously, we get the CORRECT SLOPE and SEPARATE INTERCEPTS
     [Figure: Y against X, parallel lines with separate intercepts]

DISADVANTAGE - more complicated AND not a complete ANACOV
a) with this approach, shouldn't we really be testing SEPARATE SLOPES as well? It is simpler just to let $K = W / L^3$
b) we lose the "condition factor" concept; what do we point to as a K value?
   - remember, in ANACOV we get SEPARATE LEVELS with the SAS REPARAMETERIZATION, so $\log(b_0)$ will be the logarithm of K (the intercept) for the missing SEX and/or AREA; but without a common b, are the intercepts meaningless?
   - the "regression coefficients" for the categorical variables are the differences between $\log(b_0)$ for the other categories
c) REMEMBER THE SCALE LENGTH - BODY LENGTH RELATIONSHIP
   - since fish can enlarge scales and reabsorb scales on the edges, that which affects robustness will affect (I BELIEVE) the back calculation formulas
   - this is why I am not so sure that averaging across all fish, particularly when captured at different times of the year and in different stages of sexual maturity, is necessarily a bad idea (can we assume a fish will have consistently large or small scales, given their ability to adjust scales to robustness?)
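The length dependence of K described above can be checked with a small numerical sketch. The Python below is illustrative only: the function names are hypothetical, the fish are made-up values that follow $W = 0.03 L^{2.8}$ exactly (so $\beta_1 \neq 3$ by construction), and no unit rescaling of K is applied.

```python
import math

def condition_factor(weight, length):
    """Traditional K = W / L^3, with no unit rescaling applied."""
    return weight / length ** 3

def fit_power(lengths, weights):
    """Fit log(W) = log(b0) + b1*log(L) by least squares; return (b0, b1)."""
    x = [math.log(v) for v in lengths]
    y = [math.log(v) for v in weights]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return math.exp(my - b1 * mx), b1

def relative_condition(weight, length, b0, b1):
    """Le Cren's C = W / (b0 * L^b1), the back-transformed log-scale residual."""
    return weight / (b0 * length ** b1)

# Hypothetical fish that follow W = 0.03 * L^2.8 exactly (cm and g assumed)
lengths = [5.0, 8.0, 11.0, 14.0, 17.0]
weights = [0.03 * v ** 2.8 for v in lengths]
b0, b1 = fit_power(lengths, weights)
for length, weight in zip(lengths, weights):
    k = condition_factor(weight, length)
    c = relative_condition(weight, length, b0, b1)
    print(length, round(k, 5), round(c, 5))
```

Because the exponent is below 3 in this made-up example, K falls steadily as length increases even though every fish sits exactly on the fitted curve, while C stays at 1. That is the sense in which K is "not free of length".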
EXST 7025 Morphometrics

    /*********************************************************************/
    /*** Geaghan MS Thesis Flier data                                  ***/
    /*********************************************************************/
    dm'log;clear;output;clear';
    OPTIONS NOCENTER PS=52 LS=111 NODATE NONUMBER nolabel;
    ODS HTML style=minimal body='C:\Geaghan\Current\EXST7025\Spring2008\Flier\LtWt\Lt_Wt01.html' ;
NOTE: Writing HTML Body file: C:\Geaghan\Current\EXST7025\Spring2008\Flier\LtWt\Lt_Wt01.html

    ods graphics ON;
NOTE: ODS Statistical Graphics will require a SAS/GRAPH license when it is declared production.

    LIBNAME SASDATA 'C:\Geaghan\Current\EXST7025\Spring2008\Flier\';
NOTE: Libref SASDATA was successfully assigned as follows:
      Engine:        V9
      Physical Name: C:\Geaghan\Current\EXST7025\Spring2008\Flier
    FILENAME INPUT 'C:\Geaghan\Current\EXST7025\Spring2008\Flier\ONEper.csv';
    TITLE1 'North Carolina Flier sunfish data';

    DATA Flier;
      infile input missover delimiter="," firstobs=2;
      input Mo Day Yr Ar St Md sx sex $ Sn Dayno Age Age_Days Lt Wt TSL
            Size1 Size2 Size3 Size4 Size5 Size6 Edge EdgeGrow K FNO;
      if lt gt 14.0 and wt lt 20 then wt = .;
      IF LT LE 0 THEN DELETE;
      IF WT LE 0 THEN DELETE;
      IF TSL LE 0 THEN DELETE;
      LLT = LOG(LT);
      LWT = LOG(WT);
      LTSL = LOG(TSL);
      LT2 = LT*LT;
      LT3 = LT*LT*LT;
    RUN;
NOTE: The infile INPUT is:
      File Name=C:\Geaghan\Current\EXST7025\Spring2008\Flier\ONEper.csv,
      RECFM=V,LRECL=256
NOTE: 664 records were read from the infile INPUT.
      The minimum record length was 66.
      The maximum record length was 114.
NOTE: The data set WORK.FLIER has 658 observations and 30 variables.
NOTE: DATA statement used (Total process time):
      real time 0.03 seconds
      cpu time 0.03 seconds

    proc plot data=flier; plot Lt * TSL = age; run;
NOTE: There were 658 observations read from the data set WORK.FLIER.
NOTE: The PROCEDURE PLOT printed page 1.
NOTE: PROCEDURE PLOT used (Total process time):
      real time 0.14 seconds
      cpu time 0.03 seconds

North Carolina Flier sunfish data
[Plot of Lt*TSL. Symbol is value of Age. Vertical axis: Lt (2 to 20); horizontal axis: TSL (50 to 400). NOTE: 388 obs hidden.]

    PROC REG DATA=FLIER lineprinter;
      TITLE2 'Total Scale Length - Body Length relationship (SLR)';
      MODEL LT = TSL;
      plot residual.*lt;
    RUN;
WARNING: Statistical graphics displays created with ODS are experimental in this release.
NOTE: The PROCEDURE REG printed pages 2-3.
NOTE: PROCEDURE REG used (Total process time):
      real time 5.78 seconds
      cpu time 4.78 seconds

North Carolina Flier sunfish data
Total Scale Length - Body Length relationship (SLR)

The REG Procedure
Model: MODEL1
Dependent Variable: Lt

Number of Observations Read   658
Number of Observations Used   658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1       3691.38857    3691.38857   7619.64   <.0001
Error             656        317.80397       0.48446
Corrected Total   657       4009.19254

Root MSE           0.69603    R-Square   0.9207
Dependent Mean    12.08100    Adj R-Sq   0.9206
Coeff Var          5.76136

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              1.65021          0.12254     13.47     <.0001
TSL          1              0.04172       0.00047790     87.29     <.0001

    PROC REG DATA=FLIER lineprinter;
      TITLE2 'Total Scale Length - Body Length relationship (Power)';
      MODEL LLT = LTSL / clb;
      plot residual.*Llt;
      test LTSL = 1;
    RUN;
WARNING: Statistical graphics displays created with ODS are experimental in this release.
NOTE: The PROCEDURE REG printed pages 4-6.
NOTE: PROCEDURE REG used (Total process time):
      real time 2.62 seconds
      cpu time 1.59 seconds

North Carolina Flier sunfish data
Total Scale Length - Body Length relationship (Power)

The REG Procedure
Model: MODEL1
Dependent Variable: LLT

Number of Observations Read   658
Number of Observations Used   658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1         31.92806      31.92806   10490.0   <.0001
Error             656          1.99665       0.00304
Corrected Total   657         33.92471

Root MSE          0.05517    R-Square   0.9411
Dependent Mean    2.46783    Adj R-Sq   0.9411
Coeff Var         2.23554

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   95% Confidence Limits
Intercept    1             -2.16788          0.04531    -47.84     <.0001   -2.25685   -2.07890
LTSL         1              0.84423          0.00824    102.42     <.0001    0.82805    0.86042

Test 1 Results for Dependent Variable LLT
Source         DF   Mean Square   F Value   Pr > F
Numerator       1       1.08691    357.10   <.0001
Denominator   656       0.00304

    proc plot data=flier; plot wt * lt = age; run;
NOTE: There were 658 observations read from the data set WORK.FLIER.
NOTE: The PROCEDURE PLOT printed page 7.
NOTE: PROCEDURE PLOT used (Total process time):
      real time 0.12 seconds
      cpu time 0.03 seconds

[Plot of Wt*Lt. Symbol is value of Age. Vertical axis: Wt (0 to 140); horizontal axis: Lt (2 to 18). NOTE: 421 obs hidden.]
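The `test LTSL = 1` result for the scale-length power model can be cross-checked by hand with the t statistic from the notes, $t = (b_1 - 1)/S_{b_1}$. The Python below is a small sketch using the rounded estimate and standard error printed in the Parameter Estimates table for LTSL; the helper name is hypothetical. For a single-parameter hypothesis, $t^2$ should agree with the reported F value up to the rounding of the printed standard error.

```python
def t_for_slope(estimate, std_error, hypothesized):
    """t statistic for H0: slope = hypothesized."""
    return (estimate - hypothesized) / std_error

# Rounded values copied from the LTSL row of the Parameter Estimates table
t = t_for_slope(0.84423, 0.00824, 1.0)
f_reported = 357.10  # F value from "Test 1 Results for Dependent Variable LLT"

print("t =", round(t, 2))
print("t squared =", round(t * t, 1), "versus reported F =", f_reported)
```

The strongly negative t says the slope is well below 1, i.e. the length-scale relationship is curved and the power model does not reduce to direct proportion for these data.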
    PROC REG DATA=FLIER lineprinter;
      TITLE2 'Length - weight relationship (Cubic)';
      MODEL WT = LT LT2 LT3;
      plot residual.*lt;
    RUN;
NOTE: The PROCEDURE REG printed pages 8-9.
NOTE: PROCEDURE REG used (Total process time):
      real time 2.14 seconds
      cpu time 1.17 seconds

North Carolina Flier sunfish data
Length - weight relationship (Cubic)

The REG Procedure
Model: MODEL1
Dependent Variable: Wt

Number of Observations Read   658
Number of Observations Used   658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               3           377321        125774   3311.17   <.0001
Error             654            24842      37.98463
Corrected Total   657           402163

Root MSE           6.16317    R-Square   0.9382
Dependent Mean    45.49012    Adj R-Sq   0.9379
Coeff Var         13.54836

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1            -18.17873         11.33560     -1.60     0.1093
Lt           1              5.61158          3.12452      1.80     0.0730
LT2          1             -0.50812          0.27746     -1.83     0.0675
LT3          1              0.03697          0.00797      4.64     <.0001

    PROC REG DATA=FLIER lineprinter;
      TITLE2 'Length - weight relationship (Power)';
      MODEL LWT = LLT / CLB;
      test LLt = 3;
      plot residual.*llt;
      output out=next1 r=e p=yhat RSTUDENT=RSTUDENT;
    RUN;
WARNING: Statistical graphics displays created with ODS are experimental in this release.
NOTE: The data set WORK.NEXT1 has 658 observations and 33 variables.
NOTE: The PROCEDURE REG printed pages 10-12.
NOTE: PROCEDURE REG used (Total process time):
      real time 2.70 seconds
      cpu time 1.65 seconds

North Carolina Flier sunfish data
Length - weight relationship (Power)

The REG Procedure
Model: MODEL1
Dependent Variable: LWT

Number of Observations Read   658
Number of Observations Used   658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1        278.36136     278.36136   18699.3   <.0001
Error             656          9.76532       0.01489
Corrected Total   657        288.12668

Root MSE          0.12201    R-Square   0.9661
Dependent Mean    3.63850    Adj R-Sq   0.9661
Coeff Var         3.35328

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   95% Confidence Limits
Intercept    1             -3.43057          0.05191    -66.08     <.0001   -3.53251   -3.32864
LLT          1              2.86448          0.02095    136.75     <.0001    2.82335    2.90562

Test 1 Results for Dependent Variable LWT
Source         DF   Mean Square   F Value   Pr > F
Numerator       1       0.62301     41.85   <.0001
Denominator   656       0.01489

    data next1a; set next1;
      if rstudent gt -2 and rstudent lt 2 then delete;
    run;
NOTE: There were 658 observations read from the data set WORK.NEXT1.
NOTE: The data set WORK.NEXT1A has 21 observations and 33 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds

    proc sort data=next1a; by RSTUDENT; run;
NOTE: There were 21 observations read from the data set WORK.NEXT1A.
NOTE: The data set WORK.NEXT1A has 21 observations and 33 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

    proc print data=next1a; var fno lt wt e rstudent; run;
NOTE: There were 21 observations read from the data set WORK.NEXT1A.
NOTE: The PROCEDURE PRINT printed page 13.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.09 seconds
      cpu time            0.03 seconds

North Carolina Flier sunfish data
Length - weight relationship (Power)

Obs    FNO      Lt      Wt           e    RSTUDENT
  1     21     5.5     2.3    -0.61974    -5.22958
  2    115    13.5    33.2    -0.52224    -4.34270
  3    561    15.7    55.1    -0.44810    -3.71563
  4     56     8.0     8.1    -0.43409    -3.60093
  5    198    11.2    22.8    -0.36301    -2.99574
  6    196    11.1    22.6    -0.34613    -2.85471
  7     24     6.2     4.4    -0.31422    -2.60461
  8    164    10.3    18.9    -0.31066    -2.55956
  9    449    12.9    36.6    -0.29452    -2.42499
 10    378    12.2    31.9    -0.27215    -2.23909
 11    403    12.3    32.7    -0.27076    -2.22763
 12    199    11.2    25.2    -0.26293    -2.16278
 13    375    12.2    32.3    -0.25968    -2.13584
 14    231    11.9    30.1    -0.25891    -2.12937
 15    434    12.7    36.3    -0.25799    -2.12193
 16    536    14.6    54.2    -0.25649    -2.11078
 17    338    11.5    27.7    -0.24406    -2.00650
 . . .
 18    637    14.4    87.7     0.26426     2.17499
 19    229    11.8    50.0     0.27276     2.24417
 20    586    12.6    60.5     0.27548     2.26685
 21    139     9.8    33.7     0.41022     3.39337

How many studentized residuals are expected to exceed 2 in absolute value? About 5%, so for 658 observations that would be about 33 observations (actually, for a cutoff of 2, alpha is 0.0459, so we expect about 30). The listing above shows 21. Using a Bonferroni adjustment, the per-test probability is 0.05 / 658 = 0.0000759878, and the corresponding t value is 3.982. So for 658 tests we would reject when |t| > 3.982, with a 5% chance of error overall for all tests jointly.

OPTIONS PS=512 LS=111;
PROC REG DATA=FLIER;
   TITLE2 'Length - weight relationship (Power alternative)';
   MODEL LWT = LLT;
   restrict lLt = 3;
RUN;
NOTE: The PROCEDURE REG printed page 14.
NOTE: PROCEDURE REG used (Total process time):
      real time           2.67 seconds
      cpu time            1.40 seconds

North Carolina Flier sunfish data
Length - weight relationship (Power alternative)

The REG Procedure
Model: MODEL1
Dependent Variable: LWT
NOTE: Restrictions have been applied to parameter estimates.

Number of Observations Read    658
Number of Observations Used    658

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               0         277.73835              .          .         .
Error             657          10.38833        0.01581
Corrected Total   657         288.12668

Root MSE          0.12574    R-Square    0.9639
Dependent Mean    3.63850    Adj R-Sq    0.9639
Coeff Var         3.45596

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              -3.76500           0.00490    -768.05      <.0001
LLT           1               3.00000                 0      Infty      <.0001
RESTRICT     -1              -4.59734           0.73240      -6.28      <.0001*
* Probability computed using beta distribution.

Back-transformed intercept: exp(-3.765) = 0.023167612, or 2.3%

PROC MIXED DATA=FLIER; TITLE2 'Length - weight relationship (ANCOVA tests)';
   class ar st;
   MODEL LWT = LLT Ar LLt*Ar / solution cl;
   random st(ar) llt*St(Ar);
RUN;
NOTE: Convergence criteria met.
NOTE: The PROCEDURE MIXED printed page 15.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.21 seconds
      cpu time            0.09 seconds

North Carolina Flier sunfish data
Length - weight relationship (ANCOVA tests)

The Mixed Procedure

Model Information
Data Set                     WORK.FLIER
Dependent Variable           LWT
Covariance Structure         Variance Components
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information
Class    Levels    Values
Ar            3    1 2 3
St            7    1 2 3 4 5 6 7

Dimensions
Covariance Parameters      3
Columns in X               8
Columns in Z              28
Subjects                   1
Max Obs Per Subject      658

Number of Observations
Number of Observations Read        658
Number of Observations Used        658
Number of Observations Not Used      0

Iteration History
Iteration    Evaluations    -2 Res Log Like     Criterion
        0              1      -916.40341166
        1              3     -1083.64456426             .
        2              2     -1084.86561114    0.00024548
        3              2     -1085.19148930    0.00002307
        4              1     -1085.21960298    0.00000030
        5              1     -1085.21994476    0.00000000
Convergence criteria met.

Covariance Parameter Estimates
Cov Parm      Estimate
St(Ar)        0.002470
LLT*St(Ar)    0.000486
Residual       0.01031

Fit Statistics
-2 Res Log Likelihood       -1085.2
AIC (smaller is better)     -1079.2
AICC (smaller is better)    -1079.2
BIC (smaller is better)     -1077.3

Solution for Fixed Effects
Effect       Ar    Estimate    Standard Error    DF    t Value    Pr > |t|    Alpha      Lower      Upper
Intercept           -3.5798            0.2087    11     -17.15      <.0001     0.05    -4.0392    -3.1205
LLT                  2.9447           0.08489     9      34.69      <.0001     0.05     2.7527     3.1368
Ar           1      -0.2123            0.2417    11      -0.88      0.3984     0.05    -0.7442     0.3196
Ar           2       0.2551            0.2178    11       1.17      0.2662     0.05    -0.2243     0.7345
Ar           3            0                 .     .          .           .        .          .          .
LLT*Ar       1      0.04245           0.09758     9       0.44      0.6738     0.05    -0.1783     0.2632
LLT*Ar       2      -0.1222           0.08870     9      -1.38      0.2017     0.05    -0.3228    0.07848
LLT*Ar       3            0                 .     .          .           .        .          .          .

Type 3 Tests of Fixed Effects
Effect    Num DF    Den DF    F Value    Pr > F
LLT            1         9    7526.99    <.0001
Ar             2        11       6.10    0.0165
LLT*Ar         2         9       5.03    0.0342

PROC MIXED DATA=FLIER; TITLE2 'Length - weight relationship (ANCOVA estimates)';
   class ar st;
   MODEL LWT = Ar LLt*Ar / solution cl noint;
   random st(ar) llt*St(Ar);
RUN;
NOTE: Convergence criteria met.
NOTE: The PROCEDURE MIXED printed page 16.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.25 seconds
      cpu time            0.06 seconds

North Carolina Flier sunfish data
Length - weight relationship (ANCOVA estimates)

The Mixed Procedure

Model Information
Data Set                     WORK.FLIER
Dependent Variable           LWT
Covariance Structure         Variance Components
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information
Class    Levels    Values
Ar            3    1 2 3
St            7    1 2 3 4 5 6 7

Dimensions
Covariance Parameters      3
Columns in X               6
Columns in Z              28
Subjects                   1
Max Obs Per Subject      658

Number of Observations
Number of Observations Read        658
Number of Observations Used        658
Number of Observations Not Used      0

Iteration History
Iteration    Evaluations    -2 Res Log Like     Criterion
        0              1      -916.40341165
        1              3     -1083.64456425             .
        2              2     -1084.86561114    0.00024548
        3              2     -1085.19148930    0.00002307
        4              1     -1085.21960297    0.00000030
        5              1     -1085.21994475    0.00000000
Convergence criteria met.

Covariance Parameter Estimates
Cov Parm      Estimate
St(Ar)        0.002470
LLT*St(Ar)    0.000486
Residual       0.01031

Fit Statistics
-2 Res Log Likelihood       -1085.2
AIC (smaller is better)     -1079.2
AICC (smaller is better)    -1079.2
BIC (smaller is better)     -1077.3

Solution for Fixed Effects
Effect    Ar    Estimate    Standard Error    DF    t Value    Pr > |t|    Alpha      Lower      Upper
Ar        1      -3.7922            0.1218    11     -31.12      <.0001     0.05    -4.0603    -3.5240
Ar        2      -3.3247           0.06236    11     -53.32      <.0001     0.05    -3.4620    -3.1875
Ar        3      -3.5798            0.2087    11     -17.15      <.0001     0.05    -4.0392    -3.1205
LLT*Ar    1       2.9872           0.04811     9      62.09      <.0001     0.05     2.8783     3.0960
LLT*Ar    2       2.8225           0.02571     9     109.79      <.0001     0.05     2.7644     2.8807
LLT*Ar    3       2.9447           0.08489     9      34.69      <.0001     0.05     2.7527     3.1368

Type 3 Tests of Fixed Effects
Effect    Num DF    Den DF    F Value    Pr > F
Ar             3        11    1368.45    <.0001
LLT*Ar         3         9    5704.44    <.0001

PROC MIXED DATA=FLIER; TITLE2 'Length-weight (reduced ANCOVA estimates)';
   class ar st;
   MODEL LWT = Ar LLt / solution cl noint;
   random st(ar) llt*St(Ar);
RUN;
NOTE: Convergence criteria met.
NOTE: The PROCEDURE MIXED printed page 17.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.29 seconds
      cpu time            0.15 seconds

NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           18.29 seconds
      cpu time            11.62 seconds

North Carolina Flier sunfish data
Length - weight relationship (reduced ANCOVA estimates)

The Mixed Procedure

Model Information
Data Set                     WORK.FLIER
Dependent Variable           LWT
Covariance Structure         Variance Components
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information
Class    Levels    Values
Ar            3    1 2 3
St            7    1 2 3 4 5 6 7

Dimensions
Covariance Parameters      3
Columns in X               4
Columns in Z              28
Subjects                   1
Max Obs Per Subject      658

Number of Observations
Number of Observations Read        658
Number of Observations Used        658
Number of Observations Not Used      0

Iteration History
Iteration    Evaluations    -2 Res Log Like     Criterion
        0              1      -903.64238448
        1              3     -1081.25456018    0.00072261
        2              2     -1082.23502395    0.00012736
        3              1     -1082.40237340    0.00001072
        4              1     -1082.41545464    0.00000014
        5              1     -1082.41561757    0.00000000
Convergence criteria met.

Covariance Parameter Estimates
Cov Parm      Estimate
St(Ar)        0.002984
LLT*St(Ar)    0.000674
Residual       0.01039

Fit Statistics
-2 Res Log Likelihood       -1082.4
AIC (smaller is better)     -1076.4
AICC (smaller is better)    -1076.4
BIC (smaller is better)     -1074.5

Solution for Fixed Effects
Effect    Ar    Estimate    Standard Error    DF    t Value    Pr > |t|    Alpha      Lower      Upper
Ar        1      -3.5154           0.07155    11     -49.13      <.0001     0.05    -3.6729    -3.3579
Ar        2      -3.4152           0.05723    11     -59.68      <.0001     0.05    -3.5412    -3.2893
Ar        3      -3.3933           0.06972    11     -48.67      <.0001     0.05    -3.5467    -3.2398
LLT               2.8672           0.02266    11     126.54      <.0001     0.05     2.8173     2.9170

Type 3 Tests of Fixed Effects
Effect    Num DF    Den DF    F Value    Pr > F
Ar             3        11    1320.50    <.0001
LLT            1        11    16012.6    <.0001

Plot of Resid*Lt. Symbol is value of Age.
[Line-printer scatter plot: Resid on the vertical axis (about -0.6 to 0.4) against Lt on the horizontal axis (2 to 18), with Age as the plotting symbol.]
NOTE: 290 obs hidden.

Histogram of Resid (* may represent up to 4 counts)
Row label      #
    0.425      1
    0.375      0
    0.325      1
    0.275      0
    0.225      7
    0.175     25
    0.125     62
    0.075    100
    0.025    147
   -0.025    130
   -0.075     93
   -0.125     54
   -0.175     21
   -0.225      8
   -0.275      6
   -0.325      0
   -0.375      0
   -0.425      0
   -0.475      0
   -0.525      3

[Boxplot not recoverable from the extracted text.]

Tests for Normality
Test            Statistic       p Value
Shapiro-Wilk    W  0.966697     Pr < W  <0.0001
...
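
The outlier-count and back-transformation arithmetic quoted with the RSTUDENT listing and the restricted REG fit can be checked with a short script. What follows is a minimal sketch in Python rather than SAS, using only the standard library. The variable names and the helper predicted_weight are hypothetical, and the critical value uses a normal approximation instead of the t quantile quoted in the notes (an assumption made to avoid external packages).

```python
import math
from statistics import NormalDist

N_OBS = 658    # observations used in the regression (from the REG output)
ALPHA = 0.05   # familywise error rate used in the notes

# Expected count of |RSTUDENT| > 2 under the nominal 5% rule of thumb.
expected_nominal = N_OBS * ALPHA

# Expected count using the tail area the notes quote for a cutoff of 2.
expected_at_cutoff_2 = N_OBS * 0.0459

# Bonferroni: split the familywise alpha evenly across the 658 tests.
alpha_per_test = ALPHA / N_OBS

# Two-sided critical value from the standard normal. This only approximates
# the t quantile (the notes quote t = 3.982).
z_crit = NormalDist().inv_cdf(1.0 - alpha_per_test / 2.0)

# F statistic for H0: slope = 3, rebuilt from the printed estimate and
# standard error of LLT in the unrestricted fit.
slope, slope_se = 2.86448, 0.02095
f_slope_equals_3 = ((slope - 3.0) / slope_se) ** 2

# Back-transform the restricted intercept (natural-log scale) to the
# multiplier a in W = a * L**3.
a_restricted = math.exp(-3.765)


def predicted_weight(length):
    """Weight predicted by the restricted cubic model (hypothetical helper)."""
    return a_restricted * length ** 3


print(f"expected beyond 2 (nominal 5%): {expected_nominal:.1f}")
print(f"expected beyond 2 (alpha 0.0459): {expected_at_cutoff_2:.1f}")
print(f"Bonferroni per-test alpha: {alpha_per_test:.10f}")
print(f"normal-approximation cutoff: {z_crit:.3f}")
print(f"F for slope = 3: {f_slope_equals_3:.2f}")
print(f"back-transformed intercept: {a_restricted:.9f}")
```

The normal cutoff comes out slightly below the t value in the notes, as expected, because a t distribution with about 655 degrees of freedom has marginally heavier tails. The rebuilt F statistic agrees with the printed Test 1 result up to rounding of the estimate and standard error.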