EXST025  Biological Population Statistics

APPLICATION OF MODELS TO MERISTIC (or MORPHOMETRIC) RELATIONSHIPS

MERISTIC refers to the geometric relation between body parts

Examples
  archaeologists: calculate the height and weight of dead and extinct animals from bones and partial skeletons (based on meristic relationships from living relatives)
  marine biologists: predict (after the fact) the size of sharks involved in attacks from the curvature of the jaw in bite marks
  taxonomists: categorize and identify species based on the relative size of, and measurements between, anatomical structures
  fisheries: many applications
   - conversions from standard or fork length to total length, width to length, or thoracic length to total length
   - size at previous ages can be evaluated from the relationship between scale size and fish length

FISHERIES APPLICATION: background
1) assume a relationship exists between fish length and scale length: the number of scales does not change as the fish grows, so the scales must grow to keep the fish covered
2) assume a KEY SCALE will have a consistent relationship to length (PROFFIT, 1950 states that different scales have different relationships)
   Then a relationship can be fitted for a particular scale
   [Figure: sunfish key scale areas]
3) assume a scale annulus represents the scale length at the time of annulus formation, and that we know the age at annulus formation
   Then using the fitted scale relationship we can calculate the length of the fish at the time of annulus formation
   - on the scale, the focus is an indeterminate area with no circuli, so the intercept may actually depend more on where the person reading the scale starts his/her measurement
   [Figure: scale showing focus]

MODELS USED TO FIT LENGTH - SCALE RELATIONSHIP
1) DIRECT PROPORTION:        $Lt = b_0 \, TSL$
2) SIMPLE LINEAR:            $Lt = b_0 + b_1 \, TSL$
3) LOG - LOG:                $Lt = b_0 \, TSL^{b_1}$
4) Second ORDER POLYNOMIAL:  $Lt = b_0 + b_1 \, TSL + b_2 \, TSL^2$
5) Third ORDER POLYNOMIAL:   $Lt = b_0 + b_1 \, TSL + b_2 \, TSL^2 + b_3 \, TSL^3$

MODEL DERIVATION
1) DIRECT PROPORTION: this method will work with only 1 fish

   $\frac{Lt_t}{S_t} = \frac{Lt}{TSL}$

   where
   $Lt_t$ = length at some previous time (t), UNKNOWN
   $S_t$ = scale length at some previous time (t), determined by a scale annulus measurement
   $Lt$ = length at capture
   $TSL$ = total scale length at capture

   for a given fish, $\frac{Lt}{TSL}$ is a constant (call it $b_1$), so let $\frac{Lt_t}{S_t} = b_1$ represent the constant ratio, and then

   $Lt_t = b_1 \, S_t$ is a direct proportion model

   this describes the relationship between the fish's length and the length of the scale at any time (for a particular fish)
   this is also the form of the predictive equation

EXAMPLE: for a fish with TSL = 10 mm and Lt = 200 mm, Lt/TSL = 20
   if $S_1 = 4$, $S_2 = 7$, $S_3 = 9$, then
   $Lt_1 = 4 \times 20 = 80$
   $Lt_2 = 7 \times 20 = 140$
   $Lt_3 = 9 \times 20 = 180$
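The single-fish direct proportion calculation can be sketched in a few lines. This is a Python illustration only (the notes themselves use SAS); the function name and argument names are hypothetical, and the numbers are those of the example fish (TSL = 10 mm, Lt = 200 mm, annuli at 4, 7 and 9 mm).

```python
def back_calculate_direct(length_at_capture, total_scale_length, annulus_radii):
    """Back-calculate length at each annulus with Lt_t = (Lt / TSL) * S_t."""
    if total_scale_length <= 0:
        raise ValueError("total scale length must be positive")
    ratio = length_at_capture / total_scale_length  # b1 for this one fish
    return [ratio * s for s in annulus_radii]


# Example fish from the notes: Lt = 200 mm, TSL = 10 mm, so Lt/TSL = 20
lengths = back_calculate_direct(200.0, 10.0, [4.0, 7.0, 9.0])
print(lengths)
```

Because the ratio comes from one fish only, no regression and no second fish are needed, which is why this is the only model available when a single fish is in hand.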
the constant proportion (ratio) [$b_1$] is given by $\frac{Lt}{TSL} = b_1$

With many observations of $Lt_i$ and $TSL_i$ for many fish, preferably over a wide range of sizes, we may want an average of $\frac{Lt_t}{S_t} = b_1$, so one way is to fit

   $Lt_i = b_1 \, TSL_i$

as a regression forced through the origin
   this fits the relationship, and estimates an average "$b_1$" over all fish
   the relationship may be fitted in any one of a number of ways; we have discussed 4 methods of fitting a ratio
   - for the moment assume linear regression forced through the origin will be adequate (assume homogeneous variance, but examine residuals)
   - hardest assumption: TSL measured without error

GRAPHIC EXAMPLE OF BACK CALCULATION
RELATIONSHIP BETWEEN TOTAL SCALE LENGTH AND FISH LENGTH
AT THE TIME OF CAPTURE
TSL is the scale length at capture, but if we have a good range of sizes, and a good
description of the relationship, we can use the relationship (HOPEFULLY)
to describe the SCALE SIZE AT SOME PREVIOUS TIME
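As a rough illustration of pooling many fish, the sketch below fits the direct proportion model as a least-squares regression forced through the origin and then uses the pooled slope for a back-calculation. It is written in Python rather than the SAS used in these notes; the data are hypothetical and the function name is an assumption made for the example.

```python
def fit_through_origin(scale_lengths, fish_lengths):
    """Least-squares slope b1 for Lt = b1 * TSL (no intercept term)."""
    sum_xx = sum(x * x for x in scale_lengths)
    if sum_xx == 0:
        raise ValueError("need at least one nonzero scale length")
    sum_xy = sum(x * y for x, y in zip(scale_lengths, fish_lengths))
    return sum_xy / sum_xx


# Hypothetical capture data (mm): five fish lying exactly on Lt = 20 * TSL
tsl = [2.0, 4.0, 6.0, 8.0, 10.0]
lt = [40.0, 80.0, 120.0, 160.0, 200.0]

b1 = fit_through_origin(tsl, lt)
print(b1)        # pooled slope over all fish
print(b1 * 7.0)  # back-calculated length for an annulus measured at 7 mm
```

The slope formula sum(x*y)/sum(x*x) is only one of the ways a ratio can be estimated; it weights large fish more heavily than a mean of individual ratios would.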
POTENTIAL MODELS TO DESCRIBE THE RELATIONSHIP (from literature)
1) DIRECT PROPORTION MODEL: as a regression
   THE LINE DESCRIBED BY THIS RELATIONSHIP
   (a) passes through the origin,
   (b) has no curvature
   THIS IS THE SIMPLEST REGRESSION MODEL WHICH MAY BE ADEQUATE TO DESCRIBE THE RELATIONSHIP

2) SIMPLE LINEAR REGRESSION: derivation
a) tiny fish do not have scales
   1) for FLIER SUNFISH, squamation starts at 16 - 17 mm of length in one area and is completed in the head area at 32 mm length
   2) suppose direct proportion is correct but that we should add a constant since squamation does not start at size zero
      In fact, the "intercept" may or may not be the size at squamation
      when scale growth starts it proceeds rapidly until the fish is covered; often there are no fish of very small size (near size at squamation)
c) this model has an intercept, so it is a simple linear regression model
   $Lt = b_0 + b_1 \, TSL$
e) Model derivation
   $Lt = b_1 \, TSL$ from our earlier derivation, but an intercept correction is needed (call it $b_0$, and let's consider that it represents the size at squamation or the size at which scales first form in the key scale area)

   start from $\frac{Lt_t}{S_t} = \frac{Lt}{TSL}$ and subtract $b_0$ from all length measurements

   $\frac{Lt_t - b_0}{S_t} = \frac{Lt - b_0}{TSL}$

   $Lt_t - b_0 = \frac{Lt - b_0}{TSL} \, S_t$

   where for a given fish the term $\frac{Lt - b_0}{TSL}$ is a constant $b_1$
   for a given fish the estimated average over all fish, using regression, is
   $Lt_t - b_0 = b_1 \, S_t$
   $Lt_t = b_0 + b_1 \, S_t$
   where $b_0$ may or may not be size at squamation
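The intercept-corrected back-calculation for one fish can be sketched as follows. This is a Python illustration, not part of the original SAS material; the function name is hypothetical, the fish measurements are made up, and the intercept of 16.45 mm is used only as a plausible value for $b_0$.

```python
import math


def back_calculate_with_intercept(length_at_capture, total_scale_length, s_t, b0):
    """Back-calculate with Lt_t = b0 + ((Lt - b0) / TSL) * S_t for a single fish."""
    if total_scale_length <= 0:
        raise ValueError("total scale length must be positive")
    slope = (length_at_capture - b0) / total_scale_length  # b1 for this fish
    return b0 + slope * s_t


# Hypothetical fish: Lt = 200 mm, TSL = 10 mm, annulus at 4 mm, b0 = 16.45 mm
lt_1 = back_calculate_with_intercept(200.0, 10.0, 4.0, 16.45)
print(round(lt_1, 2))

# At S_t = TSL the formula returns the length at capture, and at S_t = 0 it returns b0
print(back_calculate_with_intercept(200.0, 10.0, 10.0, 16.45))
```

With a positive intercept the back-calculated lengths are larger than the direct proportion values for the same annuli, since the line no longer has to pass through the origin.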
For FLIER SUNFISH, Conley and Witt (1966) said squamation starts at 16 - 17 mm in a certain site;
I used that site and got 16.45 mm as the intercept
Draw your own conclusions

OTHER MODELS
3) POWER MODEL: essentially a direct proportion model with an allowance for simple curvature (still passes through the origin)
a) variation of the direct proportion model
   $Lt_t = b_0 \, S_t^{b_1}$
   so
   $\frac{Lt_t}{S_t^{b_1}} = b_0$
   which also represents a constant ratio of sorts, but with an adjustment for curvature (power term: $b_1$)
   another representation would be
   $\log(Lt_t) - \log(S_t^{b_1}) = \log(b_0)$
   so
   $\log(Lt_t) - b_1 \log(S_t) = \log(b_0)$
b) best fit for FLIER sunfish
c) if $b_1 = 1$, then the equation reduces to a DIRECT PROPORTION model, and the line is not curved
d) other statistical advantages: "cures" non-homogeneous variance
   - THIS MAY BE USEFUL EVEN IF THE LINE IS NOT CURVED!

THE LINE DESCRIBED BY THIS RELATIONSHIP
(a) passes through the origin,
(b) has curvature
there is "no intercept": the algebraic model has no intercept term, but the line does cross the Y axis at the origin
$b_0$ is the angle of the curve (slope)
$b_1$ is the rate of curvature
though the line can be straight (no curve) if $b_1 = 1$
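Since $\log(Lt) = \log(b_0) + b_1 \log(TSL)$ is a straight line in the logarithms, the power curve can be fitted with ordinary least squares on the logged data. The sketch below shows this in Python (the notes use SAS); the function name is hypothetical and the data are generated from an assumed curve, $Lt = 2 \, TSL^{0.8}$, purely to show that the fit recovers the coefficients.

```python
import math


def fit_power_model(scale_lengths, fish_lengths):
    """Fit Lt = b0 * TSL**b1 by least squares on log(Lt) = log(b0) + b1*log(TSL)."""
    xs = [math.log(x) for x in scale_lengths]
    ys = [math.log(y) for y in fish_lengths]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                       # slope on the log scale = power term
    b0 = math.exp(y_bar - b1 * x_bar)    # back-transformed intercept
    return b0, b1


# Hypothetical data generated exactly from Lt = 2 * TSL**0.8
tsl = [50.0, 100.0, 200.0, 400.0]
lt = [2.0 * x ** 0.8 for x in tsl]

b0, b1 = fit_power_model(tsl, lt)
print(round(b0, 6), round(b1, 6))
```

A fitted $b_1$ near 1 would indicate that the curve is essentially a straight line through the origin, that is, the direct proportion model.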
the Log - log model is given by
   $Lt = b_0 \, TSL^{b_1}$ when fitted on total measurements
   $Lt_t = b_0 \, S_t^{b_1}$ for backcalculation
a) can be easily linearized by calculating
   $\log(Lt_t) = \log(b_0) + b_1 \log(S_t)$

4) SECOND ORDER POLYNOMIAL
   $Lt_t = b_0 + b_1 S_t + b_2 S_t^2$
a) no good theoretical derivation for this model
b) may fit well, but with extra degrees of freedom
THE LINE DESCRIBED BY THIS RELATIONSHIP
(a) does not necessarily pass through the origin,
(b) has curvature; the direction of curvature depends on the sign of $b_2$

5) THIRD ORDER POLYNOMIAL
   $Lt_t = b_0 + b_1 S_t + b_2 S_t^2 + b_3 S_t^3$
THE LINE DESCRIBED BY THIS RELATIONSHIP
(a) does not necessarily pass through the origin,
(b) has curvature, and can curve in the other direction (has up to 2 curvatures, 1 inflection)

POLYNOMIALS
a) regular regression assumptions apply
since polynomials should not be extended outside their range, small fish are
needed in the sample to get an adequate fit of the lower range for
prediction
b) intercept and each inflection (level of curvature) employ a new degree of
freedom, so each is testable
c) the model may (hopefully will) reduce to a 2nd order polynomial
d) Polynomials in general
1) each additional term ALLOWS for a new inflection, but
2) additional terms may also fit changes in rate of curvature instead of inflection (can tell somewhat by sign on $b_i$)
3) WORTHLESS outside the range of OBSERVED values (DANGEROUS)
4) Useful curve fitting technique, but often no "biological" interpretation to the regression coefficients
5) The regular assumptions for multiple regression apply
6) since only one $X_i$ variable is used, this multiple regression can employ a residual plot of $e_i * X_i$ (instead of the $e_i * \hat{Y}_i$ used for other multiple regressions).

STATISTICAL CONSIDERATIONS: in finding the "best" model
1) NOTE THAT THE FIRST CONSIDERATIONS SHOULD BE BIOLOGICAL,
NOT STATISTICAL  if you have a reason to expect a particular model
then that model should be fitted.
If it does not provide the "highest $R^2$" this is not necessarily important
a) does it fit biological theory better than other models?
b) does it meet the statistical assumptions where the other models do not?
c) is the model preferable for any biological reason, and does it provide nearly as good a fit as the other models? If so, USE IT!
   - remember, 1 or 2% off in the $R^2$ is not particularly important
d) examine the residual plots for the model fitted; this is at least as important as the $R^2$

2) DIRECT PROPORTION model: simplest model
a) only model possible for a single fish unless other information available (eg.
the value of the intercept)
b) cannot test for curvature or intercept, so arrive at this model by
simplification of larger models
(ie. the larger models may “reduce" to this model)
3) SIMPLE LINEAR REGRESSION model
a) the intercept value can be tested ($H_0: \beta_0 = 0$) and failure to reject implies the DIRECT PROPORTION model
b) both of these models require the standard set of assumptions about the regression line,
   - including homogeneous variance (additive error)
4) LOG-LOG REGRESSION model
a) the slope can be tested ($H_0: \beta_1 = 1$) and failure to reject implies the DIRECT PROPORTION model
   - this is a test for curvature
b) nonhomogeneity of variance in the original data is implied by this model
5) How about SIMPLE LINEAR REGRESSION versus POWER REGRESSION?
a) no good clean test
b) Check $R^2$ value
c) Check for homogeneous variance
d) hopefully only 1 will be better than direct proportion
6) If both an intercept and curvature are indicated, use a POLYNOMIAL model (either quadratic or cubic)
a) the intercept, linear trend and curvature can all be tested individually in this model
b) the curve is not as "pretty" biologically as the simpler curve in the POWER model
c) the curve should not be EXTENDED OUTSIDE the RANGE of the data

7) MODELS (counting POLYNOMIALS as one model)
[Figure: four panels, each plotted against TSL]

Model                       Curvature   Origin
DIRECT PROPORTION MODEL     NO CURVE    PASSES ORIGIN
SIMPLE LINEAR REGRESSION    NO CURVE    DOES NOT PASS ORIGIN (has intercept)
LOG - LOG REGRESSION        CURVE       PASSES ORIGIN
POLYNOMIAL REGRESSION       CURVE       DOES NOT PASS ORIGIN (has intercept)

These basic models will fit a wide range of experimental situations

OTHER CONSIDERATIONS (MERISTIC RELATIONSHIPS)
a) The independent variable ($X_i$) is probably not measured without error
   - if we wish to test, for instance, $b_1 = 1$ for the POWER MODEL we probably have an UNDERESTIMATE of $b_1$ (Ricker 1973, Linear regression in fishery research, Can. J. Fish. Res. Bd., discusses situations and solutions)
b) Recent authors (Carlander, 1981, Fisheries, Am. Fish. Soc.) caution against the use of regression (which describes the relationship for the average fish)
   - CARLANDER suggests using a fish by fish basis for the calculations, in the form of a corrected scale length entered into the regression
   FIRST FIT
      $Lt_t = \frac{Lt}{TSL} \, S_t = b_1 \, S_t$
   or
      $Lt_t = b_0 + \frac{Lt - b_0}{TSL} \, S_t = b_0 + b_1 \, S_t$
   where $b_0$ may come from regression or prior knowledge
   - use regression to find the best model (DP or SLR), THEN, for the back calculations, instead of entering $S_t$ into the equation to backcalculate, enter
      $f * S_t$
   where
      $f = \frac{\text{observed TSL}}{\text{estimated TSL}}$
   this adjusts the scale length up or down in proportion to the amount observed at the time of capture, presuming that fish which are large or small relative to the average have always been so (throughout life), or at least were the same at the time of circuli formation (but what about normal seasonal or sexual variation?)
   - this adjustment is proportional to the residual
   [Figure: schematic of the adjustment suggested by Carlander; Scale Length plotted against Fish Length, with points at Age 1, Age 2, Age 3 and at capture]
   - note that if the adjustment "f" is 1.1 (or 110 %) then adjusting each scale by 1.1 assumes that the proportion is and has always been 110 % for that fish
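The mechanics of entering $f * S_t$ into an already fitted line can be sketched as below. This is a Python illustration under stated assumptions: the coefficients $b_0 = 16.45$ and $b_1 = 18.0$ are hypothetical, the function name is made up, and $f$ is simply passed in (the notes define it as the ratio of observed to estimated TSL for the fish).

```python
import math


def back_calculate_adjusted(b0, b1, annulus_radii, f=1.0):
    """Enter f * S_t instead of S_t into the fitted line Lt_t = b0 + b1 * S_t."""
    return [b0 + b1 * (f * s) for s in annulus_radii]


radii = [4.0, 7.0, 9.0]            # annulus measurements for one fish (mm)
b0, b1 = 16.45, 18.0               # hypothetical pooled regression coefficients

unadjusted = back_calculate_adjusted(b0, b1, radii)        # f = 1, the "average fish"
adjusted = back_calculate_adjusted(b0, b1, radii, f=1.1)   # scales taken as 110 % throughout life

for s, u, a in zip(radii, unadjusted, adjusted):
    print(s, round(u, 2), round(a, 2))
```

The same factor is applied to every annulus, which is exactly the assumption questioned in the notes: the fish is treated as having kept the same relative scale size throughout its life.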
Consider: when can we average, and when should we maintain individual fish relationships?
1) Does a single fish maintain a set growth rate relative to other fish throughout its life (for genetic reasons or other)?
2) or does a fish vary its growth rate from year to year (or month to month) due to changing habitat, breeding condition, compensation or other?
The truth is probably somewhere between the two possibilities (a combination of both).

c) must assume additive error for the linearized model (and therefore multiplicative error for the original data).
   Jerrold Zar cautions against automatically accepting this (Bioscience 18: 12, 1968)
   eg.
   $Lt_i = b_0 \, TSL_i^{b_1} \, e_i$
   $\log(Lt_i) = \log(b_0) + b_1 \log(TSL_i) + \log(e_i)$
   a fix-up for nonhomogeneous variance in the original data (desirable or not?)
d) $Lt = b_0 \, TSL^{b_1}$
   if $b_1 = 1$, then $Lt = b_0 \, TSL$, which is a direct proportion
   easily tested with a t-test
   $H_0: b_1 = 1$
   $t = \frac{b_1 - 1}{S_{b_1}}$

SAS program for the models discussed
1) under the appropriate DATA statement (with the data INPUT or in a later DATA step),
to get natural logarithms
INPUT ... LT TSL ... ;
LLT = LOG(LT);
LTSL = LOG(TSL);
to get power terms for polynomials we can use
TSL2 = TSL * TSL;
TSL3 = TSL * TSL2;
though this is not necessary (see below)
to get logarithms with base 10, change LOG to LOG10
eg. LLT = LOG10(LT);
2) to run the POWER model use
PROC GLM;
MODEL LLT = LTSL;
3) to obtain polynomial models use
PROC GLM;
MODEL LT = TSL TSL*TSL;

ALLOMETRY: refers to the relative rates of growth of body parts
eg.
   $Width = \beta_1 * Length$
if Width is always 20% of length, then the equation is
   Width = 0.20 * Length
but what if the SHAPE of the fish changes over time as it matures?
eg.
   Width = 0.18 * Length for small fish
   Width = 0.20 * Length for medium fish
   Width = 0.22 * Length for larger fish
so there is a gradual change in body proportions
[Figure: GRAPHICALLY, Width plotted against Length, with slopes of 0.18, 0.20 and 0.22]

LENGTH - WEIGHT RELATIONSHIP
a) always use the model
   $W_i = \beta_0 L_i^{\beta_1} \epsilon_i$
   $\log(W_i) = \log(\beta_0) + \beta_1 \log(L_i) + \log(\epsilon_i)$
b) in SAS
   INPUT ... LT WT ...;
   LWT = LOG(WT);
   LLT = LOG(LT);
   PROC GLM;
   MODEL LWT = LLT;
c) statistically
   1) all assumptions apply except,
      (a) multiplicative (nonhomogeneous) error is implied for raw data because of the original model
      (b) Ricker (1973) discusses the fact that length is probably not measured without error
d) INTERPRETATION OF COEFFICIENTS: statistically we know what $\beta_0$ and $\beta_1$ are; what do they mean biologically?
1) relation between length and weight is relating mm to gms
   - we need an adjustment of some sort (done by the units of the regression coefficient)
   - except that there is no good biological interpretation to gms/mm
2) however, we can fit a very nice relationship between grams and cubic centimeters if we know the ... specific gravity
   eg. for water (pure) 1 cm$^3$ = 1 gm.
3) so we may expect $W = \beta_0 L^3$
   where $\beta_0$ is specific gravity, except that a fish is not shaped like a cube
4) imagine we are going to carve a fish from a cube such that the side of the cube equals the length of the fish
   eg. small cube -> small fish, large cube -> large fish
5) if the fish keeps exactly the same shape all his/her life then he/she will occupy a certain fraction of the cube, eg. 0.05 or 5%
   - This varies with fish, usually approximately 1% to 3%; less for eel, more for boxfish
6) so our model is
   Weight = (fraction of cube) * (specific gravity) * Length$^3$
   where
   (fraction of cube) * (sp. gr.) = $b_0$
7) Is the slope always "3"?
a) maybe for fish, but ...
   - grass keeps the same width and cell thickness so weight depends only on length, so the "slope" = 1 (sausage)
     $W = \beta_0 L^1$
b) a leaf maintains the same cell thickness and increases in length and width, so we expect that the "slope" = 2 (bolts of cloth, where instead of $L^2$ it is length * width)
     $W = \beta_0 L^2$
8) Some things do not grow in 3 dimensions
   most published values of "b" for fish are near 3, but some are as high as 6 (eg. post larval menhaden)
   Why?
   - metamorphosis of post larval menhaden
   - with little or no change in length, a fish becomes more full bodied
   the fish grows in 1 or 2 dimensions, and we are measuring the WRONG DIMENSION, so any value of $\beta_1$ is possible
   changes from $\beta_1 = 3$ can also occur with changes in robustness (sex differences, health, etc.), since these often occur without a change in length
9) the use of the Weight - Length relationship is not restricted to fish
   - any organism may follow a similar pattern, except those which undergo metamorphosis
   this is called allometric growth
   - little elephants are recognizable because they look like big elephants

CONDITION FACTORS, PONDERAL INDICES
A crude version is used in health studies;
the version in "fisheries" is more refined
 - (can be used for ANY ORGANISM)
traditional form
   $K = \frac{W}{L^3}$
which is adjusted to 1 for the metric system
 using length in cm. and weight in gms. will produce a number which is
approximately 1 * 10&
 by convention we adjust to 1, if units other than cm and gms are used (eg.
mm and kg) then the adjustment differs
essentially
   $K \approx b_0$ = (fraction of cube) * (specific gravity)
assume specific gravity is approximately a constant at 1
Then $b_0$ is approximately the fraction of the cube occupied by the fish
increases for fat fish
decreases for skinny fish
therefore, the value may be used as an indicator of
a) robustness, or condition (health)
b) sexual condition (females with eggs)
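The unscaled ratio $W / L^3$ can be sketched as below, in Python rather than SAS. The function name and the two fish are hypothetical, and the conventional scaling constant is deliberately left out because it depends on the units used.

```python
import math


def condition_factor(weight, length):
    """Unscaled condition factor K = W / L**3 (units follow the inputs)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return weight / length ** 3


# Two hypothetical fish of the same length (cm) but different weight (g)
k_skinny = condition_factor(36.0, 12.0)
k_fat = condition_factor(44.0, 12.0)
print(round(k_skinny, 4), round(k_fat, 4))

# A cube of water with a 2 cm side weighs 8 g, so it fills its cube completely: K = 1
print(condition_factor(8.0, 2.0))
```

With specific gravity taken as roughly 1, each value reads as the fraction of the length-sided cube occupied by the fish, so the heavier fish of the same length gets the larger K.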
Traditionally, K is analyzed with ANOVA to test for differences between AREAS, SEASONS, SEXES, AGES, HABITATS etc.

HOWEVER, THERE IS A PROBLEM WITH USING K IN ANOVA
a) given
   $W_i = \beta_0 L_i^3 \epsilon_i$
   and
   $\frac{W_i}{L_i^3} = \beta_0 \epsilon_i$, or $K_i \epsilon_i$
   we can see that the error is not additive, and additive error is an assumption in ANOVA
b) frequently (Carlander, 1980) K is correlated (significantly) with length, which implies that the relationship is not $W = \beta_0 L^3$ but $W = \beta_0 L^{\beta_1}$, so that
   $\frac{W}{L^3} = \frac{\beta_0 L^{\beta_1}}{L^3} = \beta_0 L^{\beta_1 - 3}$
   so the estimate of $\beta_0$ is not free of LENGTH as it should be

LE CREN (1951) suggests
   $C_i = \frac{W_i}{\beta_0 L_i^{\beta_1}} = \epsilon_i$
   but this centers on "1" and we know the mean $\epsilon_i \neq 1$; actually
   $\log(W_i) = \log(\beta_0) + \beta_1 \log(L_i) + \log(\epsilon_i)$
   assume that $\log(\epsilon_i)$ is distributed NID$(0, \sigma^2)$
   $\log(\epsilon_i) = \log(W_i) - [\log(\beta_0) + \beta_1 \log(L_i)]$
   LeCren's $C_i = e^{\log(\epsilon_i)} = \frac{W_i}{\beta_0 L_i^{\beta_1}}$ has a log normal distribution; then $\log(\epsilon_i)$ centers on 0, and $e^{\log(\epsilon_i)}$ centers on 1

LeCren recognized that the best solution is ANACOV,
instead of ANOVA such as
   $C_i$ = SEX AREA;
use
   $\log(W_i) - [\log(b_0) + b_1 \log(L_i)]$ = SEX AREA;
which has a log normal distribution of $\epsilon_i$ (supposedly), normalized by taking logarithms
the model to use is then
   $\log(W_{ijk}) = \log(b_0) + b_1 \log(L_i) + SEX_j + AREA_k + \log(\epsilon_{ijk})$

ADVANTAGES
a) $\log(\epsilon_i)$ is still NID$(0, \sigma^2)$
b) $b_1$ is not fixed when the class variables are considered; this is advantageous because, suppose
   IF $b_1$ is fitted before SEX: THE WRONG SLOPE IS CALCULATED
   [Figure: Y against X]
   BUT IF $b_1$ is fitted simultaneously: CORRECT SLOPE and SEPARATE INTERCEPTS
   [Figure: Y against X]

DISADVANTAGE: more complicated AND, not a complete ANACOV,
a) with this approach, shouldn't we really be testing SEPARATE SLOPES as well?
   simpler just to let $K = \frac{W}{L^3}$
b) we lose the "condition factor" concept; what do we point to as a K value?
 remember in ANACOV
SEPARATE LEVELS with SAS REPARAMETERIZATION so the log(b! )
will be the logarithm of K (the intercept) for missing SEX and/or AREA
but without a common b, are the intercepts meaningless?
 the “regression coefficients" for the categorical variables are the differences
between log(b! ) for other categories
c) REMEMBER SCALE LENGTH  BODY LENGTH RELATIONSHIP
 since fish can enlarge scales and reabsorb scales on the edges, that which
affects robustness will affect (I BELIEVE) the back calculation formulas
 this is why I am not so sure that averaging across all fish, particularly when
captured at different times of the year, and in different stages of sexual
maturity, is necessarily a bad idea (can we assume a fish will have
consistently large or small scales given their ability to adjust scales to
robustness?)

EXST 7025 Morphometrics

/*********************************************************************/
/*** Geaghan MS Thesis Flier data                                  ***/
/*********************************************************************/
dm'log;clear;output;clear';
OPTIONS NOCENTER PS=52 LS=111 NODATE NONUMBER nolabel;
ODS HTML style=minimal
    body='C:\Geaghan\Current\EXST7025\Spring2008\Flier\LtWt\Lt_Wt01.html' ;
NOTE: Writing HTML Body file:
      C:\Geaghan\Current\EXST7025\Spring2008\Flier\LtWt\Lt_Wt01.html

ods graphics ON;
NOTE: ODS Statistical Graphics will require a SAS/GRAPH license when it is declared
      production.

LIBNAME SASDATA 'C:\Geaghan\Current\EXST7025\Spring2008\Flier\';
NOTE: Libref SASDATA was successfully assigned as follows:
      Engine:        V9
      Physical Name: C:\Geaghan\Current\EXST7025\Spring2008\Flier
FILENAME INPUT 'C:\Geaghan\Current\EXST7025\Spring2008\Flier\ONEper.csv';
TITLE1 'North Carolina Flier sunfish data';

DATA Flier; infile input missover delimiter="," firstobs=2;
   input Mo Day Yr Ar St Md sx sex $ Sn Dayno Age Age_Days Lt Wt TSL
         Size1 Size2 Size3 Size4 Size5 Size6 Edge EdgeGrow K FNO;
   if lt gt 14.0 and wt lt 20 then wt = .;
   IF LT LE 0 THEN DELETE;
   IF WT LE 0 THEN DELETE;
   IF TSL LE 0 THEN DELETE;
   LLT= LOG(LT);
   LWT= LOG(WT);
   LTSL= LOG(TSL);
   LT2 = LT*LT; LT3 = LT*LT*LT;
RUN;
NOTE: The infile INPUT is:
      File Name=C:\Geaghan\Current\EXST7025\Spring2008\Flier\ONEper.csv,
      RECFM=V,LRECL=256
NOTE: 664 records were read from the infile INPUT.
      The minimum record length was 66.
      The maximum record length was 114.
NOTE: The data set WORK.FLIER has 658 observations and 30 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

proc plot data=flier; plot Lt * TSL = age; run;
NOTE: There were 658 observations read from the data set WORK.FLIER.
NOTE: The PROCEDURE PLOT printed page 1.
NOTE: PROCEDURE PLOT used (Total process time):
      real time           0.14 seconds
      cpu time            0.03 seconds

North Carolina Flier sunfish data
Plot of Lt*TSL. Symbol is value of Age.
[Line-printer plot: Lt on the vertical axis (ticks 2 to 20), TSL on the horizontal axis (ticks 50 to 400); plotting symbols are ages 0 to 6]
NOTE: 388 obs hidden.
PROC REG DATA=FLIER lineprinter;
   TITLE2 'Total Scale Length - Body Length relationship (SLR)';
   MODEL LT = TSL;
   plot residual.*lt;
RUN;
WARNING: Statistical graphics displays created with ODS are experimental in this
         release.
NOTE: The PROCEDURE REG printed pages 2-3.
NOTE: PROCEDURE REG used (Total process time):
      real time           5.78 seconds
      cpu time            4.78 seconds

North Carolina Flier sunfish data
Total Scale Length - Body Length relationship (SLR)
The REG Procedure
Model: MODEL1
Dependent Variable: Lt

Number of Observations Read    658
Number of Observations Used    658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1       3691.38857    3691.38857   7619.64   <.0001
Error             656        317.80397       0.48446
Corrected Total   657       4009.19254

Root MSE           0.69603    R-Square   0.9207
Dependent Mean    12.08100    Adj R-Sq   0.9206
Coeff Var          5.76136

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              1.65021          0.12254     13.47     <.0001
TSL          1              0.04172       0.00047790     87.29     <.0001
PROC REG DATA=FLIER lineprinter;
   TITLE2 'Total Scale Length - Body Length relationship (Power)';
   MODEL LLT = LTSL / clb;
   plot residual.*Llt;
   test LTSL = 1;
RUN;
WARNING: Statistical graphics displays created with ODS are experimental in this
         release.
NOTE: The PROCEDURE REG printed pages 4-6.
NOTE: PROCEDURE REG used (Total process time):
      real time           2.62 seconds
      cpu time            1.59 seconds

North Carolina Flier sunfish data
Total Scale Length - Body Length relationship (Power)
The REG Procedure
Model: MODEL1
Dependent Variable: LLT

Number of Observations Read    658
Number of Observations Used    658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1         31.92806      31.92806   10490.0   <.0001
Error             656          1.99665       0.00304
Corrected Total   657         33.92471

Root MSE           0.05517    R-Square   0.9411
Dependent Mean     2.46783    Adj R-Sq   0.9411
Coeff Var          2.23554

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   95% Confidence Limits
Intercept    1             -2.16788          0.04531    -47.84     <.0001   -2.25685   -2.07890
LTSL         1              0.84423          0.00824    102.42     <.0001    0.82805    0.86042

Test 1 Results for Dependent Variable LLT
Source         DF   Mean Square   F Value   Pr > F
Numerator       1       1.08691    357.10   <.0001
Denominator   656       0.00304
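The TEST statement result can be cross-checked by hand with the t statistic $t = (b_1 - 1) / S_{b_1}$, since the square of that t is the F value for a one-degree-of-freedom test. The sketch below is in Python rather than SAS; the function name is hypothetical, and the estimate and standard error are the rounded values from the LTSL row of the listing, so the agreement with F = 357.10 is only approximate.

```python
def slope_t_statistic(estimate, std_error, hypothesized):
    """t statistic for H0: slope = hypothesized value."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - hypothesized) / std_error


# LTSL row of the power-model listing: estimate 0.84423, standard error 0.00824
t_value = slope_t_statistic(0.84423, 0.00824, 1.0)
print(round(t_value, 2))       # large and negative: the slope is well below 1
print(round(t_value ** 2, 1))  # close to the F Value reported for the TEST statement
```

A slope significantly below 1 indicates curvature, so for these data the power model does not reduce to the direct proportion model.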
proc plot data=flier; plot wt * lt = age; run;
NOTE: There were 658 observations read from the data set WORK.FLIER.
NOTE: The PROCEDURE PLOT printed page 7.
NOTE: PROCEDURE PLOT used (Total process time):
      real time           0.12 seconds
      cpu time            0.03 seconds

Plot of Wt*Lt. Symbol is value of Age.
[Line-printer plot: Wt on the vertical axis (ticks 0 to 140), Lt on the horizontal axis (ticks 2 to 18); plotting symbols are ages 0 to 6]
NOTE: 421 obs hidden.
PROC REG DATA=FLIER lineprinter;
   TITLE2 'Length - weight relationship (Cubic)';
   MODEL WT = LT LT2 LT3;
   plot residual.*lt;
RUN;
NOTE: The PROCEDURE REG printed pages 8-9.
NOTE: PROCEDURE REG used (Total process time):
      real time           2.14 seconds
      cpu time            1.17 seconds

North Carolina Flier sunfish data
Length - weight relationship (Cubic)
The REG Procedure
Model: MODEL1
Dependent Variable: Wt

Number of Observations Read    658
Number of Observations Used    658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               3           377321        125774   3311.17   <.0001
Error             654            24842      37.98463
Corrected Total   657           402163

Root MSE           6.16317    R-Square   0.9382
Dependent Mean    45.49012    Adj R-Sq   0.9379
Coeff Var         13.54836

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1            -18.17873         11.33560     -1.60     0.1093
Lt           1              5.61158          3.12452      1.80     0.0730
LT2          1             -0.50812          0.27746     -1.83     0.0675
LT3          1              0.03697          0.00797      4.64     <.0001
PROC REG DATA=FLIER lineprinter;
   TITLE2 'Length - weight relationship (Power)';
   MODEL LWT = LLT / CLB;
   test LLt = 3;
   plot residual.*llt;
   output out=next1 r=e p=yhat RSTUDENT=RSTUDENT;
RUN;
WARNING: Statistical graphics displays created with ODS are experimental in this
         release.
NOTE: The data set WORK.NEXT1 has 658 observations and 33 variables.
NOTE: The PROCEDURE REG printed pages 10-12.
NOTE: PROCEDURE REG used (Total process time):
      real time           2.70 seconds
      cpu time            1.65 seconds

North Carolina Flier sunfish data
Length - weight relationship (Power)
The REG Procedure
Model: MODEL1
Dependent Variable: LWT

Number of Observations Read    658
Number of Observations Used    658

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1        278.36136     278.36136   18699.3   <.0001
Error             656          9.76532       0.01489
Corrected Total   657        288.12668

Root MSE           0.12201    R-Square   0.9661
Dependent Mean     3.63850    Adj R-Sq   0.9661
Coeff Var          3.35328

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   95% Confidence Limits
Intercept    1             -3.43057          0.05191    -66.08     <.0001   -3.53251   -3.32864
LLT          1              2.86448          0.02095    136.75     <.0001    2.82335    2.90562

Test 1 Results for Dependent Variable LWT
Source         DF   Mean Square   F Value   Pr > F
Numerator       1       0.62301     41.85   <.0001
Denominator   656       0.01489
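The fitted power model can be used to compute a log-scale residual and LeCren's relative condition, $C_i = e^{\log(\epsilon_i)}$, for any one fish. The sketch below is in Python rather than SAS and rests on two assumptions: the intercept is taken as -3.43057 (negative, as realistic back-transformed weights require), and Lt and Wt are taken to be in cm and g. The fish used (Lt = 5.5, Wt = 2.3) is fish number 21 from the residual listing later in the output; the function names are hypothetical.

```python
import math

LOG_B0 = -3.43057  # intercept of log(Wt) on log(Lt); sign assumed negative
B1 = 2.86448       # slope from the LLT row of the listing


def log_residual(length, weight):
    """log(e_i) = log(W_i) - [log(b0) + b1 * log(L_i)]."""
    return math.log(weight) - (LOG_B0 + B1 * math.log(length))


def relative_condition(length, weight):
    """LeCren's C_i = observed weight / weight predicted from length."""
    return math.exp(log_residual(length, weight))


residual = log_residual(5.5, 2.3)
print(round(residual, 4))                       # log-scale residual, negative for a thin fish
print(round(relative_condition(5.5, 2.3), 3))   # well below 1: much lighter than predicted
```

A relative condition of 1 means the fish weighs exactly what the fitted curve predicts for its length; values below 1 flag thin fish and values above 1 flag heavy ones.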
data next1a; set next1;
   if rstudent gt -2 and rstudent lt 2 then delete; run;
NOTE: There were 658 observations read from the data set WORK.NEXT1.
NOTE: The data set WORK.NEXT1A has 21 observations and 33 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

proc sort data=next1a; by RSTUDENT; run;
NOTE: There were 21 observations read from the data set WORK.NEXT1A.
NOTE: The data set WORK.NEXT1A has 21 observations and 33 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

proc print data=next1a; var fno lt wt e rstudent; run;
NOTE: There were 21 observations read from the data set WORK.NEXT1A.
NOTE: The PROCEDURE PRINT printed page 13.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.09 seconds
      cpu time            0.03 seconds

North Carolina Flier sunfish data
Length - weight relationship (Power)

Obs   FNO     Lt     Wt          e   RSTUDENT
  1    21    5.5    2.3   -0.61974   -5.22958
  2   115   13.5   33.2   -0.52224   -4.34270
  3   561   15.7   55.1   -0.44810   -3.71563
  4    56    8.0    8.1   -0.43409   -3.60093
  5   198   11.2   22.8   -0.36301   -2.99574
  6   196   11.1   22.6   -0.34613   -2.85471
  7    24    6.2    4.4   -0.31422   -2.60461
  8   164   10.3   18.9   -0.31066   -2.55956
  9   449   12.9   36.6   -0.29452   -2.42499
 10   378   12.2   31.9   -0.27215   -2.23909
 11   403   12.3   32.7   -0.27076   -2.22763
 12   199   11.2   25.2   -0.26293   -2.16278
 13   375   12.2   32.3   -0.25968   -2.13584
 14   231   11.9   30.1   -0.25891   -2.12937
 15   434   12.7   36.3   -0.25799   -2.12193
 16   536   14.6   54.2   -0.25649   -2.11078
 17   338   11.5   27.7   -0.24406   -2.00650
 . . .
 18   637   14.4   87.7    0.26426    2.17499
 19   229   11.8   50.0    0.27276    2.24417
 20   586   12.6   60.5    0.27548    2.26685
 21   139    9.8   33.7    0.41022    3.39337

How many are expected to exceed 2?
About 5%, so for 658 observations that would be about 33 obs (actually for "2" alpha is 0.0459, so we expect 31).
Using a Bonferroni adjustment, the probability is 0.05 / 658 = 0.0000759878, and the t value is 3.982.
So for 658 tests we would reject when |t| > 3.982, with a 5% chance of error overall for all tests together, jointly.
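The outlier arithmetic above can be reproduced with a short sketch, written in Python rather than SAS. One substitution is made and should be kept in mind: the critical value is taken from the standard normal distribution (available in the Python standard library) instead of the t distribution used in the notes, so it comes out slightly below 3.982. The list holds the absolute values of the five largest studentized residuals from the listing.

```python
from statistics import NormalDist

n_tests = 658
alpha_family = 0.05

expected_beyond_2 = n_tests * alpha_family   # rough count expected beyond |2| by chance
alpha_each = alpha_family / n_tests          # Bonferroni per-test probability

# Normal approximation to the two-sided critical value (the notes use t and get 3.982)
z_crit = NormalDist().inv_cdf(1.0 - alpha_each / 2.0)

# Absolute values of the most extreme studentized residuals in the listing
abs_rstudent = [5.22958, 4.34270, 3.71563, 3.60093, 3.39337]
flagged = [r for r in abs_rstudent if r > 3.982]

print(round(expected_beyond_2, 1))
print(alpha_each)
print(round(z_crit, 3))
print(flagged)
```

Only the two largest residuals exceed the Bonferroni cutoff, whereas 21 exceed 2 in absolute value, fewer than the roughly 31 to 33 expected by chance alone.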
OPTIONS PS=512 LS=111;
62
PROC REG DATA=FLIER;
63
TITLE2 'Length  weight relationship (Power alternative)';
64
MODEL LWT = LLT;
65
restrict lLt = 3;
66
RUN;
67
NOTE: The PROCEDURE REG printed page 14.
NOTE: PROCEDURE REG used (Total process time):
      real time           2.67 seconds
      cpu time            1.40 seconds

North Carolina Flier sunfish data
Length - weight relationship (Power alternative)

The REG Procedure
Model: MODEL1
Dependent Variable: LWT

NOTE: Restrictions have been applied to parameter estimates.

Number of Observations Read    658
Number of Observations Used    658

Analysis of Variance
                           Sum of        Mean
Source              DF    Squares      Square    F Value    Pr > F
Model                0  277.73835           .          .         .
Error              657   10.38833     0.01581
Corrected Total    657  288.12668

Root MSE          0.12574    R-Square    0.9639
Dependent Mean    3.63850    Adj R-Sq    0.9639
Coeff Var         3.45596

Parameter Estimates
                  Parameter    Standard
Variable     DF    Estimate       Error    t Value    Pr > |t|
Intercept     1    -3.76500     0.00490    -768.05      <.0001
LLT           1     3.00000           0      Infty      <.0001
RESTRICT      1     4.59734     0.73240       6.28      <.0001*

* Probability computed using beta distribution.

exp(-3.765) = 0.023167612, or 2.3%

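With the slope restricted to 3, the fitted line LWT = -3.765 + 3 LLT back-transforms to Wt = exp(-3.765) * Lt**3, which is where the 0.0232 constant comes from. The following is a minimal sketch of that back-transformation, assuming LWT and LLT are natural logs (as the exp() in the notes implies); the lengths are illustrative values, not data from the study.

```sas
/* Sketch only: lengths below are illustrative, not from the data set */
data _null_;
   b0 = -3.76500;              /* intercept of the restricted model (log scale) */
   a  = exp(b0);               /* multiplicative constant, about 0.0232         */
   put a=10.8;
   do Lt = 5, 10, 15;
      Wt_hat = a * Lt**3;      /* predicted weight under the cubic power model  */
      put Lt= Wt_hat=8.2;
   end;
run;
```

This ignores any correction for log-retransformation bias, so it is a rough check on the scale of the predictions rather than a finished estimator.
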
PROC MIXED DATA=FLIER; TITLE2 'Length - weight relationship (ANCOVA tests)';
   class ar st;
   MODEL LWT = LLT Ar LLt*Ar / solution cl;
   random st(ar) llt*St(Ar);
RUN;
NOTE: Convergence criteria met.
NOTE: The PROCEDURE MIXED printed page 15.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.21 seconds
      cpu time            0.09 seconds

North Carolina Flier sunfish data
Length - weight relationship (ANCOVA tests)

The Mixed Procedure

Model Information
Data Set                     WORK.FLIER
Dependent Variable           LWT
Covariance Structure         Variance Components
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information
Class    Levels    Values
Ar            3    1 2 3
St            7    1 2 3 4 5 6 7

Dimensions
Covariance Parameters      3
Columns in X               8
Columns in Z              28
Subjects                   1
Max Obs Per Subject      658

Number of Observations
Number of Observations Read        658
Number of Observations Used        658
Number of Observations Not Used      0

Iteration History
Iteration    Evaluations    -2 Res Log Like      Criterion
        0              1      -916.40341166
        1              3     -1083.64456426              .
        2              2     -1084.86561114     0.00024548
        3              2     -1085.19148930     0.00002307
        4              1     -1085.21960298     0.00000030
        5              1     -1085.21994476     0.00000000

Convergence criteria met.

Covariance Parameter Estimates
Cov Parm       Estimate
St(Ar)         0.002470
LLT*St(Ar)     0.000486
Residual        0.01031

Fit Statistics
-2 Res Log Likelihood        -1085.2
AIC (smaller is better)      -1079.2
AICC (smaller is better)     -1079.2
BIC (smaller is better)      -1077.3

Solution for Fixed Effects
                            Standard
Effect      Ar   Estimate      Error   DF   t Value   Pr > |t|   Alpha     Lower     Upper
Intercept         -3.5798     0.2087   11    -17.15     <.0001    0.05   -4.0392   -3.1205
LLT                2.9447    0.08489    9     34.69     <.0001    0.05    2.7527    3.1368
Ar          1     -0.2123     0.2417   11     -0.88     0.3984    0.05   -0.7442    0.3196
Ar          2      0.2551     0.2178   11      1.17     0.2662    0.05   -0.2243    0.7345
Ar          3           0          .    .         .          .       .         .         .
LLT*Ar      1     0.04245    0.09758    9      0.44     0.6738    0.05   -0.1783    0.2632
LLT*Ar      2     -0.1222    0.08870    9     -1.38     0.2017    0.05   -0.3228   0.07848
LLT*Ar      3           0          .    .         .          .       .         .         .

Type 3 Tests of Fixed Effects
           Num   Den
Effect      DF    DF   F Value   Pr > F
LLT          1     9   7526.99   <.0001
Ar           2    11      6.10   0.0165
LLT*Ar       2     9      5.03   0.0342

PROC MIXED DATA=FLIER; TITLE2 'Length - weight relationship (ANCOVA estimates)';
   class ar st;
   MODEL LWT = Ar LLt*Ar / solution cl noint;
   random st(ar) llt*St(Ar);
RUN;
NOTE: Convergence criteria met.
NOTE: The PROCEDURE MIXED printed page 16.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.25 seconds
      cpu time            0.06 seconds

North Carolina Flier sunfish data
Length - weight relationship (ANCOVA estimates)

The Mixed Procedure

Model Information
Data Set                     WORK.FLIER
Dependent Variable           LWT
Covariance Structure         Variance Components
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information
Class    Levels    Values
Ar            3    1 2 3
St            7    1 2 3 4 5 6 7

Dimensions
Covariance Parameters      3
Columns in X               6
Columns in Z              28
Subjects                   1
Max Obs Per Subject      658

Number of Observations
Number of Observations Read        658
Number of Observations Used        658
Number of Observations Not Used      0

Iteration History
Iteration    Evaluations    -2 Res Log Like      Criterion
        0              1      -916.40341165
        1              3     -1083.64456425              .
        2              2     -1084.86561114     0.00024548
        3              2     -1085.19148930     0.00002307
        4              1     -1085.21960297     0.00000030
        5              1     -1085.21994475     0.00000000

Convergence criteria met.

Covariance Parameter Estimates
Cov Parm       Estimate
St(Ar)         0.002470
LLT*St(Ar)     0.000486
Residual        0.01031

Fit Statistics
-2 Res Log Likelihood        -1085.2
AIC (smaller is better)      -1079.2
AICC (smaller is better)     -1079.2
BIC (smaller is better)      -1077.3

Solution for Fixed Effects
                         Standard
Effect    Ar   Estimate     Error   DF   t Value   Pr > |t|   Alpha     Lower     Upper
Ar        1     -3.7922    0.1218   11    -31.12     <.0001    0.05   -4.0603   -3.5240
Ar        2     -3.3247   0.06236   11    -53.32     <.0001    0.05   -3.4620   -3.1875
Ar        3     -3.5798    0.2087   11    -17.15     <.0001    0.05   -4.0392   -3.1205
LLT*Ar    1      2.9872   0.04811    9     62.09     <.0001    0.05    2.8783    3.0960
LLT*Ar    2      2.8225   0.02571    9    109.79     <.0001    0.05    2.7644    2.8807
LLT*Ar    3      2.9447   0.08489    9     34.69     <.0001    0.05    2.7527    3.1368

Type 3 Tests of Fixed Effects
           Num   Den
Effect      DF    DF   F Value   Pr > F
Ar           3    11   1368.45   <.0001
LLT*Ar       3     9   5704.44   <.0001

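The two ANCOVA runs describe the same fit in different parameterizations: each NOINT estimate is the reference-area (Ar 3) intercept or slope from the first run plus that area's offset. The arithmetic can be sketched in a DATA step as below. The estimates are copied from the two Solution tables, the minus signs are those implied by the confidence limits (an assumption, since the extracted listing does not show them), and the variable names are illustrative.

```sas
/* Sketch only: values copied from the Solution tables, signs assumed */
data _null_;
   int3  = -3.5798;   slp3  =  2.9447;   /* Ar 3 (reference) intercept, slope */
   dint1 = -0.2123;   dint2 =  0.2551;   /* Ar intercept offsets              */
   dslp1 =  0.04245;  dslp2 = -0.1222;   /* LLT*Ar slope offsets              */
   int1 = int3 + dint1;   slp1 = slp3 + dslp1;   /* area 1 line */
   int2 = int3 + dint2;   slp2 = slp3 + dslp2;   /* area 2 line */
   put int1=8.4 slp1=8.4;
   put int2=8.4 slp2=8.4;
run;
```

Up to rounding, these sums should agree with the NOINT estimates for areas 1 and 2 (-3.7922 and 2.9872, -3.3247 and 2.8225).
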
PROC MIXED DATA=FLIER; TITLE2 'Length-weight (reduced ANCOVA estimates)';
   class ar st;
   MODEL LWT = Ar LLt / solution cl noint;
   random st(ar) llt*St(Ar);
RUN;
NOTE: Convergence criteria met.
NOTE: The PROCEDURE MIXED printed page 17.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.29 seconds
      cpu time            0.15 seconds

NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           18.29 seconds
      cpu time            11.62 seconds

North Carolina Flier sunfish data
Length - weight relationship (reduced ANCOVA estimates)

The Mixed Procedure

Model Information
Data Set                     WORK.FLIER
Dependent Variable           LWT
Covariance Structure         Variance Components
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information
Class    Levels    Values
Ar            3    1 2 3
St            7    1 2 3 4 5 6 7

Dimensions
Covariance Parameters      3
Columns in X               4
Columns in Z              28
Subjects                   1
Max Obs Per Subject      658

Number of Observations
Number of Observations Read        658
Number of Observations Used        658
Number of Observations Not Used      0

Iteration History
Iteration    Evaluations    -2 Res Log Like      Criterion
        0              1      -903.64238448
        1              3     -1081.25456018     0.00072261
        2              2     -1082.23502395     0.00012736
        3              1     -1082.40237340     0.00001072
        4              1     -1082.41545464     0.00000014
        5              1     -1082.41561757     0.00000000

Convergence criteria met.

Covariance Parameter Estimates
Cov Parm       Estimate
St(Ar)         0.002984
LLT*St(Ar)     0.000674
Residual        0.01039

Fit Statistics
-2 Res Log Likelihood        -1082.4
AIC (smaller is better)      -1076.4
AICC (smaller is better)     -1076.4
BIC (smaller is better)      -1074.5

Solution for Fixed Effects
                         Standard
Effect    Ar   Estimate     Error   DF   t Value   Pr > |t|   Alpha     Lower     Upper
Ar        1     -3.5154   0.07155   11    -49.13     <.0001    0.05   -3.6729   -3.3579
Ar        2     -3.4152   0.05723   11    -59.68     <.0001    0.05   -3.5412   -3.2893
Ar        3     -3.3933   0.06972   11    -48.67     <.0001    0.05   -3.5467   -3.2398
LLT              2.8672   0.02266   11    126.54     <.0001    0.05    2.8173    2.9170

Type 3 Tests of Fixed Effects
           Num   Den
Effect      DF    DF   F Value   Pr > F
Ar           3    11   1320.50   <.0001
LLT          1    11   16012.6   <.0001

Plot of Resid*Lt. Symbol is value of Age.
[Character plot not reproduced: Resid on the vertical axis (about -0.6 to 0.4) against Lt on the horizontal axis (2 to 18); each point is printed as the fish's Age.]
NOTE: 290 obs hidden.
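
Returning to the reduced ANCOVA estimates (separate intercepts by area, common slope), predicted weights on the original scale follow from exponentiating the fitted log-scale line. The step below is a minimal sketch under stated assumptions: natural logs, intercepts entered with the negative signs implied by their confidence limits, illustrative lengths, and a hypothetical data set name PRED.

```sas
/* Sketch only: estimates copied from the reduced fit, lengths illustrative */
data pred;
   array b0[3] _temporary_ (-3.5154, -3.4152, -3.3933);  /* intercepts, Ar 1-3 */
   slope = 2.8672;                                /* common slope on log length */
   do Ar = 1 to 3;
      do Lt = 8, 12, 16;
         Wt_hat = exp(b0[Ar] + slope * log(Lt));  /* back-transformed weight    */
         output;
      end;
   end;
run;

proc print data=pred noobs;
   var Ar Lt Wt_hat;
run;
```

The predictions use the fixed effects only; they set the random station effects to zero and apply no correction for log-retransformation bias.
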
                     Histogram                       #
 0.425+*                                             1
      .
      .*                                             1
      .
      .**                                            7
      .*******                                      25
      .****************                             62
      .*************************                   100
      .*************************************       147
      .*********************************           130
      .************************                     93
      .**************                               54
      .******                                       21
      .**                                            8
      .**                                            6
      .
      .
      .
      .
-0.525+*                                             3
       ----+----+----+----+----+----+----+--
       * may represent up to 4 counts

[Boxplot column not recoverable from the extracted text.]

Tests for Normality
Test            Statistic         p Value
Shapiro-Wilk    W     0.966697    Pr < W    <0.0001
...
 Spring '08
 Geaghan,J
