EXST7015 Fall2011 Appendix 11


Statistical Techniques II
Analysis of Covariance
Appendix 11 SAS Example

dm'log;clear;output;clear';
/*
*---+----1----+----2----+----3----+----4----+----5----+----6----+;
Data Set source: © 1996 The Data and Story Library.
Story Names: Forbes 500 Companies Sales
Reference: Forbes, 1986
Authorization: free use
Description: Facts about companies selected from the Forbes 500
list for 1986. This is a 1/10 systematic sample from the
alphabetical list of companies. The Forbes 500 includes all
companies in the top 500 on any of the criteria, and thus
has almost 800 companies in the list.
*---+----1----+----2----+----3----+----4----+----5----+----6----+8;
*/
options nodate nocenter nonumber ps=512 ls=99 nolabel;
ODS HTML style=minimal rs=none
    body='C:\SAS\14s-AnCova-Forbes.html' ;
NOTE: Writing HTML Body file: C:\SAS\14s-AnCova-Forbes.html
filename input 'C:\SAS\Forbes.csv';

options nocenter nodate nonumber ls=80 ps=256;
data forbes; length company $ 32 sector $ 15;
    title1 'Forbes companies assets and sales';
    infile input firstobs=2 dlm=',' dsd;
    input Company $ Assets Sales Market_Value Profits Cash_Flow
          Employees Sector $;
    LAssets = log(assets);
    LSales = log(sales);
run;
NOTE: The infile INPUT is:
      File Name=C:\SAS\Forbes.csv,
      RECFM=V,LRECL=256
NOTE: 79 records were read from the infile INPUT.
      The minimum record length was 39.
      The maximum record length was 72.
NOTE: The data set WORK.FORBES has 79 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

options ls=111 ps=111;
proc print data=forbes; run;
NOTE: There were 79 observations read from the data set WORK.FORBES.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.15 seconds
      cpu time 0.07 seconds

Forbes companies assets and sales

Obs  company                          sector          Assets   Sales  Market_Value  Profits  Cash_Flow  Employees  LAssets   LSales
  1  Air Products                     Other             2687    1870          1890    145.7      352.2       18.2   7.8962   7.5337
  2  Allied Signal                    Other            13271    9115          8190   -279.0       83.0      143.8   9.4933   9.1177
  3  American Electric Power          Energy           13621    4848          4572    485.0      898.9       23.4   9.5194   8.4863
  4  American Savings Bank FSB        Finance           3614     367            90     14.1       24.6        1.1   8.1926   5.9054
  5  AMR                              Transportation    6425    6131          2448    345.8      682.5       49.5   8.7680   8.7211
  6  Apple Computer                   HiTech            1022    1754          1370     72.0      119.5        4.8   6.9295   7.4697
  7  Armstrong World Industries       Manufacturing     1093    1679          1070    100.9      164.5       20.8   6.9967   7.4260
  8  Bally Manufacturing              Other             1529    1295           444     25.6      137.0       19.4   7.3324   7.1663
  9  Bank South                       Finance           2788     271           304     23.5       28.9        2.1   7.9331   5.6021
 10  Bell Atlantic                    Communication    19788    9084         10636   1092.9     2576.8       79.4   9.8928   9.1143
 11  H&R Block                        Finance            327     542           959     54.1       72.5        2.8   5.7900   6.2953
 12  Brooklyn Union Gas               Energy            1117    1038           478     59.7       91.7        3.8   7.0184   6.9451
 13  California First Bank            Finance           5401     550           376     25.6       37.5        4.1   8.5943   6.3099
 14  CBI Industries                   Manufacturing     1128    1516           430    -47.0       26.7       13.2   7.0282   7.3238
 15  Central Illinois Public Service  Energy            1633     701           679     74.3      135.9        2.8   7.3982   6.5525
 16  Cigna                            Finance          44736   16197          4653   -732.5     -651.9       48.5  10.7085   9.6926
 17  Cleveland Electric Illuminating  Energy            5651    1254          2002    310.7      407.9        6.2   8.6396   7.1341
 18  Columbia Gas System              Energy            5835    4053          1601    -93.8      173.8       10.8   8.6716   8.3072
 19  Community Psychiatric Centers    Medical            278     205           853     44.8       50.5        3.8   5.6276   5.3230
 20  Continental Telecom              Communication     5074    2557          1892    239.9      578.3       21.9   8.5319   7.8466
 21  Crown Cork & Seal                Other              866    1487           944     71.7      115.4       12.6   6.7639   7.3045
 22  Dayton-Hudson                    Retail            4418    8793          4459    283.6      456.5      128.0   8.3934   9.0817
 23  Digital Equipment                HiTech            6914    7029          7957    400.6      754.7       87.3   8.8413   8.8578
 24  Dillard Department Stores        Retail             862    1601          1093     66.9      106.8       16.0   6.7593   7.3784
 25  Dreyfus                          Finance            401     176          1084     55.6       57.0        0.7   5.9940   5.1705
 26  Eg&G                             HiTech             430    1155          1045     55.7       70.8       22.5   6.0638   7.0519
 27  Ex-Cell-O                        Other              799    1140           683     57.6       89.2       15.4   6.6834   7.0388
 28  First American                   Finance           4789     453           367     40.2       51.4        3.0   8.4741   6.1159
 29  First Empire State               Finance           2548     264           181     22.2       26.2        2.1   7.8431   5.5759
 30  First Tennessee National         Finance           5249     527           346     37.8       56.2        4.1   8.5658   6.2672
 31  Florida Progress                 Energy            3494    1653          1442    160.9      320.3        6.4   8.1588   7.4103
 32  Fruehauf                         Manufacturing     1804    2564           483     70.5      164.9       26.6   7.4978   7.8493
 33  General Electric                 HiTech           26432   28285         33172   2336.0     3562.0      304.0  10.1823  10.2501
 34  Giant Food                       Retail             623    2247           797     57.0       93.8       18.6   6.4345   7.7174
 35  Great A&P Tea                    Retail            1608    6615           829     56.1      134.0       65.0   7.3827   8.7971
 36  Halliburton                      Manufacturing     4662    4781          2988     28.7      371.5       66.2   8.4472   8.4724
 37  Hewlett-Packard                  HiTech            5769    6571          9462    482.0      792.0       83.0   8.6603   8.7904
 38  Hospital Corp of America         Medical           6259    4152          3090    283.7      524.5       62.0   8.7418   8.3313
 39  Idaho Power                      Energy            1654     451           779     84.8      130.4        1.6   7.4110   6.1115
 40  IBM                              HiTech           52634   50056         95697   6555.0     9874.0      400.2  10.8711  10.8209
 41  IU International                 Transportation     999    1878           393   -173.5     -108.1       23.3   6.9068   7.5380
 42  Kansas Power & Light             Energy            1679    1354           687     93.8      154.6        4.6   7.4260   7.2108
 43  Kroger                           Retail            4178   17124          2091    180.8      390.4      164.6   8.3376   9.7482
 44  Liz Claiborne                    Other              223     557          1040     60.6       63.7        1.9   5.4072   6.3226
 45  LTV                              Manufacturing     6307    8199           598   -771.5     -524.3       57.5   8.7494   9.0118
 46  Marine Corp                      Finance           3720     356           211     26.6       34.8        2.4   8.2215   5.8749
 47  May Department Stores            Retail            3442    5080          2673    235.4      361.5       77.3   8.1438   8.5331
 48  Mellon Bank                      Finance          33406    3222          1413    201.7      246.7       15.8  10.4165   8.0778
 49  Mesa Petroleum                   Energy            1257     355           181    167.5      304.0        0.6   7.1365   5.8721
 50  Montana Power                    Energy            1743     597           717    121.6      172.4        3.5   7.4634   6.3919
 51  National City                    Finance          12505    1302           702    108.4      131.4        9.0   9.4339   7.1717
 52  NCR                              HiTech            3940    4317          3940    315.2      566.3       62.0   8.2789   8.3703
 53  Norstar Bancorp                  Finance           8998     882           988     93.0      119.0        7.4   9.1048   6.7822
 54  Norwest                          Finance          21419    2516           930    107.6      164.7       15.6   9.9720   7.8304
 55  Owens-Corning Fiberglas          Manufacturing     2366    3305          1117    131.2      256.5       25.2   7.7690   8.1032
 56  Pan Am                           Transportation    2448    3484          1036     48.8      257.1       25.4   7.8030   8.1559
 57  Peoples Energy                   Energy            1440    1617           639     81.7      126.4        3.5   7.2724   7.3883
 58  Phillips Petroleum               Energy           14045   15636          2754    418.0     1462.0       27.3   9.5500   9.6573
 59  PPG Industries                   Manufacturing     4084    4346          3023    302.7      521.7       37.5   8.3148   8.3770
 60  Public Service Co of New Mexico  Energy            3010     749          1120    146.3      209.2        3.4   8.0097   6.6187
 61  Republic Airlines                Transportation    1286    1734           361     69.2      145.7       14.3   7.1593   7.4582
 62  AH Robins                        Medical            707     706           275     61.4       77.8        6.1   6.5610   6.5596
 63  San Diego Gas & Electric         Energy            3086    1739          1507    202.7      335.2        4.9   8.0346   7.4611
 64  Shared Medical Systems           Medical            252     312           883     41.7       60.6        3.3   5.5294   5.7430
 65  Southeast Banking                Finance          11052    1097           606     64.9       97.6        7.0   9.3104   7.0003
 66  Sovran Financial                 Finance           9672    1037           829     92.6      118.2        8.2   9.1770   6.9441
 67  Stop & Shop Cos                  Retail            1112    3689           542     30.3       96.9       43.5   7.0139   8.2131
 68  Supermarkets General             Retail            1104    5123           910     63.7      133.3       48.5   7.0067   8.5415
 69  Telex                            HiTech             478     672           866     67.1      101.6        5.4   6.1696   6.5103
 70  Textron                          Manufacturing    10348    5721          1915    223.6      322.5       49.5   9.2445   8.6519
 71  TWA                              Transportation    2769    3725           663   -208.4       12.4       29.1   7.9262   8.2228
 72  Turner                           Manufacturing      752    2149           101     11.1       15.2        2.6   6.6227   7.6728
 73  United Financial Group           Finance           4989     518            53     -3.1       -0.3        0.8   8.5150   6.2500
 74  United Technologies              Manufacturing    10528   14992          5377    312.7      710.7      184.8   9.2618   9.6153
 75  Valero Energy                    Energy            1995    2662           341     34.7      100.7        2.3   7.5984   7.8868
 76  Warner Communications            Other             2286    2235          2306    195.3      219.0        8.0   7.7346   7.7120
 77  Western Air Lines                Transportation     952    1307           309     35.4       92.8       10.3   6.8586   7.1755
 78  Wickes Cos                       Retail            2957    2806           457     40.6       93.5       50.0   7.9919   7.9395
 79  FW Woolworth                     Retail            2535    5958          1921    177.0      288.0      118.1   7.8379   8.6925

James P. Geaghan - Copyright 2011

options ls=111 ps=56;
proc plot data=forbes; plot Sales*Assets=sector;
    title2 'Scatterplot of raw data'; run;
NOTE: There were 79 observations read from the data set WORK.FORBES.
NOTE: The PROCEDURE PLOT printed page 2.
NOTE: PROCEDURE PLOT used (Total process time):
      real time 0.03 seconds
      cpu time 0.00 seconds

Forbes companies assets and sales
Scatterplot of raw data

[Line-printer plot of Sales*Assets: Sales (vertical, 0 to 50000) against Assets (horizontal, 0 to 60000). Symbol is value of sector. NOTE: 32 obs hidden.]

Examine the scatter plots. The untransformed data show little pattern and no clear trend. Taking logarithms of both the dependent and the independent variable produces a plot with a clear increasing trend, and it also appears that some sectors sit at higher or lower levels than others. We want to know whether the apparent trends are significant and whether the same trend holds for each sector. The log-transformed values will be used.

proc plot data=forbes; plot LSales*LAssets=sector;
    title2 'Scatterplot of log transformed data'; run;
options ls=80 ps=256;
NOTE: There were 79 observations read from the data set WORK.FORBES.
NOTE: The PROCEDURE PLOT printed page 3.
NOTE: PROCEDURE PLOT used (Total process time):
      real time 0.04 seconds
      cpu time 0.00 seconds

Forbes companies assets and sales
Scatterplot of log transformed data

[Line-printer plot of LSales*LAssets: LSales (vertical, 5 to 11) against LAssets (horizontal, 5 to 11). Symbol is value of sector. NOTE: 4 obs hidden.]

The model will be fitted in PROC MIXED because we want Type 1 (sequential) tests rather than Type 2 or Type 3 tests.

proc mixed data=forbes; classes sector;
    title2 'Basic Analysis of Covariance using PROC MIXED';
    title3 'Model for separate slopes';
    model LSales = LAssets sector LAssets*sector
          / htype=1 3 solution outp=resids1;
run;
NOTE: The data set WORK.RESIDS1 has 79 observations and 17 variables.
NOTE: The PROCEDURE MIXED printed page 4.
NOTE: PROCEDURE MIXED used (Total process time):
      real time 0.12 seconds
      cpu time 0.09 seconds

The Mixed Procedure

Model Information
Data Set                     WORK.FORBES
Dependent Variable           LSales
Covariance Structure         Diagonal
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Residual

Class Level Information
Class     Levels    Values
sector         9    Communication Energy Finance HiTech Manufacturing Medical Other Retail Transportation

Dimensions
Covariance Parameters     1
Columns in X             20
Columns in Z              0
Subjects                  1
Max Obs Per Subject      79

Number of Observations
Number of Observations Read        79
Number of Observations Used        79
Number of Observations Not Used     0

Covariance Parameter Estimates
Cov Parm    Estimate
Residual      0.2574

Fit Statistics
-2 Res Log Likelihood        125.7
AIC (smaller is better)      127.7
AICC (smaller is better)     127.8
BIC (smaller is better)      129.8

Solution for Fixed Effects
                                               Standard
Effect          sector          Estimate          Error    DF    t Value    Pr > |t|
Intercept                         2.0703         2.3335    61       0.89      0.3784
LAssets                           0.7672         0.3070    61       2.50      0.0152
sector          Communication    -2.1709         5.4006    61      -0.40      0.6891
sector          Energy           -2.4870         2.6861    61      -0.93      0.3582
sector          Finance          -1.2677         2.4808    61      -0.51      0.6112
sector          HiTech           -0.4847         2.5049    61      -0.19      0.8472
sector          Manufacturing     0.6026         2.7364    61       0.22      0.8264
sector          Medical          -1.3999         2.6826    61      -0.52      0.6037
sector          Other             0.5938         2.6311    61       0.23      0.8222
sector          Retail            1.0745         2.9597    61       0.36      0.7178
sector          Transportation         0              .     .        .         .
LAssets*sector  Communication     0.1642         0.6101    61       0.27      0.7887
LAssets*sector  Energy            0.2024         0.3492    61       0.58      0.5644
LAssets*sector  Finance         -0.08879         0.3219    61      -0.28      0.7836
LAssets*sector  HiTech           0.07274         0.3255    61       0.22      0.8239
LAssets*sector  Manufacturing   -0.06947         0.3547    61      -0.20      0.8454
LAssets*sector  Medical           0.1124         0.3644    61       0.31      0.7588
LAssets*sector  Other            -0.1134         0.3480    61      -0.33      0.7455
LAssets*sector  Retail          -0.06083         0.3902    61      -0.16      0.8766
LAssets*sector  Transportation         0              .     .        .         .

We will start with the full model results and see whether we need separate slopes (test MSX1*X2 | X1 X2). Here X1 represents the quantitative covariable and X2 represents the sector class variable. Note that in this case we have 9 sectors, so we actually need 8 variables to represent them. PROC MIXED sets these up when we place sector in the CLASS statement.

The first test is of MSX1*X2 | X1 X2. If it is not significant, we do not need separate slopes. We then want to know whether we need separate intercepts (test MSX2 | X1). If not, we reduce to a simple linear regression (MSX1). These tests are given by the Type 1 tests in SAS, examined in reverse order.

Type 1 Tests of Fixed Effects
                  Num    Den
Effect             DF     DF    F Value    Pr > F
LAssets             1     61     148.84    <.0001
sector              8     61      28.16    <.0001
LAssets*sector      8     61       0.49    0.8611

Note the 8 d.f. for sector and for the sector interaction: with 9 sectors there is one reference intercept (or slope) plus 8 adjustments. Also note that the sector interaction is not significant and could be removed from the model. This is equivalent to testing the extra SS for the interaction, except that PROC MIXED does not actually calculate or use sums of squares.

So we do not need a separate slope for each sector (fitted by the LAssets*sector interaction). Do we need to consider sectors at all to give us separate intercepts? Because we used Type 1 tests, SECTOR fitted after LASSETS fits separate intercepts and provides a test of the need for them. The test of sector differences is highly significant. We also see a strongly significant slope relating the logarithm of Sales to the logarithm of Assets, so there is a need for both a linear trend (after taking logs) and separate intercepts. This would be our final model, and we would want to refit the model to get the parameter estimates. This is done below.
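
The equivalence with an extra-SS test can be checked by hand. As a rough sketch, take the error sums of squares from the "Corrected SS" that PROC UNIVARIATE reports for the residuals of each model: 15.7039 for the separate-slopes model (61 error d.f.) and 16.7059 for the common-slopes model (69 error d.f.). Treating those values as the error SS is an assumption of this sketch; it works here because the residuals have mean zero. The subscripts "common" and "separate" are labels introduced only for this illustration.

```latex
F = \frac{\left( SSE_{\text{common}} - SSE_{\text{separate}} \right) / (69 - 61)}
         {SSE_{\text{separate}} / 61}
  = \frac{(16.7059 - 15.7039) / 8}{15.7039 / 61}
  = \frac{0.12525}{0.25744}
  \approx 0.49
```

This agrees with the F Value of 0.49 on 8 and 61 d.f. reported for LAssets*sector, and the denominator 0.2574 is the Residual estimate from the separate-slopes fit.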
Type 3 Tests of Fixed Effects
                  Num    Den
Effect             DF     DF    F Value    Pr > F
LAssets             1     61      88.12    <.0001
sector              8     61       0.71    0.6812
LAssets*sector      8     61       0.49    0.8611

PROC UNIVARIATE DATA=Resids1 PLOT NORMAL; VAR resid; RUN; QUIT;
NOTE: The PROCEDURE UNIVARIATE printed page 5.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time 0.06 seconds
      cpu time 0.01 seconds

Forbes companies assets and sales
Basic Analysis of Covariance using PROC MIXED
Model for separate slopes

The UNIVARIATE Procedure
Variable: Resid

Moments
N                          79    Sum Weights                79
Mean                        0    Sum Observations            0
Std Deviation      0.44870044    Variance           0.20133208
Skewness            1.1058385    Kurtosis           2.79918241
Uncorrected SS     15.7039024    Corrected SS       15.7039024
Coeff Variation             .    Std Error Mean     0.05048274

Basic Statistical Measures
    Location                    Variability
Mean      0.00000     Std Deviation            0.44870
Median   -0.02873     Variance                 0.20133
Mode      .           Range                    2.47562
                      Interquartile Range      0.50154

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t       0      Pr > |t|     1.0000
Sign           M    -2.5      Pr >= |M|    0.6530
Signed Rank    S    -135      Pr >= |S|    0.5129

Tests for Normality
Test                  --Statistic---    -----p Value------
Shapiro-Wilk          W       0.933133    Pr < W       0.0005
Kolmogorov-Smirnov    D       0.113264    Pr > D       0.0135
Cramer-von Mises      W-Sq    0.167842    Pr > W-Sq    0.0145
Anderson-Darling      A-Sq    1.074201    Pr > A-Sq    0.0080

Quantiles (Definition 5)
Quantile        Estimate
100% Max       1.6247101
99%            1.6247101
95%            0.8140317
90%            0.4797469
75% Q3         0.2086435
50% Median    -0.0287338
25% Q1        -0.2928941
10%           -0.5412676
5%            -0.6577250
1%            -0.8509100
0% Min        -0.8509100

Extreme Observations
------Lowest------      ------Highest-----
    Value    Obs            Value    Obs
-0.850910     78         0.753481     57
-0.826421     17         0.814032     58
-0.731013     60         0.935886     75
-0.657725     39         1.564408     11
-0.630941     49         1.624710     16

[Stem-and-leaf display, boxplot, and normal probability plot of Resid for the separate-slopes model. Multiply Stem.Leaf by 10**-1.]

proc mixed data=forbes; classes sector;
    title2 'Basic Analysis of Covariance using PROC MIXED';
    title3 'Model for common slopes';
    model LSales = LAssets sector / htype=1 3 solution outp=resids2;
run;
NOTE: The data set WORK.RESIDS2 has 79 observations and 17 variables.
NOTE: The PROCEDURE MIXED printed page 6.
NOTE: PROCEDURE MIXED used (Total process time):
      real time 0.09 seconds
      cpu time 0.07 seconds

Forbes companies assets and sales
Basic Analysis of Covariance using PROC MIXED
Model for common slopes

The Mixed Procedure

Model Information
Data Set                     WORK.FORBES
Dependent Variable           LSales
Covariance Structure         Diagonal
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Residual

Class Level Information
Class     Levels    Values
sector         9    Communication Energy Finance HiTech Manufacturing Medical Other Retail Transportation

Dimensions
Covariance Parameters     1
Columns in X             11
Columns in Z              0
Subjects                  1
Max Obs Per Subject      79

Number of Observations
Number of Observations Read        79
Number of Observations Used        79
Number of Observations Not Used     0

Covariance Parameter Estimates
Cov Parm    Estimate
Residual      0.2421

Fit Statistics
-2 Res Log Likelihood        120.5
AIC (smaller is better)      122.5
AICC (smaller is better)     122.6
BIC (smaller is better)      124.7

Solution for Fixed Effects
                                          Standard
Effect       sector          Estimate        Error    DF    t Value    Pr > |t|
Intercept                      2.0688       0.4386    69       4.72      <.0001
LAssets                        0.7674      0.05151    69      14.90      <.0001
sector       Communication    -0.6583       0.4106    69      -1.60      0.1134
sector       Energy           -0.8773       0.2385    69      -3.68      0.0005
sector       Finance          -2.0317       0.2396    69      -8.48      <.0001
sector       HiTech            0.1152       0.2680    69       0.43      0.6686
sector       Manufacturing    0.04720       0.2550    69       0.19      0.8537
sector       Medical          -0.6562       0.3214    69      -2.04      0.0450
sector       Other            -0.2378       0.2740    69      -0.87      0.3886
sector       Retail            0.6164       0.2541    69       2.43      0.0179
sector       Transportation         0            .     .        .         .

Type 1 Tests of Fixed Effects
            Num    Den
Effect       DF     DF    F Value    Pr > F
LAssets       1     69     158.26    <.0001
sector        8     69      29.95    <.0001

Type 3 Tests of Fixed Effects
            Num    Den
Effect       DF     DF    F Value    Pr > F
LAssets       1     69     221.98    <.0001
sector        8     69      29.95    <.0001

PROC UNIVARIATE DATA=Resids2 PLOT NORMAL; VAR resid; RUN; QUIT;
NOTE: The PROCEDURE UNIVARIATE printed page 7.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time 0.07 seconds
      cpu time 0.03 seconds

Forbes companies assets and sales
Basic Analysis of Covariance using PROC MIXED
Model for common slopes

The UNIVARIATE Procedure
Variable: Resid

Moments
N                          79    Sum Weights                79
Mean                        0    Sum Observations            0
Std Deviation      0.46279456    Variance           0.21417881
Skewness           1.15266833    Kurtosis           3.07219892
Uncorrected SS     16.7059468    Corrected SS       16.7059468
Coeff Variation             .    Std Error Mean     0.05206846

Basic Statistical Measures
    Location                    Variability
Mean      0.00000     Std Deviation            0.46279
Median   -0.03989     Variance                 0.21418
Mode      .           Range                    2.69383
                      Interquartile Range      0.53176

Tests for Location: Mu0=0
Test           -Statistic-     -----p Value------
Student's t    t        0      Pr > |t|     1.0000
Sign           M     -4.5      Pr >= |M|    0.3682
Signed Rank    S   -141.5      Pr >= |S|    0.4927

Tests for Normality
Test                  --Statistic---    -----p Value------
Shapiro-Wilk          W       0.931439    Pr < W       0.0004
Kolmogorov-Smirnov    D       0.100788    Pr > D       0.0461
Cramer-von Mises      W-Sq    0.161384    Pr > W-Sq    0.0178
Anderson-Darling      A-Sq    1.072376    Pr > A-Sq    0.0081

Quantiles (Definition 5)
Quantile        Estimate
100% Max       1.8147346
99%            1.8147346
95%            0.8640185
90%            0.4790031
75% Q3         0.2098847
50% Median    -0.0398903
25% Q1        -0.3218754
10%           -0.5231454
5%            -0.7197238
1%            -0.8790939
0% Min        -0.8790939

Extreme Observations
------Lowest------      ------Highest-----
    Value    Obs            Value    Obs
-0.879094     78         0.664353     43
-0.796200     49         0.864019     75
-0.767491     39         1.136748     58
-0.719724     60         1.437300     16
-0.687779     17         1.814735     11

[Stem-and-leaf display, boxplot, and normal probability plot of Resid for the common-slopes model. Multiply Stem.Leaf by 10**-1.]

proc mixed data=forbes; classes sector;
    title2 'Basic Analysis of Covariance using PROC MIXED';
    title3 'SLR - a model for common slopes and intercepts';
    model LSales = LAssets / htype=1 3 solution outp=resids2;
run;
NOTE: The data set WORK.RESIDS2 has 79 observations and 17 variables.
NOTE: The PROCEDURE MIXED printed page 8.
NOTE: PROCEDURE MIXED used (Total process time):
      real time 0.09 seconds
      cpu time 0.06 seconds

Forbes companies assets and sales
Basic Analysis of Covariance using PROC MIXED
SLR - a model for common slopes and intercepts

The Mixed Procedure

Model Information
Data Set                     WORK.FORBES
Dependent Variable           LSales
Covariance Structure         Diagonal
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Residual

Class Level Information
Class     Levels    Values
sector         9    Communication Energy Finance HiTech Manufacturing Medical Other Retail Transportation

Dimensions
Covariance Parameters     1
Columns in X              2
Columns in Z              0
Subjects                  1
Max Obs Per Subject      79

Number of Observations
Number of Observations Read        79
Number of Observations Used        79
Number of Observations Not Used     0

Covariance Parameter Estimates
Cov Parm    Estimate
Residual      0.9702

Fit Statistics
-2 Res Log Likelihood        225.3
AIC (smaller is better)      227.3
AICC (smaller is better)     227.4
BIC (smaller is better)      229.6

Solution for Fixed Effects
                         Standard
Effect       Estimate       Error    DF    t Value    Pr > |t|
Intercept      3.0000      0.7394    77       4.06      0.0001
LAssets        0.5776     0.09191    77       6.28      <.0001

Type 1 Tests of Fixed Effects
            Num    Den
Effect       DF     DF    F Value    Pr > F
LAssets       1     77      39.49    <.0001

Type 3 Tests of Fixed Effects
            Num    Den
Effect       DF     DF    F Value    Pr > F
LAssets       1     77      39.49    <.0001

options ls=111 ps=56;
proc plot data=resids1; plot resid*Assets=sector / vref=0;
    title2 'Residual plot with separate slopes'; run;
NOTE: There were 79 observations read from the data set WORK.RESIDS1.
NOTE: The PROCEDURE PLOT printed page 9.
NOTE: PROCEDURE PLOT used (Total process time):
      real time 0.04 seconds
      cpu time 0.00 seconds

[Line-printer plot of Resid*Assets for the separate-slopes model: Resid (vertical, -1.0 to 2.0) against Assets (horizontal, 0 to 60000), reference line at 0. Symbol is value of sector. NOTE: 11 obs hidden.]

proc plot data=resids2; plot resid*Assets=sector / vref=0;
    title2 'Residual plot with common slopes'; run;

options ls=80 ps=256;
NOTE: There were 79 observations read from the data set WORK.RESIDS2.
NOTE: The PROCEDURE PLOT printed page 10.
NOTE: PROCEDURE PLOT used (Total process time):
      real time 0.04 seconds
      cpu time 0.00 seconds

Forbes companies assets and sales
Residual plot with common slopes

[Line-printer plot of Resid*Assets: Resid (vertical, -2.0 to 2.0) against Assets (horizontal, 0 to 60000), reference line at 0. Symbol is value of sector. NOTE: 14 obs hidden.]

Note that the SLR step above also wrote its residuals to RESIDS2 (outp=resids2), replacing the common-slopes residuals. The plot titled 'Residual plot with common slopes' therefore shows the SLR residuals, which is why they range from about -2 to 2 and the Finance (F) points all fall well below zero.

proc glm data=forbes; classes sector;
    title2 'Basic Analysis of Covariance using PROC GLM';
    model Sales = Assets sector Assets*sector / solution;
run;

NOTE: The PROCEDURE GLM printed pages 11-12.
NOTE: PROCEDURE GLM used (Total process time):
      real time 0.12 seconds
      cpu time 0.06 seconds

Forbes companies assets and sales
Basic Analysis of Covariance using PROC GLM

The GLM Procedure

Class Level Information
Class     Levels    Values
sector         9    Communication Energy Finance HiTech Manufacturing Medical Other Retail Transportation

Number of Observations Read    79
Number of Observations Used    79

Dependent Variable: Sales

                              Sum of
Source             DF        Squares    Mean Square    F Value    Pr > F
Model              17     3558598509      209329324      46.25    <.0001
Error              61      276115472        4526483
Corrected Total    78     3834713980

R-Square    Coeff Var    Root MSE    Sales Mean
0.927996     50.91922    2127.553      4178.291

Source           DF      Type I SS    Mean Square    F Value    Pr > F
Assets            1     2136740168     2136740168     472.05    <.0001
sector            8      798241282       99780160      22.04    <.0001
Assets*sector     8      623617059       77952132      17.22    <.0001

Source           DF    Type III SS    Mean Square    F Value    Pr > F
Assets            1    345225592.1    345225592.1      76.27    <.0001
sector            8     23240004.8      2905000.6       0.64    0.7396
Assets*sector     8    623617058.8     77952132.3      17.22    <.0001

                                                       Standard
Parameter                             Estimate            Error    t Value    Pr > |t|
Intercept                         941.631862 B      1429.347474       0.66      0.5125
Assets                              0.847450 B         0.457762       1.85      0.0690
sector        Communication      -635.413295 B      3281.447475      -0.19      0.8471
sector        Energy            -1407.857077 B      1627.073765      -0.87      0.3903
sector        Finance           -1937.037904 B      1585.526729      -1.22      0.2265
sector        HiTech             -121.048376 B      1700.741989      -0.07      0.9435
sector        Manufacturing       -34.461603 B      1783.626942      -0.02      0.9846
sector        Medical            -801.439349 B      1947.191925      -0.41      0.6821
sector        Other              -404.342365 B      1742.284603      -0.23      0.8173
sector        Retail              -64.612409 B      1951.219464      -0.03      0.9737
sector        Transportation        0.000000 B                .        .         .
Assets*sector Communication        -0.403859 B         0.501359      -0.81      0.4236
Assets*sector Energy               -0.101443 B         0.477166      -0.21      0.8324
Assets*sector Finance              -0.578685 B         0.459852      -1.26      0.2130
Assets*sector HiTech                0.108044 B         0.459839       0.23      0.8150
Assets*sector Manufacturing         0.085413 B         0.496492       0.17      0.8640
Assets*sector Medical              -0.205210 B         0.620651      -0.33      0.7421
Assets*sector Other                -0.203991 B         0.495649      -0.41      0.6821
Assets*sector Retail                1.353426 B         0.678962       1.99      0.0507
Assets*sector Transportation        0.000000 B                .        .         .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

*GOPTIONS DEVICE=win;
GOPTIONS DEVICE=CGMLT97L GSFMODE=REPLACE ftext='TimesRoman'
         ftitle='TimesRoman' htext=1 htitle=1 ctitle=black ctext=black;
GOPTIONS GSFNAME=OUT0;
FILENAME OUT0 'C:\SAS\forbes1.CGM';
PROC GPLOT DATA=forbes;
    title1 'Forbes companies assets and sales';
    title2 'Single line for all sectors';
    PLOT LSales*LAssets=1 LSales*LAssets=2 / HAXIS=AXIS1 VAXIS=AXIS2 OVERLAY;
    AXIS1 LABEL=(H=1 'Assets (Log)') VALUE=(H=1) ORDER=5 TO 11 BY 1;
    AXIS2 LABEL=(ANGLE=90 H=1 'Sales (Log)')
          VALUE=(H=1) MINOR=(N=2) ORDER=5 TO 11 BY 1;
    SYMBOL1 color=blue V=Dot I=None L=1 MODE=INCLUDE;
    SYMBOL2 color=red V=None I=RLcli95 L=1 MODE=INCLUDE;
RUN;
NOTE: Regression equation : LSales = 3.000028 + 0.577576*LAssets.
NOTE: 77 RECORDS WRITTEN TO C:\SAS\forbes1.CGM
NOTE: There were 79 observations read from the data set WORK.FORBES.
NOTE: PROCEDURE GPLOT used (Total process time):
      real time 0.29 seconds
      cpu time 0.12 seconds

The models fitted above are used for testing differences among the slopes and intercepts. They are the correct models for testing, but they yield differences between slopes and intercepts (relative to the last sector, Transportation) and the standard errors of those differences, not the actual slope and intercept values. The actual values can be obtained in SAS, but the model that produces them tests the intercepts jointly against zero and the slopes jointly against zero, and does not test for differences among sectors. That is not likely to be the test we want. See me if you need a model that actually fits the slopes and intercepts with pooled standard errors.

[Figure: Forbes companies assets and sales, single line for all sectors. Sales (Log) against Assets (Log), both axes 5 to 11.]

data forbes; set forbes;
    if sector = 'Communication' then C = Lsales;
    if sector = 'Energy' then E = Lsales;
    if sector = 'Finance' then F = Lsales;
    if sector = 'HiTech' then H = Lsales;
    if sector = 'Manufacturing' then M = Lsales;
    if sector = 'Medical' then D = Lsales;
    if sector = 'Other' then O = Lsales;
    if sector = 'Retail' then R = Lsales;
    if sector = 'Transportation' then T = Lsales;
run;
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

GOPTIONS GSFNAME=OUT1;
FILENAME OUT1 'C:\SAS\forbes2.CGM';
PROC GPLOT DATA=forbes;
    title1 'Forbes companies assets and sales';
    title2 'Separate line for each sector';
    PLOT C*LAssets=1 E*LAssets=2 F*LAssets=3 H*LAssets=4 M*LAssets=5 D*LAssets=6
         O*LAssets=7 R*LAssets=8 T*LAssets=9 / HAXIS=AXIS1 VAXIS=AXIS2 OVERLAY;
    AXIS1 LABEL=(H=1 'Assets (Log)') VALUE=(H=1) ORDER=5 TO 11 BY 1;
    AXIS2 LABEL=(ANGLE=90 H=1 'Sales (Log)')
          VALUE=(H=1) MINOR=(N=2) ORDER=5 TO 11 BY 1;
    SYMBOL1 color=red V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL2 color=orange V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL3 color=green V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL4 color=blue V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL5 color=brown V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL6 color=yellow V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL7 color=cyan V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL8 color=magenta V=None I=RL L=1 MODE=INCLUDE;
    SYMBOL9 color=black V=None I=RL L=1 MODE=INCLUDE;
RUN;

[Figure: Forbes companies assets and sales, separate line for each sector. Sales (Log) against Assets (Log), both axes 5 to 11.]
This note was uploaded on 12/29/2011 for the course EXST 7015 taught by Professor Wang,j during the Fall '08 term at LSU.