# Handout08 - Lecture 8 1 OLS linear regression Proc Reg 2...

## Lecture 8

1. OLS linear regression: Proc Reg
2. Residual plots, ODS plots
3. Model reduction, subset selection
4. Predictions

## How can we be sure our SAS code is correct?

• Send problem observations to a separate dataset for further investigation.
• Check code with test data.
• Print input data and output data and compare randomly chosen observations.
• Break a complex data step into parts and check results after each part.
• Perform a check for the specific problem you suspect may occur.

On course website: "Seeing Red: Tips for Debugging the SAS Data Step" by M. Lee
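The first tip, sending problem observations to a separate dataset, can be sketched as below. This is only a minimal illustration: the dataset names `clean` and `problems` are hypothetical, and it assumes the `pubh` libref and the `min000` to `min120` variables used in the OGTT example later in this handout.

```sas
* Hypothetical sketch: route rows with any missing glucose value to their own dataset;
data clean problems;
  set pubh.ogtt_hw2;
  if nmiss(min000, min030, min060, min090, min120) > 0 then output problems;
  else output clean;
run;

* Print a few of the problem rows for inspection;
proc print data=problems (obs=10);
run;
```

Naming two datasets on the DATA statement and using explicit OUTPUT statements keeps the suspect rows available for review without letting them flow into later steps.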

from JAMA 2005; 293:1861–1867

## OGTT problem from HW 2: one IF statement for each row of table

    data A;
      set pubh.ogtt_hw2;
      if (sex="F") then sex="f";
      missing = (min000=. or min030=. or min060=. or min090=. or min120=.);
      length OGTT_category $20.;
      OGTT_category = "problem"; * overwritten when calculated;
      middle = max(min030, min060, min090);
      if ((0 LE min000 < 100) and (0 LE min120 < 140) and (middle < 200)) then OGTT_category="1_NGT";
      if ((100 LE min000 LE 125) and (0 LE min120 < 140) and (middle < 200)) then OGTT_category="2_NGT+IFG";
      if ((0 LE min000 < 100) and (0 LE min120 < 140) and (middle GE 200)) then OGTT_category="3_NGT-Indt";
      if ((0 LE min000 < 100) and (140 LE min120 < 200)) then OGTT_category="4_IGT";
      if ((100 LE min000 LE 125) and (140 LE min120 < 200)) then OGTT_category="5_IGT+IFG";
      if ((0 LE min000 < 100) and (min120 GE 200)) then OGTT_category="6_CFRD no FH";
      if ((100 LE min000 LE 125) and (min120 GE 200)) then OGTT_category="7_CFRD no FH+IFG";
      if ((min000 GE 125) and (min120 GE 200)) then OGTT_category="8_CFRD+FH";
    run;

Result: 991 patients in categories, 28 problem/missing observations

## Student questions

1. NGT and NGT+IFG both require all three of the 30-, 60-, and 90-minute values to be <200. We weren't sure how to do this beyond a series of AND statements, so that an observation would only be included if all three were under 200 (and not missing). In your code you used the MAX function, which we read as finding the largest number in the set and storing it as a single value for that observation. That value was then compared with 200. However, if there were missing data, wouldn't that be overlooked by the MAX function?
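One way to explore this question is a small test step, in the spirit of "check code with test data". The sketch below uses made-up values and the hypothetical helper variables `n_missing`, `all_missing`, and `below_200`; it is not part of the original homework code.

```sas
* Hypothetical test values: the 60-minute reading is missing;
data _null_;
  min030 = 150; min060 = .; min090 = 180;
  middle = max(min030, min060, min090);       * MAX skips the missing argument;
  n_missing = nmiss(min030, min060, min090);  * counts missing arguments;
  put middle= n_missing=;

  * When every argument is missing, MAX returns missing;
  all_missing = max(., ., .);
  below_200 = (all_missing < 200);            * missing sorts below any number;
  put all_missing= below_200=;
run;
```

The log should show `middle=180 n_missing=1`, so a missing reading is indeed ignored by MAX, and `below_200=1`, because SAS treats a missing numeric value as smaller than any number. A condition such as `middle < 200` therefore does not by itself guard against missing data; a separate check (for example on `nmiss`, or on the `missing` flag already computed in the data step) would be needed.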

2. You first labeled all OGTT_category as "problem" and then overwrote that with new categories as you defined them. We looked over each individual piece, but we still weren't getting the same numbers. We were suspicious that perhaps some observations could somehow fall into more than one category and so be overwritten twice. To test this, we ran the code after the OGTT_category = "problem" statement in the reverse order. We ended up with 6 more categorized observations.

    data C;
      set pubh.ogtt_hw2;
      if (sex="F") then sex="f";
      missing = (min000=. or min030=. or min060=. or min090=. or min120=.);
      length OGTT_category $20.;
      OGTT_category = "problem";
      middle = max(min030, min060, min090);
      if ((min000 GE 125) and (min120 GE 200)) then OGTT_category="8_CFRD+FH";
      if ((100 LE min000 LE 125) and (min120 GE 200)) then OGTT_category="7_CFRD no FH+IFG";
      if ((0 LE min000 < 100) and (min120 GE 200)) then OGTT_category="6_CFRD no FH";
      if ((100 LE min000 LE 125) and (140 LE min120 < 200)) then OGTT_category="5_IGT+IFG";
      if ((0 LE min000 < 100) and (140 LE min120 < 200)) then OGTT_category="4_IGT";
      if ( (0 LE min000 < 100) and (0 LE min120 < 140) and (middle GE 200))
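The suspicion in the second student question, that an observation can satisfy more than one IF condition, can be checked directly with a few test rows. The sketch below is only an illustration: the dataset `overlap_check`, the flags `c7` and `c8`, and the input values are hypothetical. It copies the conditions for categories 7 and 8 from the code above and counts how many of them each row matches.

```sas
* Hypothetical test rows around the boundary value min000 = 125;
data overlap_check;
  input min000 min120;
  c7 = (100 le min000 le 125) and (min120 ge 200);  * condition for 7_CFRD no FH+IFG;
  c8 = (min000 ge 125) and (min120 ge 200);         * condition for 8_CFRD+FH;
  n_matched = c7 + c8;                              * 2 means the conditions overlap;
  datalines;
124 210
125 210
126 210
;
run;

proc print data=overlap_check;
run;
```

With these inputs the row with `min000 = 125` has `n_matched = 2`, since 125 satisfies both `100 LE min000 LE 125` and `min000 GE 125`. For such a row, whichever IF statement runs last determines the category, so the statement order can change the assigned label.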