Lecture 8
1. OLS linear regression: Proc Reg
2. Residual plots, ODS plots
3. Model reduction, subset selection
4. Predictions
How can we be sure our SAS code is correct?
• Send problem observations to separate dataset for further investigation.
• Check code with test data.
• Print input data and output data and compare randomly chosen observations.
• Break complex data step into parts and check results after each part.
• Perform a check for the specific problem you suspect may occur.
On course website:
Seeing Red: Tips for Debugging the SAS Data Step
by M. Lee
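The first tip, sending problem observations to a separate dataset, might look like the sketch below. It borrows the homework dataset and variable names that appear later in this lecture; the output dataset names clean and problems are hypothetical, and "problem" is assumed here to mean "any glucose value missing".

```sas
/* Sketch: route suspicious rows to their own dataset for inspection. */
/* 'clean' and 'problems' are hypothetical dataset names.             */
data clean problems;
  set pubh.ogtt_hw2;
  /* nmiss() counts the missing values among its numeric arguments */
  if nmiss(min000, min030, min060, min090, min120) > 0
    then output problems;
    else output clean;
run;

/* Print a few of the problem rows to see what they look like */
proc print data=problems (obs=10);
run;
```

The same pattern works for any other suspected problem: replace the nmiss() test with the condition you want to investigate.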
from JAMA 2005; 293:1861–1867
OGTT problem from HW 2: one IF statement for each row of table.
data A;
set pubh.ogtt_hw2;
if (sex="F") then sex="f";
missing = (min000=. or min030=. or min060=. or min090=. or min120=.);
length OGTT_category $20. ;
OGTT_category = "problem";
* overwritten when calculated;
middle = max(min030, min060, min090);
if ( (0 LE min000 < 100) and (0 LE min120 < 140) and (middle < 200))
then OGTT_category="1_NGT";
if ((100 LE min000 LE 125) and (0 LE min120 < 140) and (middle < 200))
then OGTT_category="2_NGT+IFG";
if ((0 LE min000 < 100) and (0 LE min120 < 140) and (middle GE 200))
then OGTT_category="3_NGTIndt";
if ( (0 LE min000 < 100) and (140 LE min120 < 200))
then OGTT_category="4_IGT";
if ( (100 LE min000 LE 125)
and (140 LE min120 < 200))
then OGTT_category="5_IGT+IFG";
if ( (0 LE min000 < 100) and (min120 GE 200))
then OGTT_category="6_CFRD no FH";
if ( (100 LE min000 LE 125) and (min120 GE 200))
then OGTT_category="7_CFRD no FH+IFG";
if ( (min000 GE 125) and (min120 GE 200))
then OGTT_category="8_CFRD+FH";
Result: 991 patients in categories, 28 problem/missing observations
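One way to arrive at counts like these is a frequency table on the new variable. This is a sketch that assumes dataset A from the data step above has been created; it is not necessarily how the numbers on the slide were produced.

```sas
proc freq data=A;
  /* one row per category, including "problem" */
  tables OGTT_category / missing;
  /* cross-tabulate against the 0/1 missing-data indicator from the data step */
  tables OGTT_category*missing / norow nocol nopercent;
run;
```

The cross-tabulation shows whether the "problem" rows are the same rows that have a missing glucose value, or whether some complete rows also failed every IF condition.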
Student questions:
1. NGT and NGT+IFG both require all three of the 30-, 60-, and 90-minute values to be <200. We weren’t
sure how to do this beyond a series of AND statements so that it would only be
included if all three were under 200 (and not missing).
In your code you used the MAX function, which we read as finding the largest
number in the set and making a constant for that observation. That constant
was then compared with 200. However, if there were missing data, wouldn’t
that be overlooked by the max function?
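The MAX question can be checked with a throwaway data step. The sketch below relies on standard SAS behavior: MAX ignores missing arguments and returns missing only when every argument is missing, and a missing value compares as smaller than any number. The variable names a, b, flag, and n_missing are made up for illustration.

```sas
data _null_;
  a = max(150, ., 210);            /* the missing argument is ignored: a=210  */
  b = max(., ., .);                /* every argument missing: b is missing    */
  flag = (b < 200);                /* missing sorts below any number: flag=1  */
  n_missing = nmiss(150, ., 210);  /* counts missing arguments: n_missing=1   */
  put a= b= flag= n_missing=;      /* writes the four values to the log       */
run;
```

So a partly missing set of middle values is judged on the values that are present, and a fully missing set still passes a test such as middle < 200. If such rows should stay in "problem", one possible guard is to add nmiss(min030, min060, min090) = 0 to the condition.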
2. You first labeled all OGTT_category as "problem" and then overwrote that with
new categories as you defined them.
We looked over each individual piece, but we still weren’t getting the same
numbers. We were suspicious that perhaps some observations could somehow
fall into more than one category and so be overwritten twice. To test this, we
ran the code after the OGTT_category = "problem" in the reverse order. We
ended up with 6 more categorized observations.
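A more direct test of the "more than one category" suspicion is to count how many of the eight conditions each observation satisfies. The sketch below copies the conditions from the data step shown earlier; the dataset name overlap_check and the variables c1-c8 and n_matches are hypothetical.

```sas
data overlap_check;   /* hypothetical dataset name */
  set pubh.ogtt_hw2;
  middle = max(min030, min060, min090);
  /* each logical expression evaluates to 1 (true) or 0 (false) */
  c1 = (0 LE min000 < 100)    and (0 LE min120 < 140) and (middle < 200);
  c2 = (100 LE min000 LE 125) and (0 LE min120 < 140) and (middle < 200);
  c3 = (0 LE min000 < 100)    and (0 LE min120 < 140) and (middle GE 200);
  c4 = (0 LE min000 < 100)    and (140 LE min120 < 200);
  c5 = (100 LE min000 LE 125) and (140 LE min120 < 200);
  c6 = (0 LE min000 < 100)    and (min120 GE 200);
  c7 = (100 LE min000 LE 125) and (min120 GE 200);
  c8 = (min000 GE 125)        and (min120 GE 200);
  n_matches = sum(of c1-c8);  /* number of categories this row qualifies for */
run;

proc freq data=overlap_check;
  tables n_matches;
run;
```

Any row with n_matches above 1 is one whose final category depends on the order of the IF statements. As written, conditions 7 and 8 both accept min000 = 125, so that is one place to look. Rows with n_matches equal to 0 are the ones left as "problem".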
data C;
set pubh.ogtt_hw2;
if (sex="F") then sex="f";
missing = (min000=. or min030=. or min060=. or min090=. or min120=.);
length OGTT_category $20. ;
OGTT_category = "problem";
middle = max(min030, min060, min090);
if ( (min000 GE 125) and (min120 GE 200)) then OGTT_category="8_CFRD+FH";
if ( (100 LE min000 LE 125) and (min120 GE 200)) then OGTT_category="7_CFRD no FH+IFG";
if ( (0 LE min000 < 100) and (min120 GE 200)) then OGTT_category="6_CFRD no FH";
if ( (100 LE min000 LE 125)
and (140 LE min120 < 200)) then OGTT_category="5_IGT+IFG";
if ( (0 LE min000 < 100) and (140 LE min120 < 200)) then OGTT_category="4_IGT";
if ( (0 LE min000 < 100) and (0 LE min120 < 140) and (middle GE 200))
This note was uploaded on 11/21/2011 for the course PUBH 6470 taught by Professor William Thomas during the Fall '11 term at University of Florida.