SAS Intro
1. What is SAS?
2. Course info
3. Installing and running SAS on your computer
4. Three windows: editor, log, output
5. Importing an Excel spreadsheet
6. SAS program structure: DATA and PROC steps
7. Using the editor
8. Finding errors in the program
9. Help files
10. Clearing the log and output: F keys
11. Why write code?

What is SAS? (pronounced sass)
SAS Institute was founded in 1976, based on code written 1966–1976; the name originally stood for
Statistical Analysis System. Users were mainly academics doing statistical analysis.
2010: SAS Institute, Cary NC, had annual revenue of $2.4 billion. It competes in
business intelligence software:
Credit card companies, for example, use SAS to detect unusual buying patterns in
real time, and to spot potentially fraudulent charges. Giant retail chains use SAS to
tailor pricing and product offerings down to the store level. Telecommunications
companies use SAS to identify the few thousand customers, among millions, most
likely to switch to another cellphone carrier, and to aim marketing at them. SAS
software is also used to parse sensor signals from North Sea oil rigs, combined
with weather and structural data, to predict failure of parts before it happens. Of
the 100 largest companies worldwide, 92 use SAS software. (NY Times, 2009)

Competitors in business intelligence software are IBM, Oracle, SAP, Microsoft.
The competitive thrust that really grabbed SAS’s attention came in late July 2009,
when I.B.M. announced that it planned to pay $1.2 billion for SPSS, a maker of
predictive modeling software. I.B.M. has placed SPSS and Cognos into a new
business analytics and optimization group. That business will be supported by 200
scientists, and the company has said it will retrain or hire 4,000 consultants and
analysts to work in the group.
“This is the big growth strategy for I.B.M., the company’s next big play for this
decade,” says Ambuj Goyal, a computer scientist who is general manager of
I.B.M’s business analytics software unit. “SAS comes from the legacy world of
statisticians and programmers. The real opportunity is in deploying this technology
broadly in corporations.” (NY Times, 2009)

SAS software continues to be the standard used in statistical analysis of clinical
pharmaceutical trials for submission to the Food and Drug Administration. It is also widely
used for statistical analysis in the insurance industry and the field of public health, at least
partially due to the handling of different types and formats of data. . . .
SAS also provides data mining, data warehousing, business intelligence, sustainability
and business performance management software. Because of this wide spectrum, many
users are expert in one area of the SAS package, but have little or no experience in
another. (Wikipedia, 2011)

Course Syllabus
PubH 6470 Section 1
SAS Procedures & Data Analysis
Fall 2011

Course Information
Credits: 4
Meeting Days and Place: TuTh 11:15–12:30, Moos 2520
Course Website: tinyurl.com/pubh6470 or http://www.biostat.umn.edu/~will/ph6470.html
Instructor: William Thomas
Office Address: A467 Mayo
Office Phone: 612-625-0651
Fax: 612-626-0660
Email: [email protected]
Office Hours: Wednesdays 2:30–3:30

I. Course Description
PubH 6470 introduces students with a background in statistics to programming, graphics, and data analysis
using SAS. The course concentrates on programming using PC-SAS, data editing and reformatting, as well
as statistical applications of SAS procedures: general linear models, logistic regression, longitudinal
mixed-effects models, and survival data.

II. Course Prerequisites
A one year course in applied statistics at the level of PubH 6451, PubH 7406, Stat 5021 or Stat 5302, or
permission of the instructor.

III. Course Goals and Objectives
To provide students experience and practice using SAS to (i) identify and fix errors and problems in data; (ii)
write and debug programs; (iii) make and revise graphics; (iv) manage a statistical analysis project and
report their results; and learn to use SAS and internet resources to solve problems and learn new SAS
procedures.

IV. Methods of Instruction and Work Expectations
Lectures, homework assignments, take-home exams. Assignments will be due at the start of the class, and
we will immediately discuss the problems and answers so please make a photocopy for reference during
class. Late homework will not be accepted. Students may discuss homeworks and final projects with other
classmates but you must write your own SAS programs, and your homework and exams must be completed
independently.

V. Course Text and Readings
Students are encouraged to purchase PC-SAS version 9.2 through the University of Minnesota for $75/year.
On Reserve at the Biomed Library:
• The Little SAS Book, 4th ed., by Delwiche and Slaughter, 2008, SAS Institute
• SAS Programming for Researchers and Social Scientists by Phil Spector, Sage
• Data Analysis Using Regression and Multilevel/Hierarchical Models by Gelman and Hill, 2007, Cambridge
• Analysis of Clinical Trials Using SAS by Dmitrienko, Molenberghs, Chuang-Stein, Offen, 2005, SAS Institute
• SAS for Mixed Models, 2nd ed., by Littell, Milliken, Stroup, Wolfinger, Schabenberger, 2006, SAS Institute
• Logistic Regression Using the SAS System: Theory and Application by Paul Allison, 2001, Wiley-SAS
• Categorical Data Analysis Using the SAS System, 2nd ed., by Stokes, Davis, and Koch, 2009, SAS Institute

Syllabus for PubH 6470: SAS Procedures and Data Analysis
Fall 2011, T–Th 11:15–12:30, Mayo 2520, W Thomas

Date             Topics
September   6     1 Intro to PC-SAS: importing a spreadsheet, working with 3 windows
            8     2 Basic tests, arithmetic and comparisons, missing values
         • 13     3 Character variables, SET and MERGE, DO-loops, standardizing a variable
           15     4 Checking data, Proc Insight graphics, data set options, more on MERGE
           20     5 Working with dates, simple macros, arrays, computing change from baseline
           22     6 Graphics: SGplot, Gplot
         • 27     7 Linear models: math scores, Proc Corr plots, outliers, log transformation
           29       No class
October     4     8 Proc Reg: subset selection, ODS plots, fitted values
            6     9 Proc GLM: indicator variables, class variables
         • 11    10 Multifactor ANOVA: rat diets, LSmeans, interaction plot
           13    11 Proc GLM: What are LSmeans?
           18    12 Linear models: backtransformation, ODS select and ODS output; program structure
           20    13 Confounding, mediation; reading a spreadsheet with problems; converting variable type
         • 25    14 Bootstrap, Proc Surveyselect
           27    15 Bootstrap tests; missing data and imputation
                    Midterm exam handed out; due 1 Nov in class
November    1    16 Logistic regression: Proc Logistic, odds ratios, interactions, diagnostics
            3    17 Percent correctly classified, subset selection, conditional logistic regression,
            8    18 Log-binomial regression, ordinal regression (more than 2 categories of response)
           10    19 Propensity scores and matching
         • 15    20 Proc SQL, fuzzy merge; Longitudinal data: plots, long vs wide format,
           17    21 Longitudinal data: Proc Transpose, area under the curve (AUC)
           22    22 Longitudinal data: Proc Mixed, correlation matrix, random effects, mixed-effects models
           24       Thanksgiving: no class
           29    23 Longitudinal data: mixed model example (rat diets again)
December    1    24 Crossover designs
          • 6    25 Survival data
            8    26 Proportional hazards regression
           13       Proportional hazards regression; final exam handed out
           20       Final Exam due at 4:00 in Mayo A460 (Biostat Office)
• homework assignment due (tentative)

VII. Evaluation and Grading
The final grade will be based on homework assignments (60%) and take-home midterm and final exams
(20% each). The curve for final grades will be: A = 95–100; A- = 90–94; B+ = 85–89; B = 80–84; B- = 75–79;
C+ = 70–74; C = 65–69; C- = 60–64; F = below 60. For those registered S/N, S = 60–100. Depending
on how the final course averages turn out, I may lower some grade lines, but I will not raise them.
Incomplete Contracts
A grade of incomplete “I” shall be assigned at the discretion of the instructor when, due to extraordinary
circumstances (e.g., documented illness or hospitalization, death in family, etc.), the student was prevented
from completing the work of the course on time. The assignment of an “I” requires that a contract be initiated
and completed by the student before the last official day of class, and signed by both the student and
instructor. If an incomplete is deemed appropriate by the instructor, the student, in consultation with the
instructor, will specify the time and manner in which the student will complete course requirements.
Extension for completion of the work will not exceed one year (or earlier if designated by the student’s
college). For more information and to initiate an incomplete contract, students should go to SPHGrades at:
www.sph.umn.edu/grades. The written Incomplete Contract must be registered with the Student
Services Center prior to the date by which grades must be entered at the end of the term.
Course Evaluation
Beginning in fall 2008, the SPH will collect student course evaluations electronically using a software system
called CoursEval: www.sph.umn.edu/courseval. The system will send email notifications to students when
they can access and complete their course evaluations. Students who complete their course evaluations
promptly will be able to access their final grades just as soon as the faculty member renders the grade in
SPHGrades: www.sph.umn.edu/grades. All students will have access to their final grades through OneStop
two weeks after the last day of the semester regardless of whether they completed their course evaluation or
not. Student feedback on course content and faculty teaching skills is an important means for improving our
work. Please take the time to complete a course evaluation for each of the courses for which you are
registered.
University of Minnesota Uniform Grading and Transcript Policy
A link to the policy can be found at onestop.umn.edu.

VIII. Other Course Information and Policies

Grade Option Change (if applicable)
For full-semester courses, students may change their grade option, if applicable, through the second week of
the semester. Grade option change deadlines for other terms (i.e. summer and half-semester courses) can
be found at onestop.umn.edu.

Course Withdrawal
Students should refer to the Refund and Drop/Add Deadlines for the particular term at onestop.umn.edu for
information and deadlines for withdrawing from a course. As a courtesy, students should notify their
instructor and, if applicable, advisor of their intent to withdraw.
Students wishing to withdraw from a course after the noted final deadline for a particular term must contact
the School of Public Health Student Services Center at [email protected] for further information.

Student Conduct, Scholastic Dishonesty and Sexual Harassment Policies
Students are responsible for knowing the University of Minnesota, Board of Regents' policy on Student
Conduct and Sexual Harassment found at www.umn.edu/regents/polindex.html.
Students are responsible for maintaining scholastic honesty in their work at all times. Students engaged in
scholastic dishonesty will be penalized, and offenses will be reported to the SPH Associate Dean for
Academic Affairs who may file a report with the University’s Academic Integrity Officer.
The University’s Student Conduct Code defines scholastic dishonesty as “plagiarizing; cheating on
assignments or examinations; engaging in unauthorized collaboration on academic work; taking, acquiring,
or using test materials without faculty permission; submitting false or incomplete records of academic
achievement; acting alone or in cooperation with another to falsify records or to obtain dishonestly grades,
honors, awards, or professional endorsement; or altering, forging, or misusing a University academic record;
or fabricating or falsifying of data, research procedures, or data analysis.”
Plagiarism is an important element of this policy. It is defined as the presentation of another's writing or ideas
as your own. Serious, intentional plagiarism will result in a grade of "F" or "N" for the entire course. For more
information on this policy and for a helpful discussion of preventing plagiarism, please consult University
policies and procedures regarding academic integrity: http://writing.umn.edu/tww/plagiarism/.
Students are urged to be careful that they properly attribute and cite others' work in their own writing. For
guidelines for correctly citing sources, go to http://tutorial.lib.umn.edu/ and click on “Citing Sources”.
In addition, original work is expected in this course. Unless the instructor has specified otherwise, all
assignments, papers, reports, etc. should be the work of the individual student. It is unacceptable to hand in
assignments for this course for which you receive credit in another course unless by prior agreement with the
instructor. Building on a line of work begun in another course or leading to a thesis, dissertation, or final
project is acceptable.

Disability Statement
It is University policy to provide, on a flexible and individualized basis, reasonable accommodations to
students who have a documented disability (e.g., physical, learning, psychiatric, vision, hearing, or systemic)
that may affect their ability to participate in course activities or to meet course requirements. Students with
disabilities are encouraged to contact Disability Services to have a confidential discussion of their individual
needs for accommodations. Disability Services is located in Suite 180 McNamara Alumni Center, 200 Oak
Street. Staff can be reached by calling 612/626-1333 (voice or TTY).

Installing SAS on your computer
PC-SAS Version 9.2 (2008) is the version available through UM ($75/year).
Version 9.3 is the latest release (2011).
PC-SAS runs on these operating systems:
• Windows XP Professional, updated with Service Pack 2
• Windows Vista: Enterprise, Business, and Ultimate editions
• Red Hat Enterprise Linux, versions 4 and 5 ($49)
• SuSE Linux Enterprise Server 9 and 10
On Macs, PC-SAS runs in an emulator (VMware, Parallels, Boot Camp).

PC-SAS Demo
Go to the course website: tinyurl.com/ph6470
Download to your desktop
• Excel spreadsheet Child IQ.xls
• SAS program Lecture01.sas

The 3 main windows (LSB [Little SAS Book] §1.6)
Start the SAS program. Most of your work in SAS will be through the three main
windows:
• Enhanced Editor (not the Program Editor)
• Output
• Log
Click to let SAS fill the screen, and resize the editor so it fills the right panel.
Now click on the button at the bottom that says Log - (Untitled).
    NOTE: Copyright (c) 2002-2008 by SAS Institute Inc., Cary, NC, USA.
    NOTE: SAS (r) Proprietary Software 9.2 (TS1M0) Licensed to UNIVERSITY OF MINNESOTA - T&R, Site 0070006360.
    NOTE: This session is executing on the XP_PRO platform.
If you don’t have Version 9.2, then you need to update.
www.oit.umn.edu/utools/mathematicsstatistics/index.htm

Importing an Excel (.xls) spreadsheet (LSB §2.3)
First, select the editor window. Choose File > Import Data
which starts the SAS Import Wizard.
While it will open several types of files, the default is an Excel (.xls) spreadsheet,
so click Next.
(PC-SAS will not open the new .xlsx worksheets.)
Browse to find the spreadsheet on the Desktop, then click Open.
The wizard will ask which sheet to import, and there are several important choices
under Options (defaults usually OK).
Finally, you must select a name for the data set (use the default Work library).
Type a name for your dataset: Child_IQ
Click Finish.

Using Explorer to look at the data
Clicking on Explorer in the lower left, then Libraries, opens a list of libraries.
Double click on Work (the work library).
Then double click on Child_IQ to examine the data.
Data from Gelman and Hill, who cite the National Longitudinal Survey of Youth
(http://www.bls.gov/nls/home.htm).
child_iq = child’s IQ score
mom_hs_grad = 1 if mother graduated from high school, = 0 if not
mom_iq = mother’s IQ score
mom_age = mother’s age
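
The wizard and Explorer steps above can also be written as code, which leaves a record of how the data were read. The following is a minimal sketch, not the method used in the lecture: the file path and the sheet name "Sheet1" are hypothetical, and DBMS=XLS assumes that the SAS/ACCESS Interface to PC Files is licensed on your installation.

```sas
/* Hypothetical path and sheet name: adjust both to match your own files. */
proc import datafile="C:\Users\yourname\Desktop\Child IQ.xls"
            out=work.child_iq
            dbms=xls
            replace;
  sheet="Sheet1";
  getnames=yes;    /* take variable names from the first row */
run;

/* List the variables and their types, then print the first 5 rows. */
proc contents data=work.child_iq;
run;

proc print data=work.child_iq (obs=5);
run;
```

PROC CONTENTS and PROC PRINT play the role of double-clicking the data set in Explorer: they show what was actually imported before any analysis is run.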

Click to open the Enhanced Editor window in SAS.
In the SAS File menu, select File > Open program . . . and navigate to the Desktop to open Lecture01.sas
To set preferences for the editor: Tools > Options > Enhanced Editor
Under General make sure that the checkbox for Clear text on submit is blank.
I like line numbers, and I disable “Collapsible code sections” by leaving its
checkbox blank.

SAS programs have sections: DATA steps and PROC steps.
Every statement ends with a semicolon (;)
SAS does not distinguish between upper and lower case in code,
but does distinguish in quoted strings, “Hello”

DATA steps create or modify a data set, or combine data sets.
    data A;
      set child_iq;
      if (male=1) then gender="M";
      if (male=0) then gender="F";
This data step takes our spreadsheet data, makes a copy called A,
and calculates a new variable (gender).
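
One way to confirm that a new variable was coded as intended is to cross-tabulate it against the variable it was built from. This is a minimal sketch, assuming child_iq contains a 0/1 variable named male, as the code above implies. The LENGTH statement, the ELSE, and the PROC FREQ check are additions for illustration and are not part of the lecture program.

```sas
data A;
  set child_iq;
  length gender $ 1;                  /* declare a 1-character variable */
  if male = 1 then gender = "M";
  else if male = 0 then gender = "F"; /* stays blank if male is missing */
run;

/* Each value of male should line up with exactly one value of gender. */
proc freq data=A;
  tables male*gender / missing;
run;
```

The MISSING option makes any observations with a missing male or blank gender show up in the table rather than being silently dropped.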

PROC steps perform a specific procedure on a data set.
    Proc Reg data = A;
      MODEL child_iq = mom_iq mom_age mom_HS_grad;
Proc Reg does ordinary least squares linear regression, using the model in the
MODEL statement. What does the next Proc Reg do?
The program ends
    run; quit;

Colors in the code: (from Harvard S030 SAS Manual, p 44)
Statements of the form * text ; are comments.
Sections of code can be “commented out” by /* (program statements) */
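
The two comment styles can be sketched as follows. The example assumes the data set A from the DATA step above; the particular procedures are only placeholders.

```sas
* This is a statement-style comment: it ends at the semicolon;

/* This is a block comment.
   The PROC PRINT below is "commented out" and will not run:
proc print data=A;
run;
*/

proc means data=A;   /* a block comment can also sit at the end of a line */
  var child_iq;
run;
```

The block style is the safer choice for disabling several statements at once, because a statement-style comment stops at the first semicolon it meets.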
Errors are not always in red.

Running the SAS program
The 4 buttons on the right, from left to right:
Running person: submit the program visible in the editor window
X: clear the active window
Circled ! : emergency stop, lets you cancel and quit a stuck program
Book: SAS documentation
To run this program, choose Run > Submit, or click the running man.
Click the Output tab at the bottom to see the results.

The REG Procedure
Model: MODEL1
Dependent Variable: child_IQ child IQ

Number of Observations Read    434
Number of Observations Used    434

Analysis of Variance
                             Sum of         Mean
Source             DF       Squares       Square    F Value    Pr > F
Model               3         38881        12960      39.38    <.0001
Error             430        141506    329.08285
Corrected Total   433        180386

Root MSE          18.14064    R-Square    0.2155
Dependent Mean    86.79724    Adj R-Sq    0.2101
Coeff Var         20.90002

Parameter Estimates (Regression Coefficients)
                                    Parameter     Standard
Variable       Label          DF     Estimate        Error    t Value    Pr > |t|
Intercept      Intercept       1     20.96284      9.12284       2.30      0.0221
mom_IQ         mom IQ          1      0.56240      0.06050       9.30      <.0001
mom_age        mom age         1      0.22629      0.33062       0.68      0.4941
mom_HS_grad    mom HS grad     1      5.64062      2.25676       2.50      0.0128

How are these regression coefficients interpreted?
mom IQ
mom age
mom HS grad

Where’s the second regression? Check the log
Click the Log tab at the bottom to open the log file.
    ERROR: Data set WORK.A is not sorted in ascending sequence. The current BY
           group has gender = M and the next BY group has gender = F.
    NOTE: The SAS System stopped processing this step because of errors.
Before doing anything BY the values of a variable, you must SORT the data on that
variable.
Add this code before the second Proc Reg
    Proc Sort data=a;
      by gender;
and run the program again.
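
Put together, the sort followed by a BY-group regression might look like the sketch below. The second Proc Reg is not reproduced in these notes, so the MODEL statement here is an assumption: it simply reuses the model from the first Proc Reg.

```sas
/* Sort first: BY-group processing expects the data in BY-variable order. */
proc sort data=A;
  by gender;
run;

/* Assumed form of the second regression: one fit per value of gender. */
proc reg data=A;
  by gender;
  model child_iq = mom_iq mom_age mom_hs_grad;
run;
quit;
```

After the sort, the F observations come before the M observations, and the output contains one full set of regression tables for each group.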

SAS Help and Documentation Files
Each procedure has a chapter in the Help File. To look up Proc Reg:
SAS Products > SAS Procedures
Alternate path:
SAS Products > SAS/STAT > SAS/STAT User’s Guide > The REG Procedure

Every Procedure Help Chapter includes these sections:
Getting Started: a simple example with data, code, output, and explanation
Syntax: rules for writing programs: list of statements (= commands), with options
for each statement
Details: mathematical formulas underlying computations. Other SAS secrets.
Examples: analysis examples with data, code and output. Code can be copied and
pasted into the Editor window.
Click Simple Linear Regression under Getting Started
How can we get those graphs?

Copy this code from the example:
    ods graphics on;
    proc reg;
      model Weight = Height;
    run;
    ods graphics off;
ODS = Output Delivery System
Paste this code into the Editor window, just below the first Proc Reg section.
Then cut and paste our Proc Reg in place of the example Proc Reg.
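
After the substitution, the pasted section would look roughly like this sketch, which assumes the data set A and the variable names used earlier in the program:

```sas
ods graphics on;

proc reg data=A;
  model child_iq = mom_iq mom_age mom_hs_grad;
run;
quit;

ods graphics off;
```

The ODS GRAPHICS statements bracket the procedure, so only the steps between them produce the diagnostic plots.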
Now submit the program again.

Notice that SAS appends each output and the page numbering is continuous.
Default output is 132 characters wide, too wide to fit on a standard sheet of paper.
Delete the * from the first line of the program to “uncomment” the options
statement:
    options ls=80 pageno=1 nodate;
This restricts SAS output to 80 characters wide, restarts the page numbers with 1
each time we run the program, and removes the date stamp.
Run the program again to see the effects.

Clearing the log and output
The third output is appended to the first and second, and the same thing happens
in the Log file. Each time you run a program the output and log are appended to
the current files: confusing, and easy to read the wrong output.
So we need to clear out these files:
1. Submit program.
2. Review log file and output to correct errors.
3. Open output window. Choose Edit > Clear All or X
4. Open log window. Choose Edit > Clear All or X
5. Open editor window. Resubmit program. Go to step 2.
Lots of mouse clicks. We can set up the function keys to do these steps.

Setting up the F keys
Tools > Options > Keys brings up the actions associated with the F keys.
F3 = “Submit the whole program or, if part is selected, submit the selected part”
F5 opens the Enhanced Editor (program) window.
F6 opens the log, and F7 opens the output window.
Edit the line at F4 to read:
    log; clear; output; clear; wpgm
Now pressing F4 will clear both log and output windows and return you to the
editor. Then F3 submits the program.
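
As an aside, the same window commands can be issued from inside a program with a DM statement, so that every submit starts with a clean log and output. This is an alternative sketch rather than part of the lecture setup, and it applies only to interactive windowing sessions like the one described here.

```sas
/* Place at the top of the program: clears the Log and Output windows
   each time the program is submitted. */
dm 'log; clear; output; clear;';
```

The function-key approach keeps clearing under your control, while the DM statement clears automatically; which is preferable depends on whether you ever want to compare output across runs.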
Close the Keys window.

SAS input and output
1. Input: Read data and convert it to a SAS-readable data set.
2. Input: Write a program ( analyze.sas ) that calls the data and performs
editing or statistical operations. Run your program.
3. Output:
(a) a log file ( analyze.log ) of information from the run and error messages
(b) if there is output, a listing file containing the output ( analyze.lst )
(c) for graphical output, sometimes a window containing the graphic(s) will
open. Save in various formats, or write directly to a file.
(d) data created by program is written as a SAS-readable data set.

Where is the data?
Both child_iq (imported) and A (created by the program) are in the Work library.
Explorer > Libraries > Work (in lower left of SAS window)
Data in Work is temporary: it’s stored in a temporary folder for the current session, not saved
permanently to your disk or flash drive.
When you quit SAS, this data is deleted and disappears from the
Work library.
It’s inefficient to import the same spreadsheet every time you want to analyze it.
We want to import once and then save the data permanently.

Permanent SAS datasets (LSB §2.19)
For SAS to save a dataset permanently, SAS needs directions to a place to save it.
3 steps:
1. Make a folder to hold the data: a folder SAS Class on the desktop
or on your flash drive, if you will be using U of M computers.
2. Give SAS the path to this folder:
In the SAS editor, click the file drawer with the blue star.
Name the library ("ph6470") and click enable at startup.
Browse to find the folder SAS Class and click to open the folder. Click OK to create
the library.
ph6470, or the name you chose, will appear in the list of libraries.
3. Change the name of the dataset in the SAS program:
    Data ph6470.A;
    ...
format is LIBRARYname.DATAname
Import the spreadsheet again, to get child_iq into Work.
Run the program (Lecture01.sas) again, and A now appears in the library ph6470.
We have created the permanent SAS dataset a.sas7bdat in the folder SAS Class.
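
Steps 2 and 3 can also be written in code with a LIBNAME statement, which documents where the data live. This is a minimal sketch: the folder path is hypothetical, and the variables in the PROC MEANS step are chosen only for illustration.

```sas
/* Hypothetical path: replace with the location of your own "SAS Class" folder. */
libname ph6470 "C:\Users\yourname\Desktop\SAS Class";

/* Two-level name: the data set is written to that folder as a.sas7bdat. */
data ph6470.A;
  set child_iq;
  if (male=1) then gender="M";
  if (male=0) then gender="F";
run;

/* In a later session, once the LIBNAME statement has been run again,
   any procedure can read the saved data directly, with no import step. */
proc means data=ph6470.A;
  var child_iq mom_iq mom_age;
run;
```

A library defined in code lasts only for the current session unless the statement is rerun, whereas the dialog's "enable at startup" option re-creates it automatically.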
Permanent SAS datasets from Version 9.2 use the extension .sas7bdat

Using a permanent data set
Once a library has been set up and the file saved, then the file can be called by any
procedure as
    data = LIBRARYname.DATAname
The variable names are stored with the data, and no data step is needed to read a
permanent SAS file.

Backing up SAS programs
On a PC: use software to back up your work regularly.
On a Mac: From the Mac side, a Windows computer running inside Parallels or
VMware looks like one big file: the Mac can’t see individual files on the Windows
machine.
Mac backups, such as Time Machine, can’t restore individual files inside the PC.
=> Store SAS code and data on the Mac side using shared folders.

Why write code? What’s wrong with menus?
Many statistical packages allow one to perform an analysis simply by clicking
choices from menus. This is easy and often fast.
Problems arise later when you, or the next person in the job, needs to check a
result or a graph, or revise it:
there is no record of what data was used or what the menu choices were.
If your advisor asks, What are the sample sizes in this table?
or a journal editor asks for a change of units on your graph,
then you need to figure out how to recreate what you did (days or months ago).
Programs document the process of data editing and analysis.
This note was uploaded on 11/21/2011 for the course PUBH 6470 taught by Professor Williamthomas during the Fall '11 term at University of Florida.