Statistics 5106, Spring 2007
Notes 1

SAS Windows

The format for running SAS code is much the same on each platform. There are three main windows that organize the information. You type SAS code in the Editor window. After you run this code, any output appears in the Output window. The Log window contains information about the results of your code, including error messages.

SAS Datasets

A SAS dataset consists of observations (rows), variables (columns), and labels, all stored in a way intelligible to SAS software, but you can't look at it as you would a text file. Information in a SAS dataset is accessed through print procedures and summarized or analyzed in other statistics procedures.

For the first few days of this class we will create datasets, either by typing in values or by reading files. We'll learn how to save datasets for later use or transport, combine datasets, and use SAS datasets that are available from many sources on the web. Further, we'll learn how to use simple procedures for statistical summaries and plots of the data.

Part of the appeal of SAS is that it is widely used and very versatile. These properties, though, make it cumbersome to learn. Many people agree that it's a pain to learn, but good to know. The exposition in this class will be very hands-on, and we'll try to motivate every new item with examples. Most of the datasets at the beginning will be fake, constructed with a specific pedagogical intention, but later in the semester we will look at some interesting real data in SAS.

Data Step

To construct a SAS dataset you name the dataset, name the variables, and read in some values. The first example SAS code shown below creates a dataset called xyz containing information about employees at XYZ corporation. The name of the dataset is xyz; this is defined in the first line of the code. The second line defines the variables to be input into the dataset.
The five variables are called type, id, gender, yrstart, and salary. The first variable takes on character (rather than numerical) values: E=executive, S=support staff, and T=technical staff. The dollar sign after the variable name signals that this variable has character values; gender is followed by a dollar sign too, so it is also a character variable. The statement cards tells SAS, "here come the data." The spaces between the values allow SAS to determine which values to assign to which variables: of course they have to be in the same order as the list on the input line.

Note the semicolons all over the place. The semicolon means "this command is finished." It has to go after every command, and typically this is after every line, if we have one line per command. The one exception here is the block of data lines, which carry no semicolons; a single semicolon on its own line marks the end of the data.

    data xyz;
    input type $ id gender $ yrstart salary;
    cards;
    T 128188 M 1998 35.6
    S 119587 F 1999 28.7
    S 110744 F 1991 28.0
    T 139764 M 1995 40.5
    E 118152 M 1983 124.5
    ;
    run;

Type the above SAS code into the Editor window. To run it, just click on the little running figure in the icon bar along the top. Then look in the Log window to see what...
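
The notes mention that the contents of a SAS dataset are viewed through print procedures. As a minimal sketch of that step (not necessarily the procedure the course introduces next), the code below rebuilds the xyz dataset and then lists it. Two choices here are assumptions rather than part of the original notes: datalines is used as a synonym for cards, and proc contents is added only to confirm which variables are stored as character and which as numeric.

```sas
/* Rebuild the xyz dataset so this block runs on its own.      */
/* datalines is a synonym for cards (an assumption of this      */
/* sketch; the notes themselves use cards).                     */
data xyz;
  input type $ id gender $ yrstart salary;
  datalines;
T 128188 M 1998 35.6
S 119587 F 1999 28.7
S 110744 F 1991 28.0
T 139764 M 1995 40.5
E 118152 M 1983 124.5
;
run;

/* List every observation and variable in the Output window. */
proc print data=xyz;
run;

/* Show the variable names and types stored in the dataset;  */
/* type and gender should be reported as character.           */
proc contents data=xyz;
run;
```

If the data step ran cleanly, the Log window should report five observations and five variables for xyz, and the Output window should show the listing from proc print.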