Chapter 6 Subsetting and Combining Data Sets

STAT1303 Data Management 6. Subsetting and Combining Data Sets

6 Subsetting and Combining Data Sets

Occasionally, we do not want to use the whole data set and want to restrict the data that are placed into a data set. Especially for a huge data set, we may want to use only some of the available observations or only some of the variables. In this chapter, we go through some SAS techniques to perform these tasks.

6.1 Restricting Observations

6.1.1 Restricting Observations - INFILE Options

When raw data are read in by the INPUT statement in a DATA step, we can restrict observations based on their relative positions in the raw data. The OBS= and FIRSTOBS= options in the INFILE statement do the job.

The FIRSTOBS= option specifies the first observation to be read, e.g. FIRSTOBS=5 will skip the first 4 records and begin reading from the fifth record. The OBS= option specifies the LAST observation to be read, e.g. OBS=15 will stop reading data after the 15th record. Note that the OBS= option refers to an absolute observation number, not a relative number of observations. If we want to read records 5 to 15 inclusive, we may restrict the reading with FIRSTOBS=5 and OBS=15.

Example 6.1. Restrict reading to observations 5 through 15.

    * Example 6.1 - FIRSTOBS and OBS in INFILE;
    data trade;
      infile 'D:/temp/trade.dat' dlm=',' firstobs=5 obs=15;
      length country $56.;
      input country yr import export reexport;
    run;

    NOTE: 11 records were read from the infile 'd:\temp\trade.dat'.
          The minimum record length was 29.
          The maximum record length was 45.
    NOTE: The data set WORK.TRADE has 11 observations and 5 variables.

The SAS log reveals that there are 11 observations in the data set TRADE (records 5 through 15, i.e. 15 - 5 + 1 = 11), although there are 63 records in the raw data originally.

How about the following INFILE statements?

    INFILE datalines firstobs=101;
    INFILE datalines obs=100;
    INFILE datalines firstobs=50 obs=40;
    INFILE datalines obs=0;

Obviously, the above INFILE statements are invalid.
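The counting rule for FIRSTOBS= and OBS= described in Section 6.1.1 can be checked on a small scale with in-stream data. The following is a minimal sketch, not part of the original notes: the data set name DEMO, the variable ID, and the eight data lines are hypothetical values chosen for illustration.

```sas
/* Sketch: FIRSTOBS= and OBS= on in-stream data.          */
/* DEMO, ID and the eight records below are hypothetical. */
data demo;
  /* Start at record 3 and stop after record 5. */
  infile datalines firstobs=3 obs=5;
  input id;
  datalines;
1
2
3
4
5
6
7
8
;
run;

/* List the observations that were kept. */
proc print data=demo;
run;
```

Because OBS= is an absolute record number, DEMO should contain 5 - 3 + 1 = 3 observations, with ID equal to 3, 4 and 5.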
HKU STAT1303 (2011-12, Semester 1)

6.1.2 Restricting Observations - Data Set Options

If data are already in a SAS data set, a SET statement can copy the data to a new data set. To restrict the observations in the new data set, the FIRSTOBS= and OBS= options can be used as data set options on the original data set.

Example 6.2. Restrict the observations by data set options.

    * Example 6.2 - FIRSTOBS and OBS in data set options;
    data trade1;
      set trade (firstobs=6 obs=10);
    run;

In the above example, a new data set TRADE1 is created. This data set contains observations 6 through 10 of the data set TRADE.

On the other hand, the data set options can also be applied to a data set used in a SAS procedure.

Example 6.3. Restrict the observations by data set options in a PROC.
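The text of Example 6.3 is not available in this copy. As a stand-in, the following minimal sketch shows how the same data set options might be attached to a procedure's input data set. It is an illustration only, not the author's example, and it assumes the data set TRADE from Example 6.1 already exists in the WORK library; the choice of PROC PRINT is also an assumption.

```sas
/* Sketch only: not the original Example 6.3.               */
/* Assumes WORK.TRADE was created as in Example 6.1.        */
/* FIRSTOBS= and OBS= restrict what the procedure reads;    */
/* the data set TRADE itself is left unchanged.             */
proc print data=trade (firstobs=6 obs=10);
run;
```

Under these assumptions the listing should show only observations 6 through 10 of TRADE, without creating an intermediate data set such as TRADE1.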

This note was uploaded on 02/09/2012 for the course STAT 1301 taught by Professor Smslee during the Spring '08 term at HKU.