Chapter 8 Rearranging Data

STAT1303 Data Management 8. Rearranging Data

8 Rearranging Data

In Chapter 6, we illustrated how to combine SAS data sets, which may be created by SAS PROCs or by the users themselves. However, these techniques are not sufficient for data analysis. In this chapter, we expand the techniques used in Chapter 7, e.g. we expand the observations of a SAS data set into several data sets. To arrange relevant information within a single source data set, we may come across certain situations:

1. A single observation in a source data set forms multiple observations in the destination data set.
2. Multiple observations in a source data set form a single observation in the destination data set.
3. Convert variables into observations or observations into variables.
4. Split a single observation into many (one to many) by using the techniques of array and DO-loop.
5. Combine multiple observations into a single observation (many to one) by the RETAIN statement and the automatic variables FIRST.variable and LAST.variable.

Before the various techniques for rearranging data sets are introduced, two important techniques, looping and arrays, should be studied.

8.1 Converting Variables into Observations

Suppose we have a data set of observations that contain multiple occurrences of a medical diagnosis. Each observation represents one person, and a person may have multiple medical diagnoses. There are altogether over 500 different medical diagnoses (codes 001 – 500). We have two tasks in this example: (1) create a table that shows how often each diagnosis occurs, and (2) create a list of patients who have both of 2 specific diagnoses.

Example 8.1. Create a SAS data set DIAGS from the raw data file, where each diagnosis is represented by one of the variables DX1, DX2, ..., DX5.

* Example 8.1 - read in data;
001 328 138 412
002 116 440 082 368
003 153 428 442 340
004 359 146 410 299
005 428 442
006 092 488 162 210 086
007 308 113
008 142 158 403
009 074 ...

Note that to read in the variables DX1, DX2, DX3, DX4, DX5, we have used the abbreviation DX1-DX5, which stands for DX1 to DX5. This notation is useful for shortening a list of variables.

The first task is to create a table that shows how often each diagnosis occurs within the data set. The difficulty is that the diagnoses are stored in 5 different variables in the data set. To create the frequency table for diagnosis, there are three methods:

METHOD 1. The first method is to perform PROC FREQ on each of the 5 diagnosis variables and combine the results of the 5 tables.

Example 8.2.
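A minimal sketch of Method 1, not the original listing, might look as follows. It assumes the diagnosis codes are stored as character values so that leading zeros survive, and it rebuilds DIAGS from only the first eight complete sample records shown above. The data set names F1-F5 and ALLFREQ and the combined variable DX are hypothetical names chosen for illustration.

```sas
/* Sketch only: rebuild a small DIAGS from the first eight sample records. */
data diags;
    length id $3 dx1-dx5 $3;      /* character, so leading zeros are kept */
    infile datalines missover;    /* short records leave later DX variables blank */
    input id dx1-dx5;
    datalines;
001 328 138 412
002 116 440 082 368
003 153 428 442 340
004 359 146 410 299
005 428 442
006 092 488 162 210 086
007 308 113
008 142 158 403
;
run;

/* Method 1, step 1: one frequency table per diagnosis variable.        */
/* Each output data set renames its variable to the common name DX      */
/* (hypothetical) so that the five tables can be stacked afterwards.    */
proc freq data=diags noprint;
    tables dx1 / out=f1(rename=(dx1=dx));
    tables dx2 / out=f2(rename=(dx2=dx));
    tables dx3 / out=f3(rename=(dx3=dx));
    tables dx4 / out=f4(rename=(dx4=dx));
    tables dx5 / out=f5(rename=(dx5=dx));
run;

/* Method 1, step 2: stack the five tables and drop blank diagnoses. */
data allfreq;
    set f1 f2 f3 f4 f5;
    if dx ne ' ';
    keep dx count;
run;

/* Method 1, step 3: add up the per-variable counts for each code. */
proc freq data=allfreq;
    tables dx;
    weight count;
run;
```

The WEIGHT statement makes the last PROC FREQ treat COUNT as the number of occurrences each row represents, so a code that appears in several of the five variables is totalled across them. In the eight sample records used here, codes 428 and 442 each occur twice (patients 003 and 005) and every other code occurs once.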

This note was uploaded on 02/09/2012 for the course STAT 1301 taught by Professor Smslee during the Spring '08 term at HKU.
