Let us say that we want to concatenate two datasets, but this time, instead of the first dataset being followed by the second, the observations are sorted by a variable. The program

    DATA setexample3;
       SET second third;
       BY name;
    RUN;

creates a new dataset that has the same observations as setexample, but this time the new dataset is sorted by the variable name. So instead of Lucky (1st obs, 2nd dataset) following Vangie (last obs, 1st dataset), Lucky follows Igor. Note that each dataset must already be sorted by the BY variable; otherwise, the DATA step will yield an error.

Now, suppose there is another variable for each of the observations in setexample3, namely x3 = 1 (exercising) or 0 (no exercise). The following program creates a dataset containing this variable:

    DATA fourth;
       INPUT x3;
       DATALINES;
    1
    0
    0
    1
    1
    1
    ;
    RUN;

The following program creates a dataset which merges the datasets setexample and fourth:

    DATA mergeexample;
       MERGE setexample fourth;
    RUN;
    PROC PRINT DATA = mergeexample noobs;
    RUN;

Of course, the observations in the first dataset should match the observations in the second dataset, since a MERGE without a BY statement pairs the observations by position. So make sure that the new dataset makes sense.

There is a new option in the PROC PRINT statement, noobs, which suppresses the Obs column in the output table.

Choosing and Renaming Variables: KEEP, DROP and RENAME Statements

In summary, these statements do the following tasks:

KEEP: creates a dataset from an existing one, keeping only the specified variables.
DROP: creates a dataset from an existing one, dropping all the specified variables.
RENAME: renames an existing variable in the new dataset.

The program below creates a new dataset kdrexample from the existing mergeexample, but keeping only the variable name:
Course Notes in Statistical Softwares 107 | Daquis

    DATA kdrexample;
       SET mergeexample;
       KEEP name;
    RUN;
    PROC PRINT noobs;
    RUN;

Note that the “data = kdrexample” is omitted in the PROC PRINT step. If the dataset name is omitted in a PROC step, SAS processes the most recently created dataset.

In contrast, the DROP statement keeps only the unspecified variables:

    DATA kdrexample2;
       SET mergeexample;
       DROP x1;
    RUN;
    PROC PRINT; RUN;

This time, notice that the two lines in the PROC step are collapsed into one. This is absolutely okay, as long as there is a semicolon where it is supposed to be. You can even write the whole DATA step on a single line; again, each statement must end with a semicolon.

You can also have the DROP and KEEP statements as options in the DATA statement. For example,

    DATA kdrexample3 (KEEP = name) kdrexample4 (DROP = x1);
       SET mergeexample;
    RUN;

creates two datasets from mergeexample, wherein kdrexample3 is identical to kdrexample and kdrexample4 is identical to kdrexample2.

It is also possible to rename variables from an existing dataset in the newly created dataset. For example,

    DATA kdrexample5;
       SET mergeexample;
       KEEP name y;
       RENAME y = weight;
    RUN;

This DATA step creates a new dataset kdrexample5 which keeps the variables name and y from the mergeexample dataset, and in the new dataset y is renamed as weight. Here the variable weight takes the values of y.
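
Just like KEEP and DROP, the renaming can also be written as an option in the DATA statement. The program below is only a sketch of that idea: the dataset demo, its values, and the dataset name kdrexample6 are made up for this illustration and are not part of the datasets used earlier. Note that when KEEP = and RENAME = are used together, KEEP = is applied first, so it lists the old name y.

```sas
/* Hypothetical dataset, made up for this illustration only */
DATA demo;
   INPUT name $ x1 y;
   DATALINES;
Igor 1 60
Lucky 0 55
Vangie 1 58
;
RUN;

/* KEEP = is applied first (old names), then RENAME = changes y to weight */
DATA kdrexample6 (KEEP = name y RENAME = (y = weight));
   SET demo;
RUN;

PROC PRINT DATA = kdrexample6 noobs;
RUN;
```

The printed table should then contain only the columns name and weight. Writing these as options is mostly a matter of taste here. It becomes useful when one DATA step creates several datasets that each need different variables.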
