Date: Fri, 31 Jan 1997 21:40:25 EDT hermans1@WESTATPO.WESTAT.COM "SAS(r) Discussion" Sig Hermansen

>>>>Question about a method of recoding variables in SAS:

Date: Fri, 31 Jan 1997 14:15:04 EST
From: "Karen L. Olson, Ph.D." <OLSON_K@A1.TCH.HARVARD.EDU>
Subject: Looking for bilingual SPSS/SAS users

I never quite got the hang of recoding variables in SAS and don't often have to do it. However, now I do. Can someone briefly explain the SAS approach to recoding? I liked SPSS's RECODE command. What is the equivalent in SAS?

Here's an example of something I need to do, in SPSS syntax:

RECODE uica ueff (4 thru 9=1) (10,3=2) (2=3) (1=4).

How might I do the same in SAS?

Thanks,

Karen Olson
Children's Hospital, Boston
olson_k@a1.tch.harvard.edu

----------------- Response by Sig Hermansen -------------------------

SAS and SPSS take different approaches to the problem of recoding variables. SPSS changes the value stored and provides a simple method of doing that. SAS offers a convenient way to change the format of variables from one coding scheme to another for purposes of computing statistics, printing values, or comparing values. It takes some time and effort to get used to methods of applying formats to variables. Nonetheless, recoding using formats offers a number of advantages.

Starting with your example,

RECODE uica ueff (4 thru 9=1) (10,3=2) (2=3) (1=4).

has an equivalent format in SAS, here named rcode, defined by

    proc format;
       value rcode
          4-9   = '1'
          10, 3 = '2'
          2     = '3'
          1     = '4';
    run;

With this format, you can now compute frequencies of the recoded variables as

    proc freq;
       tables uica ueff;
       format uica ueff rcode.;
    run;

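(Note the period that ends the numeric format name: it is written rcode. with a trailing period, which marks it as a format rather than a variable name.)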
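To try the two steps above end to end, here is a minimal sketch. The data set name survey and the values in it are made up for illustration; only the format and the variable names uica and ueff come from the question.

```sas
/* Hypothetical sample data: the name "survey" and these values are
   illustrative assumptions, not part of the original question */
data survey;
   input uica ueff;
   datalines;
1 4
2 10
3 3
5 9
10 1
;
run;

/* Format from the post: maps the original codes to the recoded labels */
proc format;
   value rcode
      4-9   = '1'
      10, 3 = '2'
      2     = '3'
      1     = '4';
run;

/* PROC FREQ groups rows by the formatted values; the values stored
   in the data set are left unchanged */
proc freq data=survey;
   tables uica ueff;
   format uica ueff rcode.;
run;
```

Running PROC FREQ a second time without the FORMAT statement should show the original codes again, since nothing in the data set itself was altered.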

The same FORMAT statement works the same way in PROC PRINT and some other SAS procedures. You can also assign the formatted values to new variables by using the PUT function, as in

x=put(uica,rcode.);

for x of type character, or

x=input(put(uica,rcode.),8.);

for x of type numeric. Formats also let you compare, with greater confidence, two numeric codes that come from, say, data sources on different platforms. The comparison

if put(uica,rcode.) = put(ueff,rcode.)

may take care of differences in the precision of the stored numbers, because each value is mapped to its category label before the labels are compared.
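The assignments and the comparison can be put together in one DATA step. This is only a sketch: the data set name recoded, the new variable names uica_c, uica_n, and same_cat, and the sample values are assumptions chosen for illustration.

```sas
/* Format from the post */
proc format;
   value rcode
      4-9   = '1'
      10, 3 = '2'
      2     = '3'
      1     = '4';
run;

/* Hypothetical data set and variable names; sample values are made up */
data recoded;
   input uica ueff;

   /* Character version of the recoded value */
   uica_c = put(uica, rcode.);

   /* Numeric version: format to text, then read the text back as a number */
   uica_n = input(put(uica, rcode.), 8.);

   /* 1 when both variables fall in the same recoded category, else 0 */
   same_cat = (put(uica, rcode.) = put(ueff, rcode.));

   datalines;
4 9
10 3
2 1
4.5 8.2
;
run;

proc print data=recoded;
run;
```

With these sample rows, the pairs 4/9, 10/3, and 4.5/8.2 land in the same category, while 2/1 does not. The original uica and ueff columns stay in the data set next to the new variables. Values that no range in the format covers are not recoded, so it is worth checking for them before relying on the numeric conversion.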

Though a bit less direct, the formatting method makes it easy to switch from one coding scheme to another, to specify complex recoding schemes, and to differentiate coded values from the original values in the data sources. To assist, SAS provides a number of standard formats (and informats for input operations as well). All this serves an overriding principle of data management: whenever possible, avoid changing source data in a way that loses information you would need if the program has to be rerun.
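As an illustration of switching schemes, the sketch below defines a second format next to rcode and applies each in turn to the same stored values. The second format, rcodeb, its labels, the data set name survey, and the sample values are all hypothetical and not part of the original question.

```sas
/* Hypothetical sample data */
data survey;
   input uica;
   datalines;
1
2
3
5
10
;
run;

proc format;
   /* Scheme from the post */
   value rcode
      4-9   = '1'
      10, 3 = '2'
      2     = '3'
      1     = '4';

   /* Hypothetical alternative scheme, shown only to illustrate switching.
      User-defined format names cannot end in a digit, hence "rcodeb". */
   value rcodeb
      1-3  = 'low'
      4-10 = 'high';
run;

/* Same stored values, first coding scheme */
proc freq data=survey;
   tables uica;
   format uica rcode.;
run;

/* Same stored values, second coding scheme: only the FORMAT statement changes */
proc freq data=survey;
   tables uica;
   format uica rcodeb.;
run;
```

Because neither step rewrites uica, either scheme can be dropped or revised later without going back to the source data.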
