| Date: | Fri, 31 Jan 1997 21:40:25 EDT |
| Reply-To: | hermans1@WESTATPO.WESTAT.COM |
| Sender: | "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU> |
| From: | Sig Hermansen <hermans1@WESTATPO.WESTAT.COM> |
|---|
>>>>Question about a method of recoding variables in SAS:
Date: Fri, 31 Jan 1997 14:15:04 EST
From: "Karen L. Olson, Ph.D." <OLSON_K@A1.TCH.HARVARD.EDU>
Subject: Looking for bilingual SPSS/SAS users
I never quite got the hang of recoding variables in SAS and don't often
have to do it. However, now I do. Can someone explain briefly the
SAS approach to recoding. I liked SPSS's recode statement. What is the
equivalent in SAS?
Here's an example of something I need to do, in SPSS syntax:
RECODE uica ueff (4 thru 9=1) (10,3=2) (2=3) (1=4).
How might I do the same in SAS?
Thanks,
Karen Olson
Children's Hospital, Boston
olson_k@a1.tch.harvard.edu
----------------- Response by Sig Hermansen -------------------------
SAS and SPSS take different approaches to the problem of
recoding variables. SPSS changes the value stored and provides a
simple method of doing that. SAS offers a convenient way to
change the format of variables from one coding scheme to another
for purposes of computing statistics, printing values, or
comparing values. It takes some time and effort to get used to
methods of applying formats to variables. Nonetheless, recoding
using formats offers a number of advantages.
Starting with your example,
RECODE uica ueff (4 thru 9=1) (10,3=2) (2=3) (1=4).
has an equivalent format, rcode, in SAS, defined by
proc format;
   value rcode 4-9  = '1'
               10,3 = '2'
               2    = '3'
               1    = '4'
   ;
run;
With this format, you can now compute frequencies of the
recoded variables as
proc freq;
   tables uica ueff;
   format uica ueff rcode.;
run;
(Note the period that ends the format name: wherever you apply the
format, write it as rcode. with the trailing period.)
The same format statement will work the same way in a proc
print and some other SAS procedures. You can also assign the
formatted values to new variables by using the put function, as
in
x=put(uica,rcode.);
for x of type character, or
x=input(put(uica,rcode.),8.);
for x of type numeric. You can also compare two numeric codes
with greater confidence when they come from, say, data sources on
different platforms. The comparison
if put(uica,rcode.) = put(ueff,rcode.)
may take care of differences in the precision in the stored
numbers.
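
Putting the pieces above together, here is a minimal sketch of a
complete program. The data set names survey and recoded, the new
variable names, and the sample values are made up for illustration;
only uica, ueff, and the rcode format come from the example above.

```sas
/* Hypothetical sample data; only the variable names uica and ueff
   come from the original question. */
data survey;
   input uica ueff;
   datalines;
1 4
2 2
3 10
7 9
;
run;

/* The recoding scheme, expressed as a format. */
proc format;
   value rcode 4-9  = '1'
               10,3 = '2'
               2    = '3'
               1    = '4';
run;

data recoded;
   set survey;
   length uica_c $1;
   uica_c = put(uica, rcode.);               /* character copy      */
   uica_n = input(put(uica, rcode.), 8.);    /* numeric copy        */
   ueff_n = input(put(ueff, rcode.), 8.);
   /* 1 when both variables fall in the same recoded category */
   same   = (put(uica, rcode.) = put(ueff, rcode.));
run;

proc print data=recoded;
run;
```

The original uica and ueff stay in the data set untouched, next to
the recoded copies. Values that the format does not list (0 or a
missing value, for instance) are not recoded, so it may be worth
adding an other= range to the format if you want to catch them.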
Though a bit less direct, the formatting method makes it easy to
switch from one coding scheme to another, to specify complex
recoding schemes, and to differentiate coded values from the
original values in the data sources. To assist, SAS provides a
number of standard formats (and informats for input operations as
well). All this serves an overriding principle of data
management: avoid when possible changing source data in a way
that will cause a loss of information if it becomes necessary to
restart a program.
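
As a rough illustration of switching coding schemes without touching
the stored values, the sketch below applies a second format to the
same variables. The format name rgroup, its 'low'/'high' labels, and
the demo data set are assumptions invented for this example, not
part of the original question.

```sas
/* Hypothetical sample data for illustration only. */
data demo;
   input uica ueff;
   datalines;
1 4
2 2
3 10
7 9
;
run;

proc format;
   /* Original scheme from the example above */
   value rcode  4-9  = '1'
                10,3 = '2'
                2    = '3'
                1    = '4';
   /* Hypothetical alternative scheme with two groups */
   value rgroup 1-3,10 = 'low'
                4-9    = 'high';
run;

/* Same data, first scheme */
proc freq data=demo;
   tables uica ueff;
   format uica ueff rcode.;
run;

/* Same data, second scheme: only the format statement changes */
proc freq data=demo;
   tables uica ueff;
   format uica ueff rgroup.;
run;
```

Because the recoding lives in the format rather than in the data,
the source values remain available for any later scheme.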