LISTSERV at the University of Georgia
Date:         Wed, 7 May 2003 15:19:42 -0400
Reply-To:     Richard Ristow <wrristow@mindspring.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Richard Ristow <wrristow@mindspring.com>
Subject:      RECODE techniques
In-Reply-To:  <seb8f262.039@gwmail.harthosp.org>
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 11:47 AM 5/7/2003 -0400, Sheryl Horowitz wrote, in posting "Re: SPSS syntax for SF-12":

>This is not very elegant but it is a direct translation from the SAS program.

I'm taking the code as a springboard to comment on some techniques for using RECODE. This is NOT a comment on using the code that Sheryl Horowitz posted. There are very good reasons to run code that works, but has modest deficiencies, rather than doing major rewriting and debugging for an improvement that may not actually benefit you much.

>* Cleaning /Reverse scoring.
>**********************************************.
>RECODE
>  RP2 RP3 RE2 RE3 (1=1) (2=2) (3 thru Highest=SYSMIS) (Lowest thru
>  .9999=SYSMIS) .
>EXECUTE .
>
>RECODE
>  PF02 PF04 (1=1) (2=2) (3=3) (Lowest thru .99 =SYSMIS) (3.01 thru
>  Highest=SYSMIS) .
>EXECUTE .

First off, it is *NOT* a good idea to put an EXECUTE after each RECODE. Every EXECUTE forces a full pass through the data file, and that is one of the slower operations you will encounter. If you just write the RECODEs, they will all be performed in *one* pass through the data, and on a large file the difference in speed can be very noticeable.
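
As a sketch of the point above, here are the two quoted RECODEs with a single EXECUTE at the end. It assumes an active data file that already contains these variables; the recode ranges are left exactly as posted, since only the placement of EXECUTE is being illustrated.

```spss
* Both transformations are queued, then carried out in one data pass.
RECODE RP2 RP3 RE2 RE3
  (1=1) (2=2) (3 THRU HIGHEST = SYSMIS) (LOWEST THRU .9999 = SYSMIS).
RECODE PF02 PF04
  (1=1) (2=2) (3=3) (LOWEST THRU .99 = SYSMIS) (3.01 THRU HIGHEST = SYSMIS).
EXECUTE.
```

If a procedure such as FREQUENCIES follows the transformations, even the final EXECUTE is unnecessary, because the procedure itself reads the data and carries out the pending transformations.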

Second, if you're trying to recode a range but leave out an endpoint, as in "(Lowest thru .9999=SYSMIS)", using a value like .9999 to mean "the smallest number less than 1" is inelegant, and can cause errors -- what if .99995 occurs? or .99999999987?

RECODE clauses are taken in the order written, and a value is recoded by the first clause that matches it, so you can use the whole range even if its endpoint has been recoded before. In the statement

>RECODE
>  RP2 RP3 RE2 RE3 (1=1) (2=2) (3 thru Highest=SYSMIS)
>  (Lowest thru .9999=SYSMIS) .

the apparent intent is to keep values 1 and 2 as valid, and make most others system-missing. Since 1 has already been recoded (to itself, but that doesn't matter), you can make all smaller numbers missing with the clause "(Lowest thru 1 = SYSMIS)".

That still leaves fractional values between 1 and 3. If you want to keep values 1 and 2, and eliminate ALL others, the best form is

RECODE RP2 RP3 RE2 RE3 (1=1) (2=2) (ELSE = SYSMIS).
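
To see the difference, here is a small sketch with made-up test values. The variable names X, X_GAP and X_ELSE and the data are hypothetical, chosen only for illustration; they are not part of the SF-12 code.

```spss
* Hypothetical test values, including ones the .9999 cutoff misses.
DATA LIST FREE / X.
BEGIN DATA
0.5 0.99995 1 1.5 2 3 7
END DATA.

* Work on copies so the original values stay visible in the listing.
COMPUTE X_GAP = X.
COMPUTE X_ELSE = X.
FORMATS X X_GAP X_ELSE (F8.5).

* Range version with the .9999 cutoff, as in the quoted statement.
RECODE X_GAP
  (1=1) (2=2) (3 THRU HIGHEST = SYSMIS) (LOWEST THRU .9999 = SYSMIS).

* ELSE version, which keeps only the values 1 and 2.
RECODE X_ELSE (1=1) (2=2) (ELSE = SYSMIS).

LIST.
```

In the listing, X_GAP should still hold 0.99995 and 1.5, because no clause matches them and an in-place RECODE leaves unmatched values unchanged, while X_ELSE should hold only 1 and 2, with everything else system-missing.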

This also comes up in range recodings, for example for age. It's common, having calculated age as years and fractions, to write something like

RECODE AGE (1 THRU 14.9 = 1) (15 THRU 19.9 = 2) [etc.] INTO AGE_RNG.

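That leaves gaps: an age of 14.95, for instance, falls into no range at all. However, because the clauses are applied in the order written, you can recode with no gaps, while including the low point of each range within the range, by listing the ranges from the top down: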

RECODE AGE (65 THRU HI = 9) (50 THRU 65 = 8) (40 THRU 50 = 7) [etc.] INTO AGE_RNG.
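
A self-contained sketch of the same idea follows. The cut points (40, 50, 65) and the codes 1 to 4 are hypothetical and chosen only to keep the example short; they do not fill in the "[etc.]" above.

```spss
* Hypothetical ages, including values at and just below each cut point.
DATA LIST FREE / AGE.
BEGIN DATA
14.95 39.99 40 50 64.99 65 80
END DATA.

* Ranges are listed from the top down, and the first matching clause wins.
* So 65 is coded 4 (not 3), 50 is coded 3, and 40 is coded 2.
RECODE AGE
  (65 THRU HI = 4)
  (50 THRU 65 = 3)
  (40 THRU 50 = 2)
  (LO THRU 40 = 1)
  INTO AGE_RNG.

LIST.
```

The listing should show AGE_RNG as 1, 1, 2, 3, 3, 4, 4 for the seven ages in order: every age lands in exactly one range, and each cut point belongs to the range that starts there.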

