LISTSERV at the University of Georgia
Date:         Thu, 20 Apr 2006 16:56:57 -0500
Reply-To:     "Peck, Jon" <peck@spss.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         "Peck, Jon" <peck@spss.com>
Subject:      Re: Combining categories
Comments: To: Bob Schacht <schacht@hawaii.edu>
Content-Type: text/plain; charset="UTF-8"

It is true that RECODE ranges apply only to numeric variables, but you could create a new variable and recode the exception cases into it in the standard way. That is, copy the variable and recode the copy into itself (or use the ELSE = COPY construct). Just use the DELETE VARIABLES command when you no longer want the new variable, since it would not be temporary.
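For readers who want to see the ELSE = COPY idea outside SPSS syntax, here is a minimal Python sketch of the same logic: only the exception codes get a rule, and every other value is copied through unchanged into a new variable. The names POOL, recode_copy, educ_lvl and educ_pooled are hypothetical, and the particular pooling shown ('0' into '1', '3' into '4', '8' into '7') is only an assumed example, not a recommendation.

```python
# Hypothetical pooling rules: only the exception codes are listed.
POOL = {"0": "1", "3": "4", "8": "7"}


def recode_copy(values, mapping):
    """Return a new list: mapped where a rule exists, copied otherwise.

    This mirrors recoding INTO a new variable with ELSE = COPY;
    the original list is left untouched.
    """
    return [mapping.get(v, v) for v in values]


# Made-up string codes standing in for the original variable.
educ_lvl = ["0", "1", "3", "4", "5", "8", "7"]
educ_pooled = recode_copy(educ_lvl, POOL)
print(educ_pooled)  # ['1', '1', '4', '4', '5', '7', '7']
```

Because the codes stay strings throughout, no numeric range such as THRU HI is needed; each pooled code is simply listed explicitly.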

What I had in mind was a general facility that collapses rare values into nearby values or into a separate category. That would be a nice tool but, of course, such a recode would not incorporate the sort of logic you are describing.
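As a rough illustration of what such a rarity-based facility might look like, here is a sketch in plain Python. It is not the spssaux2 code and not an existing SPSS function; the function name merge_rare, its parameters, and the sample counts are all assumptions made up for this example. It uses an absolute count threshold; for ordinal data a rare value is merged into the next non-rare category (or the previous one if there is none), and otherwise into a single "other" category.

```python
from collections import Counter


def merge_rare(values, order, min_count, ordinal=True, other="other"):
    """Build a mapping from each category to its pooled category.

    values    -- observed category codes (strings here)
    order     -- all categories, in their assumed ordinal order
    min_count -- categories seen fewer times than this count as rare
    """
    counts = Counter(values)
    rare = {cat for cat in order if counts[cat] < min_count}
    mapping = {}
    for i, cat in enumerate(order):
        if cat not in rare:
            mapping[cat] = cat
        elif not ordinal:
            mapping[cat] = other
        else:
            # Prefer the next non-rare category, else the previous one.
            later = [c for c in order[i + 1:] if c not in rare]
            earlier = [c for c in reversed(order[:i]) if c not in rare]
            mapping[cat] = later[0] if later else (earlier[0] if earlier else cat)
    return mapping


# Made-up sample: codes '0', '3' and '8' are rare.
sample = (["0"] * 1 + ["1"] * 5 + ["2"] * 4 + ["3"] * 1 + ["4"] * 6
          + ["5"] * 4 + ["6"] * 3 + ["7"] * 5 + ["8"] * 2)
order = list("012345678")

ordinal_map = merge_rare(sample, order, min_count=3)
nominal_map = merge_rare(sample, order, min_count=3, ordinal=False)
pooled = [ordinal_map[v] for v in sample]
```

A percentage threshold could be handled by converting it to a count first (for example, the fraction times len(values)). The sketch does not combine value labels, and it applies one mechanical rule everywhere, so it cannot express case-by-case judgments about which categories belong together.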

Regards, Jon Peck

-----Original Message-----
From: Bob Schacht [mailto:schacht@hawaii.edu]
Sent: Thursday, April 20, 2006 4:26 PM
To: Peck, Jon; SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Combining categories

At 02:26 AM 4/20/2006, Peck, Jon wrote:
>To go back to the original question, I surmise that what is being asked
>for is a canned transformation that would recode together small category
>values into a new variable, optionally respecting an ordinality property,
>and combining the value labels of the merged categories. It would need to
>work off of an absolute or percentage threshold for the meaning of
>small. If ordinal, I suppose it would merge values into the next or
>previous category while if not ordinal, it might create one "other"
>category with all of these together.
>
>Have I got that right?

Jon,
Thanks for the response. Unfortunately, you don't have it quite right. Here are the actual categories:

EDUC_LVL
  '0' 'No formal school'
  '1' 'Grades 1-8'
  '2' 'Grades 9-12 no diploma'
  '3' 'Spec Ed Cert/Diploma'
  '4' 'H.S. Grad or GED'
  '5' 'Post-secondary, no degree'
  '6' 'Assoc degree/VocTech Cert'
  '7' 'Bachelors Degree'
  '8' 'Masters or more'/

It is presently coded as a string variable, and is mostly ordinal, except that category '3' is really an alternative branch. Some will argue that it is "the same as" graduating from HS or getting a GED; others will argue that it's not even on par with Grade 12, even without a diploma. Also, if someone has 3 years of college and then drops out before getting a degree, is that "less than" someone with a 2-year associate degree? So this variate is approximately ordinal but not quite; and besides, I've got it presently defined as a string variable.

My problem was that categories '0', '3', and '8' are relatively rare. '0' combines logically with '1' with the meaning "less than 8 full years of school," while '3' combines easily with '4' because all are certificates of completion at approximately the same level. '6' combines logically either with '5' or with '7' and '8'.

Richard Ristow suggested that the pooling could be done relatively easily using recode as follows:
>To make it some easier, use TEMPORARY. If
>
>CROSSTABS ED_LEVL BY ...
>   /STATISTICS = CHISQ
>
>produces small cells, try something like this:
>If, say,
>+ high school=4, GED = 5;
>+ Masters=7, other advanced degrees have higher codes;
>then,
>
>TEMPORARY.
>RECODE ED_LEVL
>   (5 = 4)
>   (7 THRU HI = 7).
>CROSSTABS ED_LEVL BY ...
>   /STATISTICS = CHISQ

But doesn't TEMPORARY apply ONLY to the next procedure, which would have the result that it would apply only to the recode, but then forget the recode when it does the CROSSTAB?

And also, this assumes that ED_LEVL is a number, but I have it as a string variable. I can't use the "THRU HI" with a string variable, can I?

Thanks, Bob

>You could do this with a combination of AGGREGATE and other transformation
>logic, but it is a natural for (surprise) programmability. The spssaux2
>module on the SPSS Code Center (forums.spss.com/code_center) has an
>example of a similar sort of complex calculation. The
>CreateBasisVariables function creates a set of dummy variables
>representing the distinct values of a variable suitable for use in
>Regression etc. I can cook up a similar method for merging small values.
>
>Regards,
>Jon Peck

Robert M. Schacht, Ph.D. <schacht@hawaii.edu>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814

