Date: Thu, 20 Apr 2006 11:26:25 -1000
Reply-To: Bob Schacht <schacht@hawaii.edu>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Bob Schacht <schacht@hawaii.edu>
Subject: Re: Combining categories
In-Reply-To: <5CFEFDB5226CB54CBB4328B9563A12EE045271A6@hqemail2.spss.com>
Content-type: text/plain; charset=us-ascii; format=flowed
At 02:26 AM 4/20/2006, Peck, Jon wrote:
>To go back to the original question, I surmise that what is being asked
>for is a canned transformation that would recode together small category
>values into a new variable, optionally respecting an ordinality property,
>and combining the value labels of the merged categories. It would need to
>work off of an absolute or percentage threshold for the meaning of
>small. If ordinal, I suppose it would merge values into the next or
>previous category while if not ordinal, it might create one "other"
>category with all of these together.
>
>Have I got that right?
Jon,
Thanks for your response. Unfortunately you don't have it quite right. Here
are the actual categories:
EDUC_LVL
'0' 'No formal school'
'1' 'Grades 1-8'
'2' 'Grades 9-12 no diploma'
'3' 'Spec Ed Cert/Diploma'
'4' 'H.S. Grad or GED'
'5' 'Post-secondary, no degree'
'6' 'Assoc degree/VocTech Cert'
'7' 'Bachelors Degree'
'8' 'Masters or more'
It is presently coded as a string variable, and is mostly ordinal, except
that category '3' is really an alternative branch. Some will argue that it
is "the same as" graduating from HS or getting a GED; others will argue
that it's not even on par with completing Grade 12 without a diploma. Also,
if someone has 3 years of college and then drops out before getting a
degree, is that "less than" someone with a 2-year associate degree? So this
variable is approximately ordinal but not quite; and besides, it is
presently defined as a string variable.
My problem was that categories '0', '3', and '8' are relatively rare. '0'
combines logically with '1' with the meaning "less than 8 full years of
school," while '3' combines easily with '4' because all are certificates of
completion at approximately the same level. '6' combines logically either
with '5' or with '7' and '8'.
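A minimal sketch of that pooling for a string variable, written as SPSS
syntax. It is untested against this data and rests on assumptions:
EDUC_POOL is a hypothetical new variable name, EDUC_LVL is assumed to be one
character wide (A1), and only the two merges that are clear-cut above are
shown. Categories '6' and '8' are left alone because the choice of partner
for '6' is still open.

```spss
* Sketch only: EDUC_POOL is a hypothetical name and the A1 width is assumed.
STRING EDUC_POOL (A1).

* Pool '0' into '1' and '3' into '4', and copy every other code unchanged.
RECODE EDUC_LVL
  ('0' = '1')
  ('3' = '4')
  (ELSE = COPY)
  INTO EDUC_POOL.

* Check the pooled counts before using the variable in a table.
FREQUENCIES VARIABLES = EDUC_POOL.
```

Writing into a new variable leaves EDUC_LVL intact, so a different pooling
(for example, merging '6' upward or downward) can be tried without reloading
the data.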
Richard Ristow suggested that the pooling could be done relatively easily
using recode as follows:
>To make it some easier, use TEMPORARY. If
>
>CROSSTABS ED_LEVL BY ...
> /STATISTICS = CHISQ
>
>produces small cells, try something like this:
>If, say,
>+ high school=4, GED = 5;
>+ Masters=7, other advanced degrees have higher codes;
>then,
>
>TEMPORARY.
>RECODE ED_LEVL
> (5 = 4)
> (7 THRU HI = 7).
>CROSSTABS ED_LEVL BY ...
> /STATISTICS = CHISQ
But doesn't TEMPORARY apply ONLY to the next procedure? If so, it would
cover only the RECODE, and the recode would already be forgotten by the
time CROSSTABS runs.
Also, this assumes that ED_LEVL is numeric, but I have it as a string
variable. I can't use "THRU HI" with a string variable, can I?
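For reference, a hedged sketch of how both questions might play out in
syntax. As the SPSS documentation describes it, RECODE is a transformation
rather than a procedure, so transformations placed after TEMPORARY stay in
effect through the next procedure (here CROSSTABS) and are discarded
afterward. THRU and HI are documented for numeric variables only, so the
sketch first makes a numeric copy with the CONVERT keyword. EDUC_NUM is a
hypothetical variable name, the (7 THRU HI = 7) merge is only an
illustration, and the "BY ..." placeholder is carried over from the quoted
example and must be filled in before this will run.

```spss
* Sketch only: EDUC_NUM is a hypothetical numeric copy of the string codes.
RECODE EDUC_LVL (CONVERT) INTO EDUC_NUM.

* The recode below lasts only through the CROSSTABS that follows it.
TEMPORARY.
RECODE EDUC_NUM
  (0 = 1)
  (3 = 4)
  (7 THRU HI = 7).
CROSSTABS EDUC_NUM BY ...
  /STATISTICS = CHISQ.
```

After CROSSTABS finishes, EDUC_NUM should again hold the unpooled codes 0
through 8, so other groupings can be tested the same way.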
Thanks,
Bob
>You could do this with a combination of AGGREGATE and other transformation
>logic, but it is a natural for (surprise) programmability. The spssaux2
>module on the SPSS Code Center (forums.spss.com/code_center) has an
>example of a similar sort of complex calculation. The
>CreateBasisVariables function creates a set of dummy variables
>representing the distinct values of a variable suitable for use in
>Regression etc. I can cook up a similar method for merging small values.
>
>Regards,
>Jon Peck
Robert M. Schacht, Ph.D. <schacht@hawaii.edu>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814