Date: Wed, 5 Oct 2005 21:37:10 -0400
Reply-To: Richard Ristow <wrristow@mindspring.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Richard Ristow <wrristow@mindspring.com>
Subject: Re: Transform multiple response variable
In-Reply-To: <200510052218.j95L31UQ024401@malibu.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"; format=flowed
At 06:18 PM 10/5/2005, SAM wrote:
>I have 10 categorical variables [which] correspond to 10 questionnaire
>items phrased as: list the top 10 actors of all time.
>
>Variable 1 records {all the names offered by the subjects for the
>greatest actor}
No, and the distinction is important: it records each subject's choice
for the greatest actor. That is: it's not an aggregate; it's an
individual value in each record, which together make up the aggregate.
>variable 2 [records the subject's choice] for the second greatest,
>etc. Each variable [has] about 200 categories (unique
>names/responses). I want to create a summary variable for the top 20
>names (categories), such that if, say, respondent x named Cary Grant
>as the greatest actor of all time (recorded in v1), I want to store in
>a new CaryGrant variable a score of 10. If Cary Grant was named in v2,
>the value should be, in the same CaryGrant variable, 9, v3-8, v4-7,
>etc, to v10-1.
No good. You're talking about 200 different variables, one for each
actor. It's much, much easier to 'unroll' your data, to one record per
subject per response.
If your data has 11 variables, SUBJECT (identifier for the respondent)
and CHOICE01 to CHOICE10, you'll write a new, 'unrolled' file with
variables
   SUBJECT - as before
   ACTOR   - The actor named
   RANK    - Where the subject ranked the actor, 1-10
   SCORE   - Score for the ranking: 11-RANK.
>Obviously, no actor can be recorded more than once
(Wryly) Far from obvious. None should be, but something being
logically contradictory doesn't mean it can't happen. The 'unroll'
logic will ignore this and write records for duplicates, so in theory a
subject could give Cary Grant a total score of 55 by naming him in all
ten places. You can put in error checking for this, if you like.
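A minimal, untested sketch of such a check, run on the original file
before unrolling. It assumes the variable names SUBJECT and CHOICE01 to
CHOICE10 used in this post, and that the choices are numeric codes; the
vector name CHK and the variable NDUPS are made up for illustration. It
counts, for each subject, the pairs of choices that name the same actor,
and lists the subjects that have any.

```spss
/* Count pairs of choices that name the same actor.    */
/* CHK and NDUPS are hypothetical names.               */
/* A comparison involving a missing value is missing,  */
/* so unanswered choices are never counted.            */
VECTOR  CHK = CHOICE01 TO CHOICE10.
NUMERIC NDUPS (F2).
COMPUTE NDUPS = 0.
LOOP #I = 1 TO 9.
.  LOOP #J = #I + 1 TO 10.
.     IF (CHK(#I) EQ CHK(#J)) NDUPS = NDUPS + 1.
.  END LOOP.
END LOOP.
/* List the subjects with at least one duplicate */
TEMPORARY.
SELECT IF NDUPS GT 0.
LIST SUBJECT CHOICE01 TO CHOICE10 NDUPS.
```

NDUPS stays in the working file afterward; it does no harm there, since
the unrolled file keeps only the variables it names.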
Anyway, for each record that names 10 actors, you'll write 10 records
that each name one actor. I'm not testing the code, but like this. (I
assume you can write scratch files freely to c:\TMP.)
/* Create the UNROLLED file, with variables */
/* SUBJECT ACTOR RANK SCORE */
/* (See XSAVE statement) */
VECTOR  ACTORS=CHOICE01 TO CHOICE10.
NUMERIC RANK   (F2)
        /ACTOR (F4) /* same format as CHOICE01,...*/
        /SCORE (F2).
LOOP RANK = 1 TO 10.
.  DO IF NOT MISSING(ACTORS(RANK)).
.     COMPUTE ACTOR = ACTORS(RANK).
.     COMPUTE SCORE = 11 - RANK.
.     XSAVE OUTFILE='c:\TMP\UNROLLED.SAV'
         /KEEP = SUBJECT ACTOR RANK SCORE.
.  END IF.
END LOOP.
EXECUTE.
/* Load the unrolled file */
GET FILE='c:\TMP\UNROLLED.SAV'.
/* Get the total score for each actor */
AGGREGATE OUTFILE=*
   /BREAK=ACTOR
   /MENTION 'Number of times mentioned'  = N
   /TOTAL   'Total score, from rankings' = SUM(SCORE).
/* Sort, from highest to lowest score */
SORT CASES BY TOTAL(D) ACTOR (A).
/* List the top 20 */
LIST ACTOR TOTAL
   /CASES TO 20.
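
If your release of SPSS has VARSTOCASES, the unrolling step might also
be done without the scratch file. This is an untested sketch of that
alternative, not the LOOP/XSAVE method above. It assumes the original
file (SUBJECT, CHOICE01 to CHOICE10) is the working file.

```spss
/* Alternative unroll: one record per subject per choice */
VARSTOCASES
   /MAKE  ACTOR FROM CHOICE01 TO CHOICE10
   /INDEX = RANK(10)
   /KEEP  = SUBJECT
   /NULL  = DROP.
/* Score for the ranking, as before */
COMPUTE SCORE = 11 - RANK.
FORMATS RANK SCORE (F2).
```

The AGGREGATE, SORT CASES and LIST steps would then run on the working
file as written, skipping the GET FILE.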