| Date: | Thu, 20 May 2004 19:58:04 -0400 |
| Reply-To: | Richard Ristow <wrristow@mindspring.com> |
| Sender: | "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU> |
| From: | Richard Ristow <wrristow@mindspring.com> |
| Subject: | Re: Restructuring data |
| In-Reply-To: | <000701c43e9d$a48bcef0$40aad80a@wilkeodense.local> |
| Content-Type: | text/plain; charset="us-ascii"; format=flowed |
|---|
At 03:07 PM 5/20/2004, Henrik Clausen wrote:
>I have some data in the following form:
>
> v1 v2 v3 v4 v5 v6
>1 x1 y1 z1 x2 y2 z2
>2 x3 y3 z3 x4 y4 z4
>
>I would like the data to be restructured into the following way:
>
> v1 v2 v3
>1 x1 y1 z1
>2 x2 y2 z2
>3 x3 y3 z3
>4 x4 y4 z4

So, you have what are two 'records' in one, and want to separate them.
Rebecca Hetter's suggestion will work fine, though if you had many
records to split from each case (instead of only two) it would get
awkward.

Another standard approach uses LOOP and XSAVE. This is from my posting
"Re: collect cases", Wed, 5 May 2004 22:26:32 -0400, unmodified -- that
instance had 250 records to split from each case. Note the stricture:
the original pairs were probably combined for a reason, so when you
split them you don't want to lose which record was in which pair, or
which place it held in its pair.

>>I have a data set like this;
>>
>>a1 u1 o1 a2 u2 o2 . . . a250 u250 o250
>>
>>30 45 45 56 58 45 . . . 21 52 34
>>40 34 54 23 52 34 . . . 25 63 45
>>50 56 49 38 25 65 . . . 52 49 67
>>
>>I wish to collect all cases within all "a" variables under varA, "u"
>>variables under varU, and "o" variables under varO.
>
>So, it sounds like you want something like this:
>varA varU varO
> 30 45 45
> 56 58 45
>etc.
>
>That's often a good idea, but a VERY BAD one unless the file has an
>identifying variable to indicate which original case each record
>comes from, AND each record includes the index (1-250) of its triplet
>of data values.
>
>You can do what you want with VARSTOCASES, I think, but for fun,
>here's older logic, using VECTOR/LOOP/XSAVE. The original case number
>is the global ID, and the index is the secondary ID. Code is NOT TESTED.
>
>NUMERIC CASE_ID (F5)
> /VAR_IDX (F3)
> /VARA VARU VARO (F2).
>VARIABLE LABELS
> CASE_ID 'Original case identifier: $CASENUM from input file'
> VAR_IDX 'Position of current set of data, within original case'
> VarA 'Value of A-variable, for current case and index'
> VarU 'Value of U-variable, for current case and index'
> VarO 'Value of O-variable, for current case and index'.
>
>VECTOR DATA_VAL=A1 TO O250.
>COMPUTE CASE_ID = $CASENUM.
>LOOP VAR_IDX = 1 TO 250.
>. COMPUTE #BASE = 3*(VAR_IDX - 1).
>. COMPUTE #SUBSCR = #BASE + 1.
>. COMPUTE VarA = DATA_VAL(#SUBSCR).
>. COMPUTE #SUBSCR = #BASE + 2.
>. COMPUTE VarU = DATA_VAL(#SUBSCR).
>. COMPUTE #SUBSCR = #BASE + 3.
>. COMPUTE VarO = DATA_VAL(#SUBSCR).
>. XSAVE OUTFILE='c:\MY_SPSS\COLLECT.SAV'
> /KEEP= CASE_ID VAR_IDX VarA VarU VarO.
>END LOOP.
>EXECUTE /* Here's a case where EXECUTE is needed */.
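
For the two-records-per-case layout in the original question, the same logic might be adapted roughly as follows. This is only a sketch and, like the code above, NOT TESTED. It assumes v1 to v6 are numeric and adjacent in the file; CASE_ID, REC_IDX, NEW1 to NEW3, the vector names, and the output file name are all names chosen here for illustration.

```spss
* Sketch only, NOT TESTED.
* Assumes v1 TO v6 are numeric and contiguous in the active file.
* CASE_ID, REC_IDX, NEW1-NEW3 and the output path are illustrative.
NUMERIC CASE_ID (F5)
       /REC_IDX (F1)
       /NEW1 NEW2 NEW3 (F8.2).
VECTOR DATA_VAL = v1 TO v6.
VECTOR OUT_VAL  = NEW1 TO NEW3.
* Keep the original case number so each pair can be reassembled.
COMPUTE CASE_ID = $CASENUM.
LOOP REC_IDX = 1 TO 2.
.  LOOP #J = 1 TO 3.
.     COMPUTE OUT_VAL(#J) = DATA_VAL(3*(REC_IDX - 1) + #J).
.  END LOOP.
.  XSAVE OUTFILE='c:\MY_SPSS\SPLIT.SAV'
      /KEEP = CASE_ID REC_IDX NEW1 NEW2 NEW3.
END LOOP.
EXECUTE.
```

The output variables get new names rather than v1 to v3, since those names are already taken by the input variables; they can be renamed after opening the saved file. CASE_ID and REC_IDX together record which original case, and which place within it, each new record came from.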