Date: Fri, 16 Sep 2005 16:30:44 -0400
Reply-To: Richard Ristow <wrristow@mindspring.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Richard Ristow <wrristow@mindspring.com>
Subject: Re: Reformatting data
In-Reply-To: <8AE883D86DBAAF4E9924E984A06927F24DEFF4@mcintire6.comm.virginia.edu>
Content-Type: text/plain; charset="us-ascii"; format=flowed
At 11:54 AM 9/16/2005, Baglioni, Tony wrote:
>I've been using SPSS for too many years but I've never needed to do a
>major reformatting of my data (just lucky, I reckon). I have eleven
>years of observations on about fifteen variables. Unfortunately, the
>data are stacked on separate lines rather than strung out on a single
>line. I know some combinations of macros and do repeats can reformat
>the data but I cannot find any examples on how to accomplish this on
>Raynald's site. Can someone point me to a template I can use?
Whoof! The first, and probably best, advice is: don't do it. You have
what's called 'long' data organization: when data repeat within a
(logical) case or subject, each repetition gets its own record (SPSS
calls a record a "case", which is confusing). What you're asking for is
called 'wide' organization (formerly, 'flattened'): all repetitions in
the same record.
'Long' data is easier to handle. ANYTHING with fewer variables is
easier. To start with, a lot of logic will involve a loop through the
set of repetitions. In 'long' organization, you can use the loop that's
already implied by the transformation program, which runs once for
each record; in 'wide' organization, you need to write separate,
explicit loops over the repeated variables.
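
To make that concrete, here is a rough, untested sketch of the same
calculation in each organization. The two fragments are for two
different files, not one job. RATIO and RAT2001 to RAT2003 are names
made up for the illustration, and the 'wide' fragment assumes the
yearly variables are named as in the example further down.

* 'Long' file: one statement does it. The transformation  .
* program already runs once per record (subject-year)     .
COMPUTE RATIO = ALPH / BETA.

* 'Wide' file: the same calculation needs an explicit     .
* loop over the yearly copies of each variable            .
DO REPEAT A = ALPH2001 ALPH2002 ALPH2003
         /B = BETA2001 BETA2002 BETA2003
         /R = RAT2001  RAT2002  RAT2003.
.  COMPUTE R = A / B.
END REPEAT.

And every further year or variable lengthens the 'wide' lists, while
the 'long' statement stays as it is.
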
At least, think VERY hard about what you want to do, and why you think
'wide' is better. I'd encourage you to post that; we may be able to
suggest alternatives.
>I know some combinations of macros and do repeats can reformat the
>data
Sorry, but another "No": Don't look to macros to do the job. Whatever
can be done with macros can be done without them; all they do is
generate SPSS syntax. A macro is just another, and confusing, layer of
code to debug.
HOWEVER, here is an example, using DO REPEAT and AGGREGATE. (It's
labeled "Method I" because I was considering doing it with VECTOR as
well, but didn't.) It's tested; this is SPSS draft output. It handles
missing 'year' records correctly: the 'wide' variables for a year with
no record are left system-missing.
* ----------------------------------------------- .
* Original data, in 'long' form .
GET FILE=LONGDATA.
LIST.
List
Notes
|---------------------------|-------------------|
|Output Created             |16 Sep 05 16:26:38 |
|---------------------------|-------------------|
SUBJECT YEAR ALPH BETA GAMM
001     2001    8   16   20
001     2002    9   23   14
023     2001   21    8   23
023     2003    7   24   10
456     2001   20   29   11
456     2003    9    7   22
007     2001   27   14   17
007     2002   10   25    6
007     2003   22   29   13
089     2001   24    8   19
089     2002   13   18   18
Number of cases read: 11 Number of cases listed: 11
*  Method I:  DO REPEAT .
NUMERIC
   ALPH2001 BETA2001 GAMM2001
   ALPH2002 BETA2002 GAMM2002
   ALPH2003 BETA2003 GAMM2003  (F3).
DO REPEAT TESTYR  = 2001     2002     2003
         /N_ALPHA = ALPH2001 ALPH2002 ALPH2003
         /N_BETA  = BETA2001 BETA2002 BETA2003
         /N_GAMMA = GAMM2001 GAMM2002 GAMM2003.
.  DO IF YEAR EQ TESTYR.
.     COMPUTE N_ALPHA = ALPH.
.     COMPUTE N_BETA  = BETA.
.     COMPUTE N_GAMMA = GAMM.
.  END IF.
END REPEAT.
.  /**/ STRING SPACER(A19).
.  /**/ LIST.
List
Notes
|---------------------------|-------------------|
|Output Created             |16 Sep 05 16:26:38 |
|---------------------------|-------------------|
                       A   B   G   A   B   G   A   B   G
S                      L   E   A   L   E   A   L   E   A
U                      P   T   M   P   T   M   P   T   M
B                      H   A   M   H   A   M   H   A   M
J          A   B   G   2   2   2   2   2   2   2   2   2
E          L   E   A   0   0   0   0   0   0   0   0   0
C          P   T   M   0   0   0   0   0   0   0   0   0
T   YEAR   H   A   M   1   1   1   2   2   2   3   3   3 SPACER
001 2001   8  16  20   8  16  20   .   .   .   .   .   .
001 2002   9  23  14   .   .   .   9  23  14   .   .   .
023 2001  21   8  23  21   8  23   .   .   .   .   .   .
023 2003   7  24  10   .   .   .   .   .   .   7  24  10
456 2001  20  29  11  20  29  11   .   .   .   .   .   .
456 2003   9   7  22   .   .   .   .   .   .   9   7  22
007 2001  27  14  17  27  14  17   .   .   .   .   .   .
007 2002  10  25   6   .   .   .  10  25   6   .   .   .
007 2003  22  29  13   .   .   .   .   .   .  22  29  13
089 2001  24   8  19  24   8  19   .   .   .   .   .   .
089 2002  13  18  18   .   .   .  13  18  18   .   .   .
Number of cases read: 11 Number of cases listed: 11
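*  Collapse to one record per subject. FIRST takes the first .
*  non-missing value of each variable within the subject     .
AGGREGATE OUTFILE=*
   /BREAK = SUBJECT
   /ALPH2001 BETA2001 GAMM2001
    ALPH2002 BETA2002 GAMM2002
    ALPH2003 BETA2003 GAMM2003
       = FIRST(ALPH2001 TO GAMM2003).
STRING SPACER (A39).
LIST.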
List
Notes
|---------------------------|-------------------|
|Output Created             |16 Sep 05 16:26:40 |
|---------------------------|-------------------|
      A   B   G   A   B   G   A   B   G
S     L   E   A   L   E   A   L   E   A
U     P   T   M   P   T   M   P   T   M
B     H   A   M   H   A   M   H   A   M
J     2   2   2   2   2   2   2   2   2
E     0   0   0   0   0   0   0   0   0
C     0   0   0   0   0   0   0   0   0
T     1   1   1   2   2   2   3   3   3 SPACER
001   8  16  20   9  23  14   .   .   .
007  27  14  17  10  25   6  22  29  13
023  21   8  23   .   .   .   7  24  10
089  24   8  19  13  18  18   .   .   .
456  20  29  11   .   .   .   9   7  22
Number of cases read: 5 Number of cases listed: 5
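
For comparison, the VECTOR variant mentioned above might look roughly
like the following. This is an UNTESTED sketch, not part of the run
shown here. It assumes YEAR is numeric and always in 2001-2003, and the
names ALPHYR, BETAYR, GAMMYR and #YR are invented for the illustration,
so the resulting variables are ALPHYR1 to GAMMYR3 rather than ALPH2001
to GAMM2003.

*  Method II (sketch): VECTOR instead of DO REPEAT .
GET FILE=LONGDATA.
*  Each vector creates three new variables, e.g. ALPHYR1 to ALPHYR3, .
*  all initially system-missing                                      .
VECTOR ALPHYR(3) /BETAYR(3) /GAMMYR(3).
*  Scratch index: 1 for 2001, 2 for 2002, 3 for 2003 .
COMPUTE #YR = YEAR - 2000.
COMPUTE ALPHYR(#YR) = ALPH.
COMPUTE BETAYR(#YR) = BETA.
COMPUTE GAMMYR(#YR) = GAMM.
AGGREGATE OUTFILE=*
   /BREAK = SUBJECT
   /ALPHYR1 ALPHYR2 ALPHYR3
    BETAYR1 BETAYR2 BETAYR3
    GAMMYR1 GAMMYR2 GAMMYR3
       = FIRST(ALPHYR1 TO GAMMYR3).
LIST.

The index arithmetic replaces the DO IF test inside DO REPEAT; the
AGGREGATE step is the same idea as in Method I.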