Date: Wed, 30 Apr 1997 19:23:54 GMT
Reply-To: David Nichols <nichols@SPSS.COM>
Sender: "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From: David Nichols <nichols@SPSS.COM>
Organization: SPSS, Inc.
Subject: Re: Univariate to multivariate record
In article <SPSSX-L%97042813243846@uga.cc.uga.edu>,
Dick Campbell <DCAMP@UIC.EDU> wrote:
>I am sure that this question has been answered before, but I don't know
>how to get to this list's archives. In any case, here is the question. If
>someone can tell me where to find an answer I would appreciate it.
>
>I have data set up for conventional univariate repeated measures analysis,
>with each observation on a subject on a separate record. Thus, the record
>layout is:
>
>ID OBS X1 X2 ..... Xk
> 1 1 ............
> 1 2 ............
> 1 3 ............
> 2 1 ............
> 2 2 ............
> 2 3 ............
> . . ............
> . . ............
> N T ............
>
>I want to change this to a multivariate record such that the data are strung
>out on the same record as follows
>
>ID   Time 1 Data    Time 2 Data    Time 3 Data
> 1   X11.....X1k    X21......X2k   X31......X3k
>
>The data are currently in an SPSS system file, so I can't use record
>manipulations that require a data list statement. I could write the file in
>ASCII and re-read it, of course, but that seems like the long way around.
>
>I got the task accomplished by writing the three univariate records, one for
>each time, into three separate files, renaming the variables and merging the
>files.
>
>Is there some more efficient way to do this? I thought there might be a
>macro on the SPSS web site for this, but could not find one.
>
>-------------------------------------------------------------------------
>Richard T. Campbell | E-Mail : DCAMP@UIC.EDU
>Department of Sociology M/C 312 | Phone (W): 312/413-3759
>University of Illinois at Chicago | Phone (H): 708/386-2263
>1007 W. Harrison St. | FAX (SOC): 312/996-5104
>Chicago, IL USA 60607-7140 | FAX (PRC): 312/996-2703
>-------------------------------------------------------------------------
Here's a technical note that David Marso wrote on how to do this (people,
please note that I'm posting what someone else wrote, and don't send me
questions on the fine points of what's being done here). I assume that
this would have to be generalized somewhat to handle three time points.
My own brute force way of doing this has generally been to LIST the data
and read it back in with a DATA LIST.

Q. I have data from several subjects at two time points, with upwards of
700 variables. The two time points are on alternating records in one
file with a common ID variable called ID. I need to transform the two
records into a single record with appropriately renamed variables. Is it
possible to automate this process so that I don't need to rename the 700
variables by typing all of the names?

A. The key to this solution is to use FLIP to create a variable
(CASE_LBL) which contains the original variable names. The WRITE command
then generates SPSS program files containing one RENAME VARIABLES command
per variable, and each file is invoked with an INCLUDE command on the
appropriate subset of cases. Each group of cases is processed and the
results merged. In a final step the resulting variables are permuted so
that the original variables are interleaved as var001_1 var001_2 var002_1
var002_2 etc.

This approach will work provided that the first six characters of every
variable name are unique: each new name is built from those six characters
plus a two-character suffix (_1 or _2), which uses up the eight characters
allowed for a variable name. If the first six characters are not unique,
the program will fail. The only suggestion available is to use an initial
RENAME VARIABLES command to resolve the conflict.

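The six-character requirement can be checked before the job is run. What
follows is a minimal sketch and not part of the original note: it assumes the
working file is active, and the scratch file name 'PRECHECK' and the variable
names PREFIX and NNAMES are hypothetical. It lists every six-character prefix
shared by more than one variable; an empty listing suggests the names are safe.

```spss
* Hypothetical pre-check: list prefixes shared by more than one variable *.
SAVE OUTFILE='PRECHECK'.
* One case is enough, since only the variable names matter here *.
SELECT IF $CASENUM=1.
FLIP.
* PREFIX holds the first six characters of each original variable name *.
STRING PREFIX (A6).
COMPUTE PREFIX=SUBSTR(CASE_LBL,1,6).
* Count how many variable names share each prefix *.
AGGREGATE OUTFILE=* /BREAK=PREFIX /NNAMES=N.
SELECT IF NNAMES > 1.
LIST.
* Restore the original working file *.
GET FILE='PRECHECK'.
```

Any prefix that appears in the listing would need an initial RENAME VARIABLES
before the main job is attempted.
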
* Assume the file is active and there is a variable called ID which is
common to two rows of data and a variable called SESSION which tells
which of the two sessions the row of data corresponds to *.
* Back up data file *.
SAVE OUTFILE 'RAW'.
* We keep only one case for FLIP *.
SELECT IF $CASENUM=1.
* FLIP allows access to variable names via a new variable called CASE_LBL *.
FLIP.
* Get rid of ID since we do not wish to rename it *.
SELECT IF CASE_LBL <> 'ID'.
* Prepare two include files which contain RENAME VARIABLES commands *.
STRING @1 (A8) @2 (A8).
COMPUTE @1=CONCAT(RTRIM(SUBSTR(CASE_LBL,1,6)),'_1').
COMPUTE @2=CONCAT(RTRIM(SUBSTR(CASE_LBL,1,6)),'_2').
WRITE OUTFILE '@1.INC' /'RENAME VARIABLES (',CASE_LBL,'=',@1,')'.
WRITE OUTFILE '@2.INC' /'RENAME VARIABLES (',CASE_LBL,'=',@2,')'.
EXECUTE.
* Process first file *.
GET FILE 'RAW'.
SELECT IF SESSION=1.
INCLUDE '@1.INC'.
SAVE OUTFILE 'S1'.
* Process second file *.
GET FILE 'RAW'.
SELECT IF SESSION=2.
INCLUDE '@2.INC'.
* Merge the two files and SAVE; this overwrites the backup copy in 'RAW' *.
MATCH FILES FILE 'S1' / FILE * / BY ID.
SAVE OUTFILE 'RAW'.
* Use a process similar to the one above to interleave the variable names *.
* Other heuristics can be applied here but will be left to the user *.
SELECT IF $CASENUM=1.
FLIP.
SORT CASES BY CASE_LBL.
FLIP.
* ADD FILES takes its variable order from the first file named (the sorted
  one-case file), so the variables of 'RAW' come out permuted *.
ADD FILES FILE * / IN=TAG / FILE 'RAW' /IN=RAW.
SELECT IF RAW.
* Get rid of extraneous variables *.
MATCH FILES FILE * / KEEP ID ALL / DROP TAG RAW.
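
For three time points, as in the original question, the same pattern might be
extended along the following lines. This is an untested sketch that simply
follows the conventions of the listing above. It assumes SESSION (OBS in the
question's layout) takes the values 1, 2 and 3 and that the file is sorted by
ID; the names @3, '@3.INC' and 'S2' are hypothetical additions. The
interleaving step (FLIP, SORT CASES, FLIP, ADD FILES) should apply unchanged
afterwards, since sorting on CASE_LBL would place the _1, _2 and _3 versions
of each variable together.

```spss
* Sketch: same job for SESSION = 1, 2, 3 (start with working file active) *.
SAVE OUTFILE 'RAW'.
SELECT IF $CASENUM=1.
FLIP.
SELECT IF CASE_LBL <> 'ID'.
* One include file of RENAME VARIABLES commands per session *.
STRING @1 (A8) @2 (A8) @3 (A8).
COMPUTE @1=CONCAT(RTRIM(SUBSTR(CASE_LBL,1,6)),'_1').
COMPUTE @2=CONCAT(RTRIM(SUBSTR(CASE_LBL,1,6)),'_2').
COMPUTE @3=CONCAT(RTRIM(SUBSTR(CASE_LBL,1,6)),'_3').
WRITE OUTFILE '@1.INC' /'RENAME VARIABLES (',CASE_LBL,'=',@1,')'.
WRITE OUTFILE '@2.INC' /'RENAME VARIABLES (',CASE_LBL,'=',@2,')'.
WRITE OUTFILE '@3.INC' /'RENAME VARIABLES (',CASE_LBL,'=',@3,')'.
EXECUTE.
* Rename within each session; save sessions 1 and 2 to their own files *.
GET FILE 'RAW'.
SELECT IF SESSION=1.
INCLUDE '@1.INC'.
SAVE OUTFILE 'S1'.
GET FILE 'RAW'.
SELECT IF SESSION=2.
INCLUDE '@2.INC'.
SAVE OUTFILE 'S2'.
GET FILE 'RAW'.
SELECT IF SESSION=3.
INCLUDE '@3.INC'.
* Merge the three files by ID; session 3 is still the working file *.
MATCH FILES FILE 'S1' / FILE 'S2' / FILE * / BY ID.
SAVE OUTFILE 'RAW'.
```

With single-digit suffixes the six-character rule is unchanged, so this layout
would cover at most nine sessions.
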
--
-----------------------------------------------------------------------------
David Nichols Senior Support Statistician SPSS, Inc.
Phone: (312) 329-3684 Internet: nichols@spss.com Fax: (312) 329-3668
-----------------------------------------------------------------------------