| Date: | Thu, 3 Apr 2008 15:37:34 -0400 |
| Reply-To: | "Gross, Paul Jacob" <paugross@INDIANA.EDU> |
| Sender: | "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU> |
| From: | "Gross, Paul Jacob" <paugross@INDIANA.EDU> |
| Subject: | Re: Adding missing cases to univariate data file |
| In-Reply-To: | <7.0.1.0.2.20080402145649.03e98ed0@mindspring.com> |
| Content-Type: | text/plain; charset="us-ascii" |
Richard & SPSS Folks,
Thanks. This worked like a charm. I also found some code that I adapted (see below) to do the trick. I have another puzzle for you and others if anyone is willing.
For the same data set I would like to insert the missing cases from each ID's first year of data forward to 2005. So, for example, with ID#1 below I would want to insert 2003, 2004 and 2005, whereas with ID#3 I would only want to insert 2005. Any thoughts, building from the code I used or from the code you sent? I am thinking a macro might work, or something that would set the first value in the loop equal to each ID's first year in the dataset, but I haven't hit on a way to do it.
DATA LIST LIST/ caseid YearOnly var1 var2.
BEGIN DATA
1 2000 2 3
1 2001 5 2
1 2002 8 2
2 2000 2 5
2 2002 3 3
2 2003 8 5
3 2004 9 6
4 2005 0 2.
END DATA.
SORT CASES by CASEID (A) YEARONLY (A).
SAVE OUTFILE= 'C:\temp.sav'.
SORT CASES by CASEID YearOnly.
AGGREGATE
/OUTFILE=*
/BREAK=CASEID
/YearOnly = NU(YearOnly).
LOOP yr = 1999 TO 2005.
+ XSAVE OUTFILE='temp.sav'
/KEEP=CASEID yr .
END LOOP.
EXECUTE.
GET FILE='temp.sav'.
RENAME VARIABLE (yr=YearOnly).
****Matching the Files and filling in the gaps in years.
MATCH FILES /FILE=*
/FILE='C:\temp.sav'
/BY CASEID YearOnly.
EXECUTE.
SAVE OUTFILE= 'C:\temp2.sav'.
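One possible way to start each ID at its own first year, sketched here as a variation on the LOOP/XSAVE code above rather than a tested solution: have AGGREGATE compute the minimum year per ID and use that as the loop's start value. The variable name FirstYr and the file name C:\yeargrid.sav are made up for illustration, and the sketch assumes C:\temp.sav still holds the original data sorted by caseid and YearOnly.

```spss
* Sketch only, reading the sorted original data saved earlier.
GET FILE='C:\temp.sav'.
* One case per ID, carrying that ID's first observed year (FirstYr is a hypothetical name).
AGGREGATE
 /OUTFILE=*
 /BREAK=caseid
 /FirstYr = MIN(YearOnly).
* Write one case per ID per year, from the ID's first year through 2005.
LOOP yr = FirstYr TO 2005.
+ XSAVE OUTFILE='C:\yeargrid.sav'
 /KEEP=caseid yr.
END LOOP.
EXECUTE.
GET FILE='C:\yeargrid.sav'.
RENAME VARIABLES (yr=YearOnly).
* Merge the grid with the original data, so added years have missing var1 and var2.
MATCH FILES /FILE=*
 /FILE='C:\temp.sav'
 /BY caseid YearOnly.
EXECUTE.
```

With the test data above this should leave 15 cases: 2000 to 2005 for IDs 1 and 2, 2004 and 2005 for ID 3, and 2005 alone for ID 4.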
________________________________________
From: Richard Ristow [wrristow@mindspring.com]
Sent: Wednesday, April 02, 2008 3:19 PM
To: Gross, Paul Jacob; SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Adding missing cases to univariate data file
At 01:31 PM 4/2/2008, Jake Gross wrote:
>I am working with longitudinal annual data of the type below.
|-----------------------------|---------------------------|
|Output Created |02-APR-2008 15:56:39 |
|-----------------------------|---------------------------|
[TestInpt]
id year var1 var2
1 2000 2 3
1 2001 5 2
1 2002 8 2
2 2000 2 5
2 2002 3 3
2 2003 8 5
3 2004 9 6
4 2005 0 2
Number of cases read: 8 Number of cases listed: 8
>I would like to restructure the file so that I create a new case for all
>years from 1999-2005, [with missing values for data in the added cases]
Mildly tricky, since you can't use INPUT PROGRAM/LOOP/END CASE when
reading an existing SPSS file (sigh). You could use LOOP/XSAVE to get
an effect similar to END CASE's, and I'd do that if XSAVE could write
to a dataset (second sigh).
Here's a VARSTOCASES solution (code and output not saved separately):
DATASET DECLARE YearList.
AGGREGATE OUTFILE=YearList
/BREAK=ID
/NRECS 'No. of records for ID, but don''t really care' = NU.
DATASET ACTIVATE YearList WINDOW=FRONT.
. /**/ LIST /*-*/.
List
|-----------------------------|---------------------------|
|Output Created |02-APR-2008 16:13:33 |
|-----------------------------|---------------------------|
[YearList]
id NRECS
1 3
2 3
3 1
4 1
Number of cases read: 4 Number of cases listed: 4
NUMERIC YEAR1999 TO YEAR2005 (F4).
DO REPEAT VARIABLE = YEAR1999 TO YEAR2005
/VALUE = 1999 TO 2005.
. COMPUTE VARIABLE = VALUE.
END REPEAT.
. /**/ LIST /*-*/.
List
|-----------------------------|---------------------------|
|Output Created |02-APR-2008 16:13:33 |
|-----------------------------|---------------------------|
[YearList]
id NRECS YEAR1999 YEAR2000 YEAR2001 YEAR2002 YEAR2003 YEAR2004 YEAR2005
1 3 1999 2000 2001 2002 2003 2004 2005
2 3 1999 2000 2001 2002 2003 2004 2005
3 1 1999 2000 2001 2002 2003 2004 2005
4 1 1999 2000 2001 2002 2003 2004 2005
Number of cases read: 4 Number of cases listed: 4
VARSTOCASES
/MAKE YEAR FROM YEAR1999 YEAR2000 YEAR2001 YEAR2002 YEAR2003
YEAR2004 YEAR2005
/KEEP = id
/NULL = KEEP.
Variables to Cases
|-----------------------------|---------------------------|
|Output Created |02-APR-2008 16:13:33 |
|-----------------------------|---------------------------|
[YearList]
Generated Variables
|----|------|
|Name|Label |
|----|------|
|YEAR|<none>|
|----|------|
Processing Statistics
|-------------|-|
|Variables In |9|
|Variables Out|2|
|-------------|-|
. /**/ LIST /*-*/.
List
|-----------------------------|---------------------------|
|Output Created |02-APR-2008 16:13:33 |
|-----------------------------|---------------------------|
[YearList]
id YEAR
[List of cases suppressed]
Number of cases read: 28 Number of cases listed: 28
MATCH FILES
/FILE=YearList
/FILE=TestInpt
/BY ID YEAR.
LIST.
List
|-----------------------------|---------------------------|
|Output Created |02-APR-2008 16:13:34 |
|-----------------------------|---------------------------|
id YEAR var1 var2
1 1999 . .
1 2000 2 3
1 2001 5 2
1 2002 8 2
1 2003 . .
1 2004 . .
1 2005 . .
2 1999 . .
2 2000 2 5
2 2001 . .
2 2002 3 3
2 2003 8 5
2 2004 . .
2 2005 . .
3 1999 . .
3 2000 . .
3 2001 . .
3 2002 . .
3 2003 . .
3 2004 9 6
3 2005 . .
4 1999 . .
4 2000 . .
4 2001 . .
4 2002 . .
4 2003 . .
4 2004 . .
4 2005 0 2
Number of cases read: 28 Number of cases listed: 28
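The VARSTOCASES approach above can probably be adapted to start each ID at its first observed year instead of 1999. The following is a minimal sketch under that assumption, not part of the tested run: FirstYr and the dataset name YearList2 are hypothetical, and TestInpt is assumed to be defined and sorted as in the appendix. It relies on VARSTOCASES dropping cases whose value is missing when /NULL is left at its default (DROP).

```spss
* Sketch only: year grid that starts at each ID's first observed year.
DATASET ACTIVATE TestInpt.
DATASET DECLARE YearList2.
AGGREGATE OUTFILE=YearList2
 /BREAK=id
 /FirstYr = MIN(year).
DATASET ACTIVATE YearList2.
NUMERIC YEAR1999 TO YEAR2005 (F4).
* Fill in only the years at or after FirstYr, and leave earlier ones missing.
DO REPEAT VARIABLE = YEAR1999 TO YEAR2005
 /VALUE = 1999 TO 2005.
. IF (VALUE GE FirstYr) VARIABLE = VALUE.
END REPEAT.
* With the default NULL=DROP, the missing (earlier) years produce no cases.
VARSTOCASES
 /MAKE YEAR FROM YEAR1999 YEAR2000 YEAR2001 YEAR2002 YEAR2003
 YEAR2004 YEAR2005
 /KEEP = id.
MATCH FILES
 /FILE=YearList2
 /FILE=TestInpt
 /BY id YEAR.
LIST.
```

The only changes from the code above are the MIN(year) aggregate, the IF inside DO REPEAT, and omitting /NULL = KEEP.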
=================================
APPENDIX: Test data, and all code
(Test data from original posting,
and very nicely done)
=================================
DATA LIST LIST /id year var1 var2.
BEGIN DATA
1 2000 2 3
1 2001 5 2
1 2002 8 2
2 2000 2 5
2 2002 3 3
2 2003 8 5
3 2004 9 6
4 2005 0 2
END DATA.
FORMATS id (F2)
year (F4)
var1 var2 (F3).
DATASET NAME TestInpt WINDOW=FRONT.
LIST.
DATASET DECLARE YearList.
AGGREGATE OUTFILE=YearList
/BREAK=ID
/NRECS 'No. of records for ID, but don''t really care' = NU.
DATASET ACTIVATE YearList WINDOW=FRONT.
. /**/ LIST /*-*/.
NUMERIC YEAR1999 TO YEAR2005 (F4).
DO REPEAT VARIABLE = YEAR1999 TO YEAR2005
/VALUE = 1999 TO 2005.
. COMPUTE VARIABLE = VALUE.
END REPEAT.
. /**/ LIST /*-*/.
VARSTOCASES
/MAKE YEAR FROM YEAR1999 YEAR2000 YEAR2001 YEAR2002 YEAR2003
YEAR2004 YEAR2005
/KEEP = id
/NULL = KEEP.
. /**/ LIST /*-*/.
MATCH FILES
/FILE=YearList
/FILE=TestInpt
/BY ID YEAR.
LIST.