LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (January 2008)Back to main SPSSX-L pageJoin or leave SPSSX-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 3 Jan 2008 22:20:32 -0500
Reply-To:     Richard Ristow <wrristow@mindspring.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Richard Ristow <wrristow@mindspring.com>
Subject:      Re: Fixing multiple records for same time interval...
Comments: To: rich reeves <statscodejunky@gmail.com>
In-Reply-To:  <ffa35560712190700o2111ed0fv1fa7768ef8798c58@mail.gmail.com >
Content-Type: text/plain; charset="us-ascii"; format=flowed

A programming exercise that got put off, in hectic holiday travel.

At 10:00 AM 12/19/2007, rich reeves wrote:

>I have a data set where individuals have multiple rows with interval >information. What I have looks like:

|-----------------------------|---------------------------| |Output Created |03-JAN-2008 21:58:42 | |-----------------------------|---------------------------| id time_begin time_end

a 08/15/2006 12/15/2006 a 08/01/2006 12/11/2006 a 08/31/2006 11/20/2006 a 09/15/2006 12/31/2006 a 01/15/2007 05/01/2007 a 05/31/2007 08/01/2007 a 06/15/2007 08/31/2007

Number of cases read: 7 Number of cases listed: 7 (This is reformatted to what you probably actually have. Your example came through with many line-breaks you probably didn't intend; I wonder why this happens so often?)

>I want to create a maximum interval for each set of overlapping >intervals for an individual. That is for a set of rows with >overlapping time intervals I want the minimum begin date and the >maximum end date:

id time_begin time_end a 8/1/2006 12/31/2006 a 1/15/2007 5/1/2007 a 5/31/2007 8/31/2007

That is, you want one record for every interval in which all days fell within the interval for some one of the records, by ID. Here's a solution using LAG and AGGREGATE. Variable #LATEST is a scratch variable (see the SPSS *Command Syntax Reference*), which means that (a) it keeps its value from one case to the next (b) it isn't kept in the file. SPSS 14 draft output (WRR:not saved separately):

* Assign each record to an interval of . * continuous 'activity': .

SORT CASES BY time_begin. NUMERIC INTERVAL (F3). NUMERIC #LATEST (ADATE10).

DO IF $CASENUM EQ 1 /* First record in the file */. . COMPUTE INTERVAL = 1. . COMPUTE #LATEST = time_end. ELSE IF id NE LAG(ID) /* First record for an id */. . COMPUTE INTERVAL = 1. . COMPUTE #LATEST = time_end. ELSE IF time_begin LE #LATEST /* Continue current interval*/. . COMPUTE INTERVAL = LAG(INTERVAL). . COMPUTE #LATEST = MAX(#LATEST,time_end). ELSE /* Start new interval */. . COMPUTE INTERVAL = LAG(INTERVAL) + 1. . COMPUTE #LATEST = time_end. END IF.

. /**/ LIST /*-*/.

List |-----------------------------|---------------------------| |Output Created |03-JAN-2008 22:16:26 | |-----------------------------|---------------------------| id time_begin time_end INTERVAL

a 08/01/2006 12/11/2006 1 a 08/15/2006 12/15/2006 1 a 08/31/2006 11/20/2006 1 a 09/15/2006 12/31/2006 1 a 01/15/2007 05/01/2007 2 a 05/31/2007 08/01/2007 3 a 06/15/2007 08/31/2007 3

Number of cases read: 7 Number of cases listed: 7

* Get earliest and latest dates for . * each interval: .

AGGREGATE OUTFILE=* /BREAK = ID INTERVAL /time_begin = MIN(time_begin) /time_end = MAX(time_end).

LIST.

List |-----------------------------|---------------------------| |Output Created |03-JAN-2008 22:16:27 | |-----------------------------|---------------------------| id INTERVAL time_begin time_end

a 1 08/01/2006 12/31/2006 a 2 01/15/2007 05/01/2007 a 3 05/31/2007 08/31/2007

Number of cases read: 3 Number of cases listed: 3

=================== APPENDIX: Test data =================== DATA LIST LIST/ id time_begin time_end (A1 ADATE10 ADATE10). BEGIN DATA a 8/15/2006 12/15/2006 a 8/1/2006 12/11/2006 a 8/31/2006 11/20/2006 a 9/15/2006 12/31/2006 a 1/15/2007 5/1/2007 a 5/31/2007 8/1/2007 a 6/15/2007 8/31/2007 END DATA. LIST.

===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


Back to: Top of message | Previous page | Main SPSSX-L page