|Date: ||Mon, 28 Jul 2003 17:29:47 -0400|
|Reply-To: ||Richard Ristow <email@example.com>|
|Sender: ||"SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>|
|From: ||Richard Ristow <firstname.lastname@example.org>|
|Subject: ||Re: Yet another date problem|
|Content-Type: ||text/plain; charset="us-ascii"; format=flowed|
At 12:18 PM 7/28/2003 +0100, Andrew Percy wrote:
>I am currently working with administrative data on children in social
>care. The data is in the form of an individual record for each
>placement the child has. The key variables are ID, DATE (the date a
>particular placement began), ORDER (the legal order they are under),
>PLACE (the type of placement - foster, adoption, parents etc), REASON
>(the reason why the placement broke down).
>ID DATE ORDER PLACE FOSTER REASON
>001 04.08.1995 Art21 . . New order
>001 05.08.1995 . Foster 20 Planned move
>001 27.10.1996 . Foster 91 Respite
>001 29.10.1995 . Foster 30 Planned move
>001 01.11.1996 . Adoption .
>002 01.06.1999 Sect103 . . New Order
>002 03.06.1999 . Foster 126 Planned move
>002 05.11.2001 . Parent .
>003 31.10.1997 Art21 . . Superseeded
>003 31.10.1997 . Foster 39 Planned move
>003 09.12.1998 . Foster 227 Breakdown
>003 01.06.1999 . Foster 16 Respite
>003 11.06.1999 . Foster 261 Planned move
>I would welcome any pointers as to where to start with this type of data.
>I am a survey researcher and used to neat data with one row per
>individual. I would like to transpose the data into a single row per child.
OK, first: DON'T do that. Data like this is usually best handled in the
form you have it, getting summary statistics per person using tools
SPSS provides for that purpose. The most powerful of these is
AGGREGATE. I'd advise,
A. Read your data (save as an SPSS .SAV file, at this point).
1. The date should read fine with format EDATE10.
2. Decide, with ORDER, PLACE and REASON, whether you want to keep them
as string variables or recode them into integer variables with value
labels. The latter is more work, but maybe neater, especially for
"REASON" which is more than 8 characters long.
B. A couple of preparations for analysis (and, again, save):
1. Your data isn't in order by date. So,
. SORT CASES BY ID DATE.
2. It looks like you want the duration of each placement, which
requires the ending as well as beginning date of the placement. You
don't have that, but if the beginning of the next placement is the
ending date of the current one, try (CODE NOT TESTED):
. CREATE END_DATE = LEAD(DATE,1).
. CREATE NEXT_ID = LEAD(ID,1).
. VARIABLE LABELS END_DATE 'Ending date, i.e. start of next placement'.
. FORMATS END_DATE (EDATE10).
. IF (MISSING(NEXT_ID)) END_DATE = $SYSMIS.
. IF (NEXT_ID NE ID) END_DATE = $SYSMIS.
. SAVE OUTFILE='c:\MY_SPSS\Plcmnt.SAV'.
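If CREATE is not available to you, or ID is a string variable (LEAD wants a numeric series), the same END_DATE can probably be had with LAG instead. This is a swapped-in alternative, not the CREATE logic above, and it is a sketch only (NOT TESTED): sort so that dates run backward within each ID, pick up the previous record's date, then restore the normal order.

```spss
* Alternative sketch using LAG on a reversed sort, so LAG sees the NEXT placement.
SORT CASES BY ID (A) DATE (D).
* Take the date only when the previous record belongs to the same child.
IF (ID EQ LAG(ID)) END_DATE = LAG(DATE).
FORMATS END_DATE (EDATE10).
* Restore chronological order within ID.
SORT CASES BY ID DATE.
```

END_DATE stays system-missing for each child's last record, which matches the intent of the two IF statements above.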
C. Then you're ready to go. I'm doing "AGGREGATE/OUTFILE=*", which
overwrites the SPSS active file, so make SURE you have a saved copy
before doing this.
>I would like to look at things like
> 1. The number of placements they have of each type,
Oddly, one of the more awkward ones using AGGREGATE. (Cf. my posting
"Re: frequecies within aggregate", Mon, 14 Jul 2003 14:55:14 -0400.)
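/* (FIN arguments below assume PLACE is kept as a string variable) */
AGGREGATE OUTFILE=* /BREAK=ID
 /N_PLCMT 'Number of placements' = N
 /P_FOST 'Number of foster placements' = FIN(PLACE,'Foster','Foster')
 /P_ADPT 'Number of adoption placements' = FIN(PLACE,'Adoption','Adoption')
 /P_PRNT 'Number of parent placements' = FIN(PLACE,'Parent','Parent').
/* VERY IMPORTANT adjustment to the above: */
/* FIN returns a fraction of cases; multiply by N to get a count. */
DO REPEAT PLCMNT = P_FOST P_ADPT P_PRNT.
. COMPUTE PLCMNT = ROUND(N_PLCMT*PLCMNT).
END REPEAT.
FORMATS P_FOST P_ADPT P_PRNT (F3).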
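A way to get the counts directly, instead of converting fractions back to counts, is to build 0/1 indicator variables first and SUM them in AGGREGATE. A sketch (NOT TESTED), assuming PLACE is still a string with the values shown in the sample data; the indicator names IS_FOST, IS_ADPT, and IS_PRNT are made up for illustration:

```spss
* Sketch assuming PLACE is a string variable with values as in the sample data.
COMPUTE IS_FOST = (PLACE EQ 'Foster').
COMPUTE IS_ADPT = (PLACE EQ 'Adoption').
COMPUTE IS_PRNT = (PLACE EQ 'Parent').
* SUM of a 0/1 indicator within ID is the number of records of that type.
AGGREGATE OUTFILE=* /BREAK=ID
  /N_PLCMT 'Number of placements' = N
  /P_FOST 'Number of foster placements' = SUM(IS_FOST)
  /P_ADPT 'Number of adoption placements' = SUM(IS_ADPT)
  /P_PRNT 'Number of parent placements' = SUM(IS_PRNT).
FORMATS P_FOST P_ADPT P_PRNT (F3).
```

Use this in place of the FIN version, not after it, since both overwrite the active file. Note that N counts every record for the child, including the records that carry only a legal order.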
>2. the reasons for placement move,
Logic like the above, but on variable REASON rather than PLACE. (The
two sets of 'FIN' transformations can be combined in a single AGGREGATE.)
>3. how long the placements lasted
COMPUTE PLCE_LEN = CTIME.DAYS(END_DATE - DATE).
VARIABLE LABELS PLCE_LEN 'Placement length, days'.
(Then, use AGGREGATE to get mean, minimum, maximum, whatever, for PLCE_LEN.)
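For instance, something like the following might do (NOT TESTED). It has to run on the placement-level file, i.e. before any AGGREGATE/OUTFILE=* has replaced it. The output file name and the variable names LEN_MEAN, LEN_MIN, and LEN_MAX are placeholders of mine:

```spss
* Sketch: per-child summaries of placement length, written to a separate file.
* The file name PlcLen.SAV and the LEN_ names are hypothetical.
AGGREGATE OUTFILE='c:\MY_SPSS\PlcLen.SAV' /BREAK=ID
  /LEN_MEAN 'Mean placement length, days' = MEAN(PLCE_LEN)
  /LEN_MIN 'Shortest placement, days' = MIN(PLCE_LEN)
  /LEN_MAX 'Longest placement, days' = MAX(PLCE_LEN).
```

Each child's last record has no END_DATE, so its PLCE_LEN is missing and it drops out of these statistics.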
>4. the number of placements over fixed periods etc.
Like 1. and 2., but first select for beginning placement dates in the
periods you desire.
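One way to do the selecting without dropping any children from the output is to flag the records rather than SELECT IF them. A sketch (NOT TESTED); calendar 1996 is just an example period, and IN_1996, N_1996, and the output file name are made up:

```spss
* Sketch: flag records whose start date falls in an example period, calendar 1996.
COMPUTE IN_1996 = RANGE(DATE, DATE.DMY(1,1,1996), DATE.DMY(31,12,1996)).
* Summing the 0/1 flag gives the count per child, with 0 for children with none.
AGGREGATE OUTFILE='c:\MY_SPSS\Plc1996.SAV' /BREAK=ID
  /N_1996 'Placements begun in 1996' = SUM(IN_1996).
```

With SELECT IF, a child with no records in the period would not appear in the aggregated file at all; with the flag, that child appears with a count of zero.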
>5. I would like to combine this data with other demographic and health
>data that I also have. This is in the standard one line per case format.
The output of AGGREGATE with BREAK=ID will have one record ('line',
SPSS 'case') for each value of ID, and will contain the variable ID.
You can combine it with your demographic and health data with a
standard MATCH FILES/BY=ID.
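Roughly like this (NOT TESTED). The file names Demog.SAV and PlcAgg.SAV are stand-ins for your demographic file and the saved AGGREGATE output, and both files must be in ascending order of ID:

```spss
* Sketch: file names are hypothetical, and both files must be sorted by ID.
GET FILE='c:\MY_SPSS\Demog.SAV'.
SORT CASES BY ID.
* Add the per-child placement summaries to the one-line-per-child file.
MATCH FILES /FILE=* /FILE='c:\MY_SPSS\PlcAgg.SAV' /BY ID.
EXECUTE.
```

ID needs the same type and format in both files for the match to work.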
Is this enough to get going with? Good luck to you!