Date: Wed, 28 Mar 2012 17:58:34 +0000
Reply-To: "Poes, Matthew Joseph" <firstname.lastname@example.org>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Poes, Matthew Joseph" <email@example.com>
Subject: Re: Database Management Help
Content-Type: text/plain; charset="iso-8859-1"
I think your only solution is to merge by adding cases (rather than variables), so that the two data sets are stacked into one long file.
If you need a wide dataset once all the data is there, you could then transpose on ID.
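
To make the stacking idea concrete, here is a minimal sketch in Python rather than SPSS syntax (in SPSS, adding cases is what ADD FILES does). It stacks the two files as cases, then keeps only the IDs that appear in both, which is the filtering the original question asks for. The sample records, the field names, and the function name stack_and_filter are all made up for illustration; they are not taken from the actual data files.

```python
from datetime import date

# Hypothetical sample records: (id, contact_date, value)
bmi_rows = [
    (1, date(2011, 1, 5), 27.4),
    (1, date(2011, 6, 9), 26.8),
    (2, date(2011, 2, 1), 31.0),   # subject 2 has no depression data
]
dep_rows = [
    (1, date(2011, 3, 2), 14),
    (3, date(2011, 4, 7), 9),      # subject 3 has no BMI data
]


def stack_and_filter(bmi_rows, dep_rows):
    """Stack both files as cases, then keep only IDs found in both files."""
    # Adding cases: every record becomes its own row, and the measure
    # that was not collected at that contact is left missing (None).
    stacked = [
        {"id": i, "date": d, "bmi": v, "depression": None}
        for i, d, v in bmi_rows
    ] + [
        {"id": i, "date": d, "bmi": None, "depression": v}
        for i, d, v in dep_rows
    ]

    # A subject qualifies only if the ID occurs in both source files.
    ids_in_both = {r[0] for r in bmi_rows} & {r[0] for r in dep_rows}

    kept = [row for row in stacked if row["id"] in ids_in_both]
    # Sort so each subject's contacts appear in date order.
    kept.sort(key=lambda row: (row["id"], row["date"]))
    return kept


result = stack_and_filter(bmi_rows, dep_rows)
for row in result:
    print(row)
```

With the sample data above, only subject 1 survives, and all three of that subject's contact dates are preserved, one row per contact.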
Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Michael, Paul G.
Sent: Wednesday, March 28, 2012 12:54 PM
Subject: Database Management Help
I have two data sets that I would like to merge using ID as the keyed variable, and each data file has duplicate IDs. The variables of interest in the first data set are ID, date of contact (DD-MM-YYYY), and Body Mass Index (BMI). The variables of interest in the second data set are ID, date of contact (DD-MM-YYYY), and depression score. The duplicate IDs occur because some subjects had more than one BMI score from different time points and/or more than one depression score at different time points.
The contact dates from each file do not match up in all instances (in fact very few subjects have the same contact date in both files). I only want to keep subjects who have both BMI data and depression data but I need to preserve information from all the contact dates.
When I try a simple merge by adding variables (e.g., depression score and contact date) to the BMI data set using ID as the keyed variable, I run into the problem of having to go through thousands of cases and delete those that don't have both BMI and depression data.
For example, I might have one subject whose BMI was collected 10 times on different dates, and none of these dates match the depression data for this subject. So after the merge I have eleven rows for this subject that I would like to preserve (all BMI data and depression data). Another subject may have BMI scores collected on 7 different dates, but this subject has no depression data. I would like to remove all the BMI data for this subject since they have no depression data.
Is there a different way to merge these files to get what I need, or a method to delete cases in the merged data file that have duplicate IDs and BMI data but no depression data? Any help is greatly appreciated!
To manage your subscription to SPSSX-L, send a message to LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD