LISTSERV at the University of Georgia
Date:         Wed, 28 Mar 2012 17:58:34 +0000
Reply-To:     "Poes, Matthew Joseph" <mpoes@illinois.edu>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         "Poes, Matthew Joseph" <mpoes@illinois.edu>
Subject:      Re: Database Management Help
Comments: To: "Michael, Paul G." <mich0231@pacificu.edu>
In-Reply-To:  <8D94A946E17E9941AE94FC3AE11CAB7E043FBB6CFD@everest.ad.pacificu.edu>
Content-Type: text/plain; charset="iso-8859-1"

I think your only solution is to merge by adding cases (rather than variables) and stack the data set.

If you then need a wide dataset, you could transpose on ID once all the data is stacked.
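To make the stacking idea concrete, here is a minimal sketch in plain Python rather than SPSS syntax. It is only an illustration under assumptions: the sample rows, the field names, and the helper `stack_and_filter` are hypothetical and not taken from Paul's files, and the dates are assumed to have been converted to YYYY-MM-DD so they sort correctly as text. It stacks the two files (adding cases), then keeps only the IDs that have at least one BMI row and at least one depression row.

```python
# Hypothetical sample rows: (ID, contact date, score). Not data from the thread.
bmi_rows = [
    (1, "2011-01-05", 24.1),
    (1, "2011-03-12", 24.8),
    (2, "2011-02-01", 31.0),   # ID 2 has BMI data only
]
dep_rows = [
    (1, "2011-02-20", 12),
    (3, "2011-04-02", 7),      # ID 3 has depression data only
]

def stack_and_filter(bmi_rows, dep_rows):
    """Stack both files (add cases), then keep IDs that appear in both."""
    # Each stacked row carries one measure; the other is left missing.
    stacked = [
        {"id": i, "date": d, "bmi": v, "depression": None}
        for i, d, v in bmi_rows
    ] + [
        {"id": i, "date": d, "bmi": None, "depression": v}
        for i, d, v in dep_rows
    ]
    # Flag which IDs have each kind of measurement.
    ids_with_bmi = {r["id"] for r in stacked if r["bmi"] is not None}
    ids_with_dep = {r["id"] for r in stacked if r["depression"] is not None}
    keep = ids_with_bmi & ids_with_dep
    # Keep every contact date for the retained IDs, ordered by ID and date.
    kept = [r for r in stacked if r["id"] in keep]
    kept.sort(key=lambda r: (r["id"], r["date"]))
    return kept

result = stack_and_filter(bmi_rows, dep_rows)
for row in result:
    print(row)
```

In SPSS the same steps would probably be an ADD FILES to stack, a per-ID flag built with AGGREGATE, and a SELECT IF on that flag, but treat that mapping as a suggestion to check against the command reference rather than tested syntax.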

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: mpoes@illinois.edu

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Michael, Paul G.
Sent: Wednesday, March 28, 2012 12:54 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Database Management Help

Hi All,

I have two data sets that I would like to merge using ID as the key variable, and each data file has duplicate IDs. The variables of interest in the first data set are ID, date of contact (DD-MM-YYYY), and Body Mass Index (BMI). The variables of interest in the second data set are ID, date of contact (DD-MM-YYYY), and depression score. The duplicate IDs occur because some subjects had more than one BMI score from different time points and/or more than one depression score at different time points.

The contact dates from each file do not match up in all instances (in fact very few subjects have the same contact date in both files). I only want to keep subjects who have both BMI data and depression data but I need to preserve information from all the contact dates.

When I try a simple merge by adding variables (e.g., depression score and contact date) to the BMI data set using ID as the keyed variable, I run into the problem of having to go through thousands of cases and delete those that don't have both BMI and depression data.

For example I might have one subject whose BMI was collected 10 times on different dates and none of these dates match the depression data for this subject. So after the merge I have eleven rows for this subject that I would like to preserve (all BMI data and depression data). Another subject may have BMI scores collected at 7 different dates, but this subject has no depression data. I would like to remove all the BMI data for this subject since they have no depression data.

Is there a way in which I can merge these files in a different way to get what I need or a method to delete cases in the merged data file based on duplicate IDs and BMI data but no depression data? Any help is greatly appreciated!

Best,

Paul

===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


