LISTSERV at the University of Georgia
Date:         Wed, 28 Mar 2012 16:45:42 -0700
Reply-To:     David Marso <david.marso@gmail.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         David Marso <david.marso@gmail.com>
Subject:      Re: Database Management Help
In-Reply-To:  <1332975351700-5602074.post@n5.nabble.com>
Content-Type: text/plain; charset=us-ascii

It is impossible to tell *what* Paul wants as the endgame. My basic claim is that it will be much easier to "remove all the BMI data for this subject since they have no depression data" if the depression file is denormalized (flattened, CASESTOVARS'd) and used as a TABLE against the BMI data. NUKE the unassociated cases, then VARSTOCASES if you want to use a hammer, or some flavor of VECTOR/LOOP/XSAVE if one is into tweezers. Since we are ESPssPeculating at this point, I will refrain from any further guessing pending further reply. ADD FILES with /IN flags, followed by AGGREGATE with MODE=ADDVARIABLES and MAX(FLAG), will also work, but what should be associated with what? ALL BMI with ALL depression? Most recent? Before? After? BOTH? Maybe start with ALL <-> ALL and then let the elves sort the fairy dust after the fact ;-).
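For readers who want to see the flag-and-aggregate logic spelled out, here is a minimal sketch in plain Python rather than SPSS syntax. It is not anyone's actual job from this thread: the toy rows, the field names (in_bmi, in_dep), and the ISO-style date strings are all assumptions made for illustration. It stacks the two files, tags each row with its source, takes the per-ID maximum of each flag, and keeps only IDs that show up in both files, so the result still has multiple rows per ID.

```python
# Hypothetical toy data as (id, date, value); names and values are made up.
bmi_rows = [
    (1, "2011-01-05", 24.1),
    (1, "2011-06-10", 24.8),
    (2, "2011-02-01", 31.0),  # ID 2 has BMI data but no depression data
]
dep_rows = [
    (1, "2011-03-15", 12),
    (3, "2011-04-20", 7),     # ID 3 has depression data but no BMI data
]

# Stack both files, tagging each row with its source (the role of the /IN flags).
stacked = (
    [{"id": i, "date": d, "bmi": v, "dep": None, "in_bmi": 1, "in_dep": 0}
     for i, d, v in bmi_rows]
    + [{"id": i, "date": d, "bmi": None, "dep": v, "in_bmi": 0, "in_dep": 1}
       for i, d, v in dep_rows]
)

# Per-ID maximum of each flag (the role of AGGREGATE with MAX(flag)).
has_bmi, has_dep = {}, {}
for row in stacked:
    has_bmi[row["id"]] = max(has_bmi.get(row["id"], 0), row["in_bmi"])
    has_dep[row["id"]] = max(has_dep.get(row["id"], 0), row["in_dep"])

# Keep only rows whose ID appears in both sources, ordered by ID and date.
kept = sorted(
    (r for r in stacked if has_bmi[r["id"]] and has_dep[r["id"]]),
    key=lambda r: (r["id"], r["date"]),
)

for r in kept:
    print(r["id"], r["date"], r["bmi"], r["dep"])
```

This leaves open the question raised above of which BMI record should be associated with which depression record; it only settles who stays in the file.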

"I understood this to mean that Paul wants the final file to have multiple rows per ID, not one row per ID. David's method results in the latter, I think."

Bruce Weaver wrote
>
> Paul (the OP) wrote:
>
> "For example I might have one subject whose BMI was collected 10 times on
> different dates and none of these dates match the depression data for this
> subject. So after the merge I have eleven rows for this subject that I
> would like to preserve (all BMI data and depression data). Another subject
> may have BMI scores collected at 7 different dates, but this subject has
> no depression data. I would like to remove all the BMI data for this
> subject since they have no depression data."
>
> I understood this to mean that Paul wants the final file to have multiple
> rows per ID, not one row per ID. David's method results in the latter, I
> think.
>
> This illustrates once again how helpful it is to post small examples
> showing what the file looks like originally, and what you want it to look
> like afterward! ;-)
>
> David Marso wrote
>>
>> Quick and dirty would be to do
>> 1. CASESTOVARS on both files (use different varnames for the dates in the
>> 2 files).
>> 2. Simple 1:1 Match at the point.
>> 3. Let the devil sort it out later with some basic logic after nuking the
>> obvious crap.
>> --
>>
>> Michael, Paul G. wrote
>>>
>>> Hi All,
>>>
>>> I have two data sets that I would like to merge using ID as the keyed
>>> variable, and each data file has duplicates IDs. The variables of
>>> interest in the first dataset are ID, date of contact (DD-MM-YYYY), and
>>> Body Mass Index (BMI). The variables of interest in the second data set
>>> are ID, date of contact (DD-MM-YYYY), and depression score. The
>>> duplicate IDs occur because some subjects had more than 1 BMI score from
>>> different time points and/or more than 1 depression score at different
>>> time points.
>>>
>>> The contact dates from each file do not match up in all instances (in
>>> fact very few subjects have the same contact date in both files). I only
>>> want to keep subjects who have both BMI data and depression data but I
>>> need to preserve information from all the contact dates.
>>>
>>> When I try a simple merge by adding variables (e.g., depression score
>>> and contact date) to the BMI data set using ID as the keyed variable, I
>>> run into the problem of having to go through thousands of cases and
>>> delete those that don't have both BMI and depression data.
>>>
>>> For example I might have one subject whose BMI was collected 10 times on
>>> different dates and none of these dates match the depression data for
>>> this subject. So after the merge I have eleven rows for this subject
>>> that I would like to preserve (all BMI data and depression data).
>>> Another subject may have BMI scores collected at 7 different dates, but
>>> this subject has no depression data. I would like to remove all the BMI
>>> data for this subject since they have no depression data.
>>>
>>> Is there a way in which I can merge these files in a different way to
>>> get what I need or a method to delete cases in the merged data file
>>> based on duplicate IDs and BMI data but no depression data? Any help is
>>> greatly appreciated!
>>>
>>> Best,
>>>
>>> Paul
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the
>>> command. To leave the list, send the command
>>> SIGNOFF SPSSX-L
>>> For a list of commands to manage subscriptions, send the command
>>> INFO REFCARD
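
The quick-and-dirty route quoted above (flatten both files to one row per ID, match 1:1 on ID, drop whatever has no partner) can also be sketched in plain Python. This is only an illustration of the idea, not SPSS syntax and not a tested solution for Paul's files: the toy rows, the helper name to_wide, and the ISO-style date strings are all hypothetical. The last step restructures the matched records back to one row per contact, which gives the multiple-rows-per-ID layout Bruce describes.

```python
from collections import defaultdict

# Hypothetical toy data as (id, date, value); names and values are made up.
bmi_rows = [(1, "2011-01-05", 24.1), (1, "2011-06-10", 24.8), (2, "2011-02-01", 31.0)]
dep_rows = [(1, "2011-03-15", 12), (3, "2011-04-20", 7)]


def to_wide(rows):
    """Collapse long rows to one entry per ID holding all (date, value) pairs."""
    wide = defaultdict(list)
    for subject_id, date, value in rows:
        wide[subject_id].append((date, value))
    return dict(wide)


bmi_wide = to_wide(bmi_rows)
dep_wide = to_wide(dep_rows)

# 1:1 match on ID, keeping only IDs present in both flattened files.
matched = {
    subject_id: {"bmi": bmi_wide[subject_id], "dep": dep_wide[subject_id]}
    for subject_id in sorted(bmi_wide.keys() & dep_wide.keys())
}

# Back to long format: one row per original contact, ordered by ID and date.
long_rows = sorted(
    (subject_id, date, kind, value)
    for subject_id, record in matched.items()
    for kind in ("bmi", "dep")
    for date, value in record[kind]
)

for row in long_rows:
    print(row)
```

With these toy rows, IDs 2 and 3 drop out because each appears in only one file, and ID 1 keeps all three of its contacts.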

-- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Database-Management-Help-tp5601508p5602154.html Sent from the SPSSX Discussion mailing list archive at Nabble.com.


