Date: Wed, 28 Mar 2012 17:54:28 -0700
Reply-To: Bruce Weaver <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Bruce Weaver <firstname.lastname@example.org>
Subject: Re: Database Management Help
Content-Type: text/plain; charset=us-ascii
I *think* he wants one row for each unique ID x Date combination. If he
happens to have both BMI and Depression data for that row, both variables
will have valid values. On other rows (most of them), only one of the two
variables will have a valid score. But yes...it is ESPeculation on my part.
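
A minimal sketch of that reading, under several assumptions that are not in
Paul's post: the file names (bmi.sav, depression.sav) and variable names
(ID, ContactDate) are hypothetical, both files are already sorted by ID and
ContactDate, and each file has at most one row per ID x date.

```spss
* Sketch only, file and variable names are assumed for illustration.
* Merge on ID and date so matching dates share a row, other dates get their own row.
MATCH FILES
  /FILE='bmi.sav' /IN=inBMI
  /FILE='depression.sav' /IN=inDep
  /BY ID ContactDate.
EXECUTE.

* Flag, per subject, whether any row came from each source file.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=ID
  /anyBMI = MAX(inBMI)
  /anyDep = MAX(inDep).

* Keep only subjects who have at least one BMI row and one depression row.
SELECT IF (anyBMI = 1 AND anyDep = 1).
EXECUTE.
```

The /IN flags record which file contributed each row, and their maximum
within ID shows whether the subject appears in each file at all. A subject
with 7 BMI dates and no depression rows would get anyDep = 0 and be dropped,
while a subject with 10 BMI dates and one non-matching depression date would
keep all eleven rows.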
David Marso wrote
> Impossible to tell *What* Paul wants as the endgame. My basic claim is
> that it will be much easier to "remove all the BMI data for this subject
> since they have no depression data" if the depression file is
> -denormalized/flattened/casetovar'd- and used as a TABLE into the 'BMI
> data'. NUKE the unassociated cases then VARSTOCASES if you want to use a
> hammer or some flavor of VECTOR/LOOP/XSAVE if one is into tweezers. Since
> we are ESPssPeculating at this point I will refrain from any further
> guessing pending further reply. ADD w /INs ->FLAG followed by AGGREGATE
> -MODE ADDVAR MAX(FLAG) will also work but what should be associated with
> what? ALL BMI with ALL depression? Most recent? Before? After? BOTH?
> Maybe start with ALL<-> ALL and then let the elves sort the fairy dust
> after the fact ;-).
> "I understood this to mean that Paul wants the final file to have multiple
> rows per ID, not one row per ID. David's method results in the latter, I
> think."
> Bruce Weaver wrote
>> Paul (the OP) wrote:
>> "For example I might have one subject whose BMI was collected 10 times on
>> different dates and none of these dates match the depression data for
>> this subject. So after the merge I have eleven rows for this subject that
>> I would like to preserve (all BMI data and depression data). Another
>> subject may have BMI scores collected at 7 different dates, but this
>> subject has no depression data. I would like to remove all the BMI data
>> for this subject since they have no depression data."
>> I understood this to mean that Paul wants the final file to have multiple
>> rows per ID, not one row per ID. David's method results in the latter, I
>> think.
>> This illustrates once again how helpful it is to post small examples
>> showing what the file looks like originally, and what you want it to look
>> like afterward! ;-)
>> David Marso wrote
>>> Quick and dirty would be to do
>>> 1. CASESTOVARS on both files (use different varnames for the dates in
>>> the 2 files).
>>> 2. Simple 1:1 Match at that point.
>>> 3. Let the devil sort it out later with some basic logic after nuking
>>> the obvious crap.
>>> Michael, Paul G. wrote
>>>> Hi All,
>>>> I have two data sets that I would like to merge using ID as the keyed
>>>> variable, and each data file has duplicate IDs. The variables of
>>>> interest in the first dataset are ID, date of contact (DD-MM-YYYY), and
>>>> Body Mass Index (BMI). The variables of interest in the second data set
>>>> are ID, date of contact (DD-MM-YYYY), and depression score. The
>>>> duplicate IDs occur because some subjects had more than 1 BMI score
>>>> from different time points and/or more than 1 depression score at
>>>> different time points.
>>>> The contact dates from each file do not match up in all instances (in
>>>> fact very few subjects have the same contact date in both files). I
>>>> only want to keep subjects who have both BMI data and depression data
>>>> but I need to preserve information from all the contact dates.
>>>> When I try a simple merge by adding variables (e.g., depression score
>>>> and contact date) to the BMI data set using ID as the keyed variable, I
>>>> run into the problem of having to go through thousands of cases and
>>>> delete those that don't have both BMI and depression data.
>>>> For example I might have one subject whose BMI was collected 10 times
>>>> on different dates and none of these dates match the depression data
>>>> for this subject. So after the merge I have eleven rows for this
>>>> subject that I would like to preserve (all BMI data and depression
>>>> data). Another subject may have BMI scores collected at 7 different
>>>> dates, but this subject has no depression data. I would like to remove
>>>> all the BMI data for this subject since they have no depression data.
>>>> Is there a way in which I can merge these files in a different way to
>>>> get what I need or a method to delete cases in the merged data file
>>>> based on duplicate IDs and BMI data but no depression data? Any help is
>>>> greatly appreciated!
>>>> To manage your subscription to SPSSX-L, send a message to
>>>> LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
>>>> command. To leave the list, send the command
>>>> SIGNOFF SPSSX-L
>>>> For a list of commands to manage subscriptions, send the command
>>>> INFO REFCARD
"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Database-Management-Help-tp5601508p5602258.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.