LISTSERV at the University of Georgia
Date:   Sun, 5 Jul 1998 16:18:23 MET
Reply-To:   "M. MILLS" <m.mills@FRW.RUG.NL>
Sender:   "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From:   "M. MILLS" <m.mills@FRW.RUG.NL>
Subject:   Linking files revisited

Hi once again. I would like to restate my question about linking files, this time in a less narrative manner! Sorry for the confusion. I have two files, household and individual. I will first explain what they look like and then what I wish to do with them.

1. HOUSEHOLD DATA FILE

9 digit hhld id
aaabbbccc (id)..hhwater(etc)..hlin$01..hrel$01(etc)..hlin$02...hrel$02
aaabbbddd (id)..hhwater(etc)..hlin$01..hrel$01(etc)..hlin$02...hrel$02
eeefffggg (id)..hhwater(etc)..hlin$01..hrel$01(etc)..hlin$02...hrel$02
eeefffhhh (id)..hhwater(etc)..hlin$01..hrel$01(etc)..hlin$02...hrel$02

The first variable is a 9 digit household identification (its three components are also available separately in both data files; in the household file they are hhstate=aaa, hhtown=bbb, hhnumber=ccc). This id is followed by a number of household-specific variables (heating, water source, etc.). After that, the file holds a block of variables for each individual household member, for up to 38 members.

NOTE: The household file was used as a 'filter' to collect basic household information and then to isolate only ever-married women of reproductive age. Therefore, not all individuals carry over to the individual file (e.g., men and older or younger women do not), and there may be multiple women from one household.

2. INDIVIDUAL DATA FILE

11 digit case id
aaabbbcccxx (id)...state...town...number...line...education...work etc.
aaabbbdddzz (id)...state...town...number...line...education...work etc.
eeefffgggyy (id)...state...town...number...line...education...work etc.
eeefffhhhww (id)...state...town...number...line...education...work etc.

In this file, the individual case identification has an *additional* 2-digit variable, the 'line number' (in other words, the raw text data was probably entered with the hhld as the first line and each individual sequentially thereafter).

***HERE IS THE CLINCHER: if, for example, line=03 in the Individual Data File (this is also the last 2 digits of the 11 digit case id), then all of the information for this 3rd person in the household is stored under the variables 'hhlin$03, hhrel$03', etc. in the Household Data File. Once again, each component of the case id is available separately in the individual data file under slightly different names (state=aaa, town=bbb, number=ccc, line=xx).
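To make the key structure concrete, here is a minimal sketch in Python (not SPSS, purely for illustration). It assumes the ids are stored as strings so that leading zeros survive, and that the member-specific variables are named with the stems 'hhlin' and 'hhrel' followed by '$' and the 2-digit line. The function names and the example id are made up.

```python
# Illustration only: the id below is a made-up placeholder, not real data.

def split_case_id(case_id: str) -> tuple[str, str]:
    """Split an 11-digit case id into (9-digit household id, 2-digit line)."""
    if len(case_id) != 11:
        raise ValueError("expected an 11-character case id")
    return case_id[:9], case_id[9:]


def member_variables(line: str, stems=("hhlin", "hhrel")) -> list[str]:
    """Names of the household-file variables that hold this person's data.

    The stems are an assumption; the real file has more of them ("etc.").
    """
    return [f"{stem}${line}" for stem in stems]


hhld_id, line = split_case_id("00100200303")
print(hhld_id, line)           # 001002003 03
print(member_variables(line))  # ['hhlin$03', 'hhrel$03']
```

So the first 9 digits are the match key into the household file, and the last 2 digits say which suffixed block of variables belongs to the woman.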

WHAT I WANT TO DO.

What I want to do is to link all of this 'basic' (e.g., water, heating) AND 'individual-specific' household data (e.g., relation to head of household, etc.) with each individual woman in the individual data file. Therefore, for the first woman, with state=aaa / town=bbb / household=ccc / line=xx, I want to link her to the household information for state=aaa / town=bbb / household=ccc, namely:

a) the basic household variables, listed only once (hhwater, etc.); and,
b) the individual-specific household variables 'hhline$xx, hhrel$xx, etc.'

This is needed both for information and to de-select multiple women from one household during regression, in order not to violate the assumption of the independence of observations (but that's another topic!).

I have all of the pieces of the puzzle, but can't seem to put them together! I hope this is clearer. Any suggestions are welcome. Thanks for taking the time to read my question.

Melinda
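P.S. A rough sketch of the result I am after, again in Python rather than SPSS and only as an illustration. All records are made up, the function name `link` is hypothetical, and the stems 'hhlin' and 'hhrel' are assumed spellings of the member-specific variables. Each woman gets the basic household variables once, plus only the '$xx' variables whose suffix matches her own line.

```python
# Illustration only: made-up records, not real data.
households = {
    "001002003": {
        "hhwater": 1,
        "hhlin$01": 1, "hhrel$01": 1,
        "hhlin$02": 2, "hhrel$02": 2,
        "hhlin$03": 3, "hhrel$03": 3,
    },
}
women = [
    {"state": "001", "town": "002", "number": "003", "line": "02", "education": 4},
    {"state": "001", "town": "002", "number": "003", "line": "03", "education": 2},
]


def link(woman, households):
    """Return a copy of the woman's record with her household data attached."""
    # The 9-digit household id is state + town + number.
    hh = households[woman["state"] + woman["town"] + woman["number"]]
    linked = dict(woman)
    for name, value in hh.items():
        if "$" not in name:
            linked[name] = value      # basic household variable, copied as-is
        else:
            stem, suffix = name.split("$")
            if suffix == woman["line"]:
                linked[stem] = value  # her own member-specific value
    return linked


linked = [link(w, households) for w in women]
print(linked[1]["hhwater"], linked[1]["hhrel"])  # 1 3
```

Both women come from the same household here, so they share hhwater but each keeps only her own hhlin and hhrel values.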

