LISTSERV at the University of Georgia
Date:         Fri, 25 Jan 2002 13:09:07 -0300
Reply-To:     hmaletta@fibertel.com.ar
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Hector Maletta <hmaletta@fibertel.com.ar>
Subject:      Re: multiple record matching
Comments: To: kmcdonald@dmhmrsas.state.va.us
Content-Type: text/plain; charset=us-ascii

Kathy:

Matching many to many requires defining exactly what you want. Consider an example that I actually encountered not long ago. In an agricultural survey there were variables concerning the farmers' families (one record per person), variables concerning the farm and the household (one record per farm/household; each household corresponded to exactly one farm), and variables referring to specific crops (one record per crop). In this kind of situation I wanted to do the following:

1. Describe each person according to farm/household characteristics (farm size, socioeconomic status, family size, type of housing, etc).
2. Describe farm/households according to personal characteristics of the family members (sex, age, education, etc).
3. Describe farm/households by variables concerning crops grown and their characteristics (area planted, yield, etc), and concerning household characteristics.
4. Describe persons according to characteristics of the crops grown in their farms.

The procedures are as follows (ID is the farm/household identifier, present in every file, and all files must be sorted by ID before matching):

1. MATCH FILES /FILE 'PERSONS.SAV' /TABLE 'FARMS_HOUSEHOLDS.SAV' /BY ID.
2. Open PERSONS.SAV. Use AGGREGATE with the household ID as the break variable (/BREAK ID) to create household-level variables based on personal characteristics (e.g. male members, female members, number of people with higher education, number of children under 18, etc.). Then MATCH the resulting file with FARMS_HOUSEHOLDS.SAV based on household ID.
3. Open CROPS.SAV and use AGGREGATE, again with ID as the break variable, to create variables at farm/household level based on crop characteristics, e.g. number of crops, total value of production, total area under winter crops, total area under cereals, etc. Then MATCH the resulting file with FARMS_HOUSEHOLDS.SAV based on ID.
4. Use FARMS_HOUSEHOLDS.SAV as modified at steps 2 and 3 above, and apply MATCH FILES /FILE 'PERSONS.SAV' /TABLE 'FARMS_HOUSEHOLDS.SAV' /BY ID.
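
For readers who want to see the logic of these steps outside SPSS, here is a minimal sketch in plain Python. It is only an illustration, not SPSS syntax: the sample records, the variable names (NAME, SEX, AGE, FARMSIZE, N_MALE, N_FEMALE, N_UNDER18) and the helper functions are all hypothetical. It aggregates persons to one record per household, adds those aggregates to the farm/household records, and then spreads the enriched household record onto every person, which is roughly what AGGREGATE followed by a FILE/TABLE match does.

```python
from collections import defaultdict

# Hypothetical sample data. ID is the farm/household key shared by both files.
persons = [
    {"ID": 1, "NAME": "Ana", "SEX": "F", "AGE": 40},
    {"ID": 1, "NAME": "Luis", "SEX": "M", "AGE": 12},
    {"ID": 2, "NAME": "Rosa", "SEX": "F", "AGE": 67},
]
farms = [
    {"ID": 1, "FARMSIZE": 12.5},
    {"ID": 2, "FARMSIZE": 3.0},
]


def aggregate_by_household(person_rows):
    """Collapse person records to one record per household ID."""
    counts = defaultdict(lambda: {"N_MALE": 0, "N_FEMALE": 0, "N_UNDER18": 0})
    for p in person_rows:
        h = counts[p["ID"]]
        # The sample data only uses "M" and "F"; anything else would need its own rule.
        if p["SEX"] == "M":
            h["N_MALE"] += 1
        else:
            h["N_FEMALE"] += 1
        if p["AGE"] < 18:
            h["N_UNDER18"] += 1
    return [{"ID": hid, **c} for hid, c in sorted(counts.items())]


def table_lookup(file_rows, table_rows, key="ID"):
    """Return one row per row of file_rows, widened with the table row sharing its key."""
    table = {}
    for row in table_rows:
        if row[key] in table:
            raise ValueError(f"duplicate key {row[key]!r} in lookup table")
        table[row[key]] = row
    table_vars = [name for name in table_rows[0] if name != key] if table_rows else []
    result = []
    for r in file_rows:
        merged = dict(r)
        hit = table.get(r[key])
        for name in table_vars:
            if name not in merged:  # on a name clash the file row wins in this sketch
                merged[name] = hit[name] if hit else None
        result.append(merged)
    return result


# Steps 2 and 3: add household-level aggregates to the farm/household file.
households = table_lookup(farms, aggregate_by_household(persons))
# Steps 1 and 4: spread the enriched household record onto each person.
persons_full = table_lookup(persons, households)
```

The lookup table must hold each ID only once, which is why the sketch raises an error on a duplicate key; that is the same requirement that makes a file usable as a TABLE.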

Directly matching "many to many" (e.g. crops to people) can be done in a different manner. Suppose no farm grows more than, say, three crops. The crops grown across all farms may be many, but there are no more than three per farm. Create one file for the first crop named, one file for the second crop, and one file for the third crop. Suppose the CROPS.SAV file has a variable CROPNUM valued 1, 2, or 3 for the first, second or third crop in the farm. Then:

GET FILE 'CROPS.SAV'.
TEMPORARY.
SELECT IF (CROPNUM=1).
SAVE OUTFILE 'CROP_1.SAV'.
TEMPORARY.
SELECT IF (CROPNUM=2).
SAVE OUTFILE 'CROP_2.SAV'.
TEMPORARY.
SELECT IF (CROPNUM=3).
SAVE OUTFILE 'CROP_3.SAV'.

Note that the crop variables must carry different names in each of the three files (for instance by adding a RENAME subcommand to each SAVE, so that a variable such as AREA becomes AREA1, AREA2 and AREA3). Otherwise MATCH FILES keeps only the values from the first file named for any variables that share a name.

MATCH FILES /FILE 'FARMS_HOUSEHOLDS.SAV' /FILE 'CROP_1.SAV'
  /FILE 'CROP_2.SAV' /FILE 'CROP_3.SAV' /BY ID.
SAVE OUTFILE 'FARMS_HOUSEHOLDS_CROPS.SAV'.
MATCH FILES /FILE 'PERSONS.SAV' /TABLE 'FARMS_HOUSEHOLDS_CROPS.SAV' /BY ID.
SAVE OUTFILE 'PERSONS_CROPS.SAV'.

The resulting file contains one record per person, with personal variables, farm/household variables, and variables for up to three crops.
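
The reshaping behind this approach can also be sketched in plain Python. Again this is only an illustration under stated assumptions: the sample records, the crop variables CROP and AREA, and the helper names are hypothetical. Each crop record is moved into a numbered slot of its farm's record (CROP1, AREA1, CROP2, AREA2, ...), and the resulting one-record-per-farm table is then attached to every person in that farm.

```python
MAX_CROPS = 3  # the message assumes no farm grows more than three crops

# Hypothetical sample data. ID is the farm/household key, CROPNUM the crop's slot.
crops = [
    {"ID": 1, "CROPNUM": 1, "CROP": "wheat", "AREA": 5.0},
    {"ID": 1, "CROPNUM": 2, "CROP": "maize", "AREA": 2.5},
    {"ID": 2, "CROPNUM": 1, "CROP": "barley", "AREA": 1.0},
]
persons = [
    {"ID": 1, "NAME": "Ana"},
    {"ID": 1, "NAME": "Luis"},
    {"ID": 2, "NAME": "Rosa"},
]


def crops_wide(crop_rows, max_crops=MAX_CROPS):
    """Build one record per farm, with each crop variable suffixed by its slot number."""
    crop_vars = sorted({n for c in crop_rows for n in c if n not in ("ID", "CROPNUM")})
    wide = {}
    for c in crop_rows:
        slot = c["CROPNUM"]
        if not 1 <= slot <= max_crops:
            raise ValueError(f"CROPNUM {slot} outside 1..{max_crops}")
        row = wide.setdefault(c["ID"], {"ID": c["ID"]})
        for name in crop_vars:
            row[f"{name}{slot}"] = c.get(name)
    # Unused slots are filled with None, standing in for missing values.
    for row in wide.values():
        for slot in range(1, max_crops + 1):
            for name in crop_vars:
                row.setdefault(f"{name}{slot}", None)
    return wide


def attach_to_persons(person_rows, wide):
    """One output record per person, carrying the farm's crop slots."""
    result = []
    for p in person_rows:
        merged = dict(p)
        # Farms absent from the crop data simply get no crop fields in this sketch.
        for name, value in wide.get(p["ID"], {}).items():
            merged.setdefault(name, value)
        result.append(merged)
    return result


persons_crops = attach_to_persons(persons, crops_wide(crops))
```

Suffixing the variable names with the slot number plays the role of the renaming that keeps the three crop files from overwriting one another when they are matched.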

There are actually other, more elegant ways to add the three crops' variables to the FARMS_HOUSEHOLDS file; I have used the simplest for the sake of clarity.

Hector Maletta
Universidad del Salvador
Buenos Aires, Argentina

kmcdonald@dmhmrsas.state.va.us wrote:
>
> How do you deal with both files having more than one record?
>
> -----Original Message-----
> From: Hector Maletta [mailto:hmaletta@fibertel.com.ar]
> Sent: Thursday, January 24, 2002 3:05 PM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: Re: aggregate
>
> Jessica:
> Use the MATCH FILES command, designating the non-duplicated file as a
> TABLE, as follows:
> MATCH FILES /TABLE 'FILE 1'/FILE 'FILE 2' /BY id.
>
> ID is the name I assign to the matching variable. The resulting file
> will have one record per case existing in File 2, and each record will
> have all variables from File 1 (repeated for all the duplicated cases
> with the same ID) plus all variables from File 2 (except those that bear
> the same name as in File 1, in which case SPSS takes the values
> from the file that is mentioned first in the command). You can reverse the
> order of the files in the command (always taking FILE 1 as a TABLE), if
> you wish that the version in File 2 is preserved in the case of
> variables existing in both files.
>
> Hector Maletta
> Universidad del Salvador
> Buenos Aires, Argentina
>
> Wozniak wrote:
> >
> > I have two files that I am trying to match up on a certain variable. One
> > file has no duplicates for that variable, the other has more than one
> > instance of the same variable.
> > For example: file one has:
> > 1
> > 2
> > 3
> > 4
> > 5
> >
> > file two looks like this for that variable
> > 1
> > 1
> > 2
> > 3
> > 3
> > 3
> > 4
> > 5
> > 5
> > 5
> > I want to match them up so that the numbers look like this
> >
> > from File one   from File two
> >
> > 1               1
> > 1               1
> > 2               2
> > 3               3
> > 3               3
> > 3               3
> > 4               4
> > 5               5
> > 5               5
> > 5               5
> > any suggestions would be greatly appreciated. thanks
> > Jessica
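
The one-to-many match described in the quoted message can be checked against the numbers in the original question with a small plain-Python sketch. It is only an illustration: the function name and the extra variable SCORE (present in both files, to show which version survives) are hypothetical. The file given as the table is listed first, so its value is kept for any variable that exists in both files, as the quoted message describes.

```python
# Key values taken from the question: File 1 has each id once, File 2 repeats ids.
file1_ids = [1, 2, 3, 4, 5]
file2_ids = [1, 1, 2, 3, 3, 3, 4, 5, 5, 5]

# Hypothetical records. SCORE exists in both files to show which version is kept.
file1 = [{"id": i, "from_file1": i, "SCORE": 100 + i} for i in file1_ids]
file2 = [{"id": i, "from_file2": i, "SCORE": -1} for i in file2_ids]


def match_table_first(table_rows, file_rows, key="id"):
    """One output row per row of file_rows; table values win on name clashes."""
    table = {row[key]: row for row in table_rows}
    if len(table) != len(table_rows):
        raise ValueError("the table file must not contain duplicate keys")
    out = []
    for r in file_rows:
        # Start from the table row (listed first), then add what only the file row has.
        # A file row with no table match keeps just its own variables in this sketch.
        merged = dict(table.get(r[key], {key: r[key]}))
        for name, value in r.items():
            merged.setdefault(name, value)
        out.append(merged)
    return out


matched = match_table_first(file1, file2)
pairs = [(m["from_file1"], m["from_file2"]) for m in matched]
for left, right in pairs:
    print(left, right)
```

Swapping the order of the two loops over variables (file row first, table row second) would model the reversed command, in which the File 2 version of a shared variable is preserved.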

