LISTSERV at the University of Georgia
Date:         Wed, 16 Jan 2008 13:06:14 -0500
Reply-To:     Gene Maguin <>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Gene Maguin <>
Subject:      Re: Restructuring a fuzzy matched data set
In-Reply-To:  <>
Content-Type: text/plain; charset="us-ascii"


>>I am trying to unduplicate a file using a fuzzy match approach via the CDC software LinkPlus. The issue I'm having is restructuring the data set so that all IDs that match appear on one row. For example, LinkPlus spits out a data set (actually a report) that has all pairs of matches:

Match#  ID  Name
1       27  Carl
1       42  Carl
2       27  Carl
2       53  Carl
3       42  Carl
3       53  Carl
4       18  Sue
4       99  Sue

I'd love to have a data set that looks like this:

Match1  Match2  Match3
27      42      53
18      99

I think I'd work the problem this way.

Sort cases by name id.

Compute dups=0.
If (id eq lag(id)) dups=lag(dups)+1.

Select if (dups eq 0).
* Now you have a set of unique ids within a name value.
* Use Casestovars to build a single record.

Casestovars /id=name.

I really never use casestovars, so you may need to fiddle around with it a bit, but I think you will get records consisting of Name, Match1 Match2 ... MatchN, and ID1 ID2 ... IDN.

You will be interested in the variables ID1 thru IDN.
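
To see what the sort, flag-duplicates, and cases-to-variables steps should produce on the sample data from the question, here is a minimal sketch in Python rather than SPSS. It is only an illustration of the logic, not a replacement for the syntax above. The names pairs and unique_ids_by_name are hypothetical, and the sketch compares name as well as id when spotting repeats, which is a small assumption beyond the posted syntax.

```python
# Sample rows from the LinkPlus report in the question: (match number, id, name).
pairs = [
    (1, 27, "Carl"), (1, 42, "Carl"),
    (2, 27, "Carl"), (2, 53, "Carl"),
    (3, 42, "Carl"), (3, 53, "Carl"),
    (4, 18, "Sue"), (4, 99, "Sue"),
]


def unique_ids_by_name(rows):
    """Hypothetical helper mimicking SORT CASES, the dups flag, and CASESTOVARS."""
    wide = {}
    prev = None
    # Sort by name, then id, so repeated ids within a name sit next to each other.
    for _, id_, name in sorted(rows, key=lambda r: (r[2], r[1])):
        # Keep only the first occurrence of each id (the "dups eq 0" cases).
        # Assumption: name is compared too, so a repeat is only counted within a name.
        if (name, id_) != prev:
            wide.setdefault(name, []).append(id_)
        prev = (name, id_)
    return wide


result = unique_ids_by_name(pairs)
for name, ids in result.items():
    # One output row per name, with its unique ids spread across the row.
    print(name, *ids)
```

On the sample data this gives one row for Carl with ids 27, 42, and 53, and one row for Sue with ids 18 and 99, which matches the layout asked for in the question.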

Gene Maguin
