LISTSERV at the University of Georgia
Date:         Thu, 6 Aug 1998 11:39:16 -0400
Reply-To:     RHOADSM1 <RHOADSM1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         RHOADSM1 <RHOADSM1@WESTAT.COM>
Subject:      Re: Data categorization for fuzzy match?
Comments: To: "Self; Karsten" <kself@VISA.COM>
Content-Type: text/plain; charset=US-ASCII

Karsten,

Would an off-the-shelf matching package be an option? This is a common enough problem that canned packages do exist, and I expect that they address your issues. I have heard good things about AutoMatch -- more info is available from www.matchware.com.

Mike Rhoads
Westat
RhoadsM1@Westat.com

<<original posting>>

I am comparing methods of matching membership data based on several key and demographic fields in very large datasets (100 million+ records).

I need to find ways of restricting the number of potential matches. I am looking for ideas or references to:

- Hash or key numeric fields such that transposes and near-misses are keyed with identical or similar values. Should be suitable for SSN.

- Hash or key text fields so that they may be searched readily for similar words and/or text elements. Should be suitable for name and address data.
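A rough sketch of the two kinds of keys described in the list above, in Python. This is only an illustration, not something taken from the thread or from any matching package. The numeric key simply sorts the digits, so any transposition produces the same key; it does not catch a mistyped digit, which would need a separate technique. The text key is the classic American Soundex code, which is only one of several phonetic schemes and works best on surnames. The function names, the `ssn` and `last_name` field names, and the way the two keys are combined are all assumptions made for the example.

```python
from collections import defaultdict


def digit_multiset_key(value: str) -> str:
    """Return the digits of `value` in sorted order.

    Any transposition of digits yields the same key. A substituted
    digit yields a different key, so near-misses are NOT covered.
    """
    return "".join(sorted(c for c in value if c.isdigit()))


# Classic Soundex letter groups; letters not listed (vowels, H, W, Y)
# carry no code.
_SOUNDEX_CODES = {}
for _letters, _code in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                        ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _code


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for `name`.

    Assumes plain ASCII letters; anything else is ignored.
    """
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def block_records(records):
    """Group records by (digit key, Soundex key).

    Only records that land in the same block would then be compared
    in detail. Field names here are hypothetical.
    """
    blocks = defaultdict(list)
    for rec in records:
        key = (digit_multiset_key(rec["ssn"]), soundex(rec["last_name"]))
        blocks[key].append(rec)
    return blocks
```

With keys like these, the expensive pairwise comparison only has to run within each block rather than across the whole file. Requiring both keys to agree is the strictest choice; blocking on each key separately and taking the union of candidate pairs would catch more matches at the cost of more comparisons.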

Thanks.

--
Karsten M. Self (kself@visa.com)
Trilogy Consulting

What part of "Gestalt" don't you understand?

