Date: Fri, 16 Mar 2007 11:11:06 -0400
Reply-To: ben.powell@CLA.CO.UK
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: ben.powell@CLA.CO.UK
Subject: Re: Subsetting data based on similar sounding names
Do you really want SOUNDEX on its own? If a name contains a typo or an
abbreviation, the two versions may not sound alike, so SOUNDEX will not
match them. SOUNDEX would probably work best in conjunction with
something like SPEDIS, which measures the spelling "distance" between
words (names).
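
To illustrate the point, here is a rough sketch in Python rather than SAS.
It is only an approximation: soundex() below is a simplified American
Soundex and may not return exactly what the SAS SOUNDEX function returns,
and edit_distance() is plain Levenshtein distance standing in for SPEDIS,
which is a different, asymmetric measure. The likely_same() helper, its
"match on either test" rule and the max_edits threshold are all my own
assumptions for illustration.

```python
def soundex(name: str) -> str:
    """Simplified American Soundex: first letter plus three digits.
    Assumption: not guaranteed to match the SAS SOUNDEX function."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    letters_only = [c for c in name.upper() if c.isalpha()]
    if not letters_only:
        return ""
    result = letters_only[0]
    prev = codes.get(letters_only[0], "")
    for ch in letters_only[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = codes.get(ch, "")  # vowels (and Y) have no code
        if digit and digit != prev:
            result += digit
        prev = digit
    return (result + "000")[:4]  # pad or truncate to four characters


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a stand-in here for SPEDIS (not the same measure)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def likely_same(a: str, b: str, max_edits: int = 1) -> bool:
    """Hypothetical rule: names match if they sound alike OR are nearly
    identical in spelling. max_edits is an arbitrary threshold."""
    a, b = a.upper(), b.upper()
    return soundex(a) == soundex(b) or edit_distance(a, b) <= max_edits


if __name__ == "__main__":
    for x, y in [("Smith", "Smyth"), ("Johnson", "Johson"), ("William", "Wm")]:
        print(x, y, soundex(x), soundex(y), edit_distance(x.upper(), y.upper()))
```

In this sketch "Johnson" and "Johson" get different Soundex codes but are
only one edit apart, so the distance test catches the typo. "William" and
"Wm" fail both tests, which is the abbreviation problem: neither measure
alone will rescue that without a lookup table or some other data.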
Also, "But why do you want to do that?" ...
If you have no other data to confirm that those names belong to the same
person, they may or may not refer to the same person, so grouping similar
names together may just obscure the issue...
HTH.