Date: Fri, 2 Oct 2009 17:43:13 -0700
Reply-To: xlr82sas <xlr82sas@AOL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: xlr82sas <xlr82sas@AOL.COM>
Organization: http://groups.google.com
Subject: Re: How to catch the misspelled names?
Content-Type: text/plain; charset=ISO-8859-1
On Oct 2, 2:53 pm, art...@NETSCAPE.NET (Arthur Tabachneck) wrote:
> Nancy,
>
> In addition to Dan's always sage advice, if and only if you have a rather
> large budget to work with, you can look into a SAS product called dataflux.
>
> See: http://www.dataflux.com
>
> It purports to identify inconsistencies in name, address and similar fields.
>
> Art
> ---------
>
>
>
> On Fri, 2 Oct 2009 13:23:08 -0700, Nancy <nancy0...@GMAIL.COM> wrote:
> >Hello Everyone,
>
> >Is there any software or procedure in SAS or SQL to catch
> >misspelled names?
>
> >For example:
>
> >Data set:
> >record 1: Nancy Lee
> >record 2: Nanncy Lee
>
> >How can we pick up these two records as duplicate records?
>
> >Thank you!
>
> >Nancy
For a cleaner version, see
http://homepage.mac.com/magdelina/.Public/utl.html
utl_tipweb
T002580 SAS SPELL CHECKER
The public folder has a dictionary called wrd.sas7bdat, with one
variable, wrd, and a unique index on wrd. (The dataset is in Unix format.)
It is easy to build dictionaries from public sources on the net; see the
last line.
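
As a rough illustration of building such a dictionary, here is a minimal
sketch. It assumes a plain-text word list with one word per line; the file
name words.txt is hypothetical, and storing the words in upper case is my
assumption, not something the post states. Duplicates are removed before the
unique index is created, because a unique index cannot be built on repeated
values.

```sas
/* Sketch only: words.txt is a hypothetical file, one word per line. */
libname dic "/home/xxxxxxxx/utl";

/* Read the raw word list and normalize case (assumed convention). */
data work.rawwords;
  infile "words.txt" truncover;
  length wrd $40;
  input wrd $40.;
  wrd = upcase(wrd);
  if wrd ne "";            /* skip blank lines */
run;

/* Drop duplicate words, then store the result with a unique index on wrd. */
proc sort data=work.rawwords
          out=dic.wrd (index=(wrd / unique))
          nodupkey;
  by wrd;
run;
```

The $40 length is arbitrary; pick one that covers the longest word in the
list you download.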
data chkspl;
  length lyn $80; /* without this, lyn takes the length of the first string and the second is truncated */
  lyn="The qwik bron fox ran over the chicken koop";
  output;
  lyn="Today is the frst day of the rst of yur life";
  output;
run;
libname dic "/home/xxxxxxxx/utl"; /* dictionary is in Unix format and indexed */
%utl_SplChk(utl_dic=dic.wrd,utl_sd1=work.chkspl,utl_var=lyn);
/*
Output
Obs    WRD     DES
 1     BRON    Unrecognized
 2     FRST    Unrecognized
 3     KOOP    Unrecognized
 4     QWIK    Unrecognized
 5     RST     Unrecognized
 6     YUR     Unrecognized
*/
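
For anyone who cannot get at the macro, the general idea can be sketched in
plain data steps. This is not the source of %utl_SplChk, only my guess at
one way to do the same job: split each line into words, upcase them, and
look each one up through the index on dic.wrd. The delimiter list, the work
dataset names (tokens, unrecognized), and the upper-case dictionary are all
assumptions.

```sas
/* Sketch only, not the actual %utl_SplChk code. */

/* 1. One observation per word, upper-cased. */
data work.tokens (keep=wrd);
  set work.chkspl;
  length wrd $40;
  do i = 1 to countw(lyn, " .,;:!?");
    wrd = upcase(scan(lyn, i, " .,;:!?"));
    output;
  end;
run;

/* 2. Keep each distinct word once, in sorted order. */
proc sort data=work.tokens nodupkey;
  by wrd;
run;

/* 3. Indexed lookup: a nonzero _IORC_ means the word is not in dic.wrd. */
data work.unrecognized (keep=wrd des);
  set work.tokens;
  length des $12;
  set dic.wrd key=wrd / unique;
  if _iorc_ ne 0 then do;
    _error_ = 0;           /* a failed lookup is expected, not an error */
    des = "Unrecognized";
    output;
  end;
run;

proc print data=work.unrecognized;
run;
```

The KEY= lookup is what makes the index on wrd worthwhile: each word is
checked directly against the index rather than by scanning the whole
dictionary.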
http://wordlist.sourceforge.net/ is a good place to find
dictionaries.