Date: Wed, 9 Nov 2005 10:07:42 -0500
Reply-To: "Fehd, Ronald J" <rjf2@CDC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Fehd, Ronald J" <rjf2@CDC.GOV>
Subject: Re: Ideas for cleaning up names
Content-Type: text/plain; charset="US-ASCII"
DataFlux was made for this problem:
http://support.sas.com/rnd/warehousing/cleanse/
Ron Fehd the macro maven CDC Atlanta GA USA RJF2 at cdc dot gov
> -----Original Message-----
> From: Tonkovich, Mike
> I've got 14,000 first names and many of them look like the following:
>
> TOM%J
>  JONAS
> ALFFRED G CARTER JR
> C. ROBERT
>
> These are all values for first name. You'll notice that the first has a
> special character, the 2nd starts with a blank space, the 3rd contains 4
> separate words, and the 4th is a first initial and last name. I thought
> about using the substr function, pulling out each character one at a time,
> sorting on each newly created character, and deleting those records
> that have special characters (., %, etc.) and those that have
> missing values (this would get rid of #3). It seems like a fairly
> primitive approach, but I can't think of anything else to
> try. I would greatly appreciate any suggestions you may have.
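
The character-by-character idea in the quoted message can be sketched without pulling each character into its own variable. What follows is only an illustration in Python, not SAS and not anything DataFlux does: the function name flag_name, the flag names, and the rule that "special" means anything other than an ASCII letter or whitespace are all assumptions made for the example. It flags each value rather than deleting it, so the suspect records can be reviewed first.

```python
import re

def flag_name(raw):
    """Return simple quality flags for one first-name value (hypothetical helper)."""
    stripped = raw.strip()
    return {
        # True when the value begins with whitespace
        "leading_blank": raw != raw.lstrip(),
        # Assumption: anything that is not an ASCII letter or whitespace is "special"
        "special_char": re.search(r"[^A-Za-z\s]", raw) is not None,
        # Number of whitespace-separated words after trimming
        "word_count": len(stripped.split()),
    }

# The four sample values from the quoted message
samples = ["TOM%J", " JONAS", "ALFFRED G CARTER JR", "C. ROBERT"]
for value in samples:
    print(repr(value), flag_name(value))
```

Note that under this rule legitimate names containing a hyphen or apostrophe would also be flagged as special, so the pattern would need adjusting before anything is deleted on the strength of it.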