| Date: | Wed, 9 Nov 2005 09:16:08 -0500 |
| Reply-To: | "Tonkovich, Mike" <Mike.Tonkovich@DNR.STATE.OH.US> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "Tonkovich, Mike" <Mike.Tonkovich@DNR.STATE.OH.US> |
| Subject: | Ideas for cleaning up names |
| Content-Type: | text/plain; charset="US-ASCII" |
Folks,
I've got 14,000 first names and many of them look like the following:
TOM%J
 JONAS
ALFFRED G CARTER JR
C. ROBERT
These are all values for first name. You'll notice that the 1st has a
special character, the 2nd starts with a blank space, the 3rd contains 4
separate words, and the 4th is a first initial and a last name. I thought
about using the substr function to pull out each character one at a time,
then sorting on each newly created character variable and deleting those
records that have special characters (., %, etc.) and those that have
missing values (this would get rid of #3). It seems like a fairly
primitive approach, but I can't think of anything else to try. I would
greatly appreciate any suggestions you may have.
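
To make the idea concrete, here is a rough sketch of the character-by-character check, written in Python purely for illustration (the real data are in SAS). The function name name_problems and the rule that only the uppercase letters A-Z are acceptable are assumptions based on the four samples above, not a tested solution. Rather than deleting records outright, it reports why each value looks suspect:

```python
import string

# Assumption: a clean first name is a single word made only of A-Z,
# as in the uppercase samples above.
ALLOWED = set(string.ascii_uppercase)

def name_problems(raw):
    """Return a list of reasons why a first-name value looks unclean."""
    problems = []
    trimmed = raw.strip()
    # Case 2: value starts (or ends) with a blank.
    if raw != trimmed:
        problems.append("leading or trailing blank")
    # Cases 3 and 4: more than one blank-separated word.
    if len(trimmed.split()) > 1:
        problems.append("more than one word")
    # Cases 1 and 4: any non-blank character outside A-Z.
    if any(ch not in ALLOWED and not ch.isspace() for ch in trimmed):
        problems.append("special character")
    return problems

if __name__ == "__main__":
    for name in ["TOM%J", " JONAS", "ALFFRED G CARTER JR", "C. ROBERT"]:
        print(repr(name), name_problems(name))
```

Flagging instead of deleting would leave room to repair some values (for example, trimming the blank in #2) before deciding what to drop.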
Thanks in advance for your help.
Mike
Michael J. Tonkovich, Ph.D.
Wildlife Research Biologist
ODNR, Division of Wildlife
360 E. State St.
Athens, OH 45701
v(740)589.9920 f(740)589.9925
mike.tonkovich@dnr.state.oh.us