Date: Tue, 16 Nov 1999 18:35:39 -0500
Reply-To: andrea.wainwright@CAPITALONE.COM
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Andrea Wainwright <andrea.wainwright@CAPITALONE.COM>
Subject: Re: Proper Case address for SAS mailing
Content-Type: text/plain; charset="US-ASCII"
Another thing that needs to be taken into account is names like
John III. They quite often come through the mail as John Iii.
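A minimal sketch of one way to guard against this, written in Python rather
than SAS purely to show the logic. The function name and the suffix list are
assumptions made for illustration; they are not taken from any macro posted
in this thread.

```python
# Hypothetical suffix list: extend it to suit your own data.
GENERATIONAL_SUFFIXES = {"II", "III", "IV"}

def proper_case_with_suffix(name):
    """Proper-case each word, but keep a trailing generational suffix upper case."""
    words = name.split()
    result = []
    for position, word in enumerate(words):
        is_last = position == len(words) - 1
        # Only the final word of a multi-word name is treated as a suffix.
        if is_last and len(words) > 1 and word.upper() in GENERATIONAL_SUFFIXES:
            result.append(word.upper())
        else:
            result.append(word.capitalize())
    return " ".join(result)

print(proper_case_with_suffix("JOHN SMITH III"))
```

Restricting the check to the last word is a design choice in this sketch, so
that a suffix-like word elsewhere in the name is still proper-cased normally.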
______________________________ Reply Separator _________________________________
Subject: Re: Proper Case address for SAS mailing
Author: Mark Bodt <markbodt@STSS.CO.NZ> at Internet
Date: 11/17/1999 9:11 AM
Alan,
you will probably get several replies with macro code which will address
your problem to a degree.
I also wrote a macro many years ago (in the days of 6.06) to do
this. It was for an HR dept, also for a mail out to several thousand
employees. The department used a legacy system which held all data in
uppercase.
In this posting, I want to make you aware of one of the problems that we
faced.
As people have a whole range of surnames and come from different ethnic
backgrounds, it is not correct to assume that a name will always have the
capitalisation that is common, e.g. JOHN SMITH -> John Smith.
People are very particular about their names and it is important to spell
them correctly, but also to capitalise them correctly. After our first mail
out we received a lot of feedback requesting that the capitalisation be
improved.
You can make generalisations about the rules, e.g. if the first two letters
of a word are Mc as in McDonald, then the third letter should be a capital.
However these are only generalisations and cannot be used as rules.
For example MacDonald could have the rule: if the first 3 letters are Mac
then make the 4th a capital. But in fact this rule would not work for a
name like Macey. And some MacDonalds actually spell their name Macdonald.
Ethnic diversity also adds complications, e.g. deSilva.
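To show how such a generalisation goes wrong, here is a small sketch in
Python (not SAS, and not the original macro). The function name is
hypothetical, and the prefix handling is only an assumption about how a
naive Mc/Mac rule might be coded.

```python
def naive_prefix_rule(word):
    """Capitalise the letter after a leading Mc or Mac.

    This is the generalisation described above, applied blindly.
    """
    cased = word.capitalize()
    for prefix in ("Mac", "Mc"):
        if cased.startswith(prefix) and len(cased) > len(prefix):
            n = len(prefix)
            return cased[:n] + cased[n].upper() + cased[n + 1:]
    return cased

for surname in ("MCDONALD", "MACDONALD", "MACEY", "DESILVA"):
    print(surname, "->", naive_prefix_rule(surname))
```

MCDONALD and MACDONALD come out as McDonald and MacDonald, but MACEY becomes
MacEy and DESILVA becomes Desilva, and a person who spells the name Macdonald
cannot be handled by any rule of this kind.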
My solution was to build a macro that converted words in the following
order. If the word was not converted in a step then the next step was
performed:
1) A custom dictionary which held exceptions based on a key (say the
employment number). This handled the Macdonald or MacDonald situation, and
also special abbreviations, particularly in addresses, for example 'HR
Dept' not 'Hr Dept', and 'LAX' (for Los Angeles) not 'Lax'.
2) Also in that dictionary, non-standard words were held, e.g. deSilva,
Macey (exception to the Mac<Capital> rule).
3) Rules applied, e.g. if the first 2 letters are Mc then the third letter
is capitalised.
4) Capitalise the first letter, lowercase the rest of the word.
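The order of steps above can be sketched roughly as follows. This is Python
used only to illustrate the cascade, not the SAS macro itself. The dictionary
contents, the employee key E1001, and the function names are all invented for
the example.

```python
# Step 1: exceptions tied to a person's key (hypothetical employee number).
EXCEPTIONS_BY_KEY = {("E1001", "MACDONALD"): "Macdonald"}

# Steps 1-2: abbreviations and non-standard words that apply to everyone.
EXCEPTIONS_BY_WORD = {
    "HR": "HR",
    "LAX": "LAX",
    "DESILVA": "deSilva",
    "MACEY": "Macey",
}

# Step 3: prefix generalisations (assumed here to be Mac and Mc only).
PREFIXES = ("Mac", "Mc")

def convert_word(word, key=None):
    """Return the first conversion that applies, trying steps 1 to 4 in order."""
    upper = word.upper()
    if (key, upper) in EXCEPTIONS_BY_KEY:
        return EXCEPTIONS_BY_KEY[(key, upper)]
    if upper in EXCEPTIONS_BY_WORD:
        return EXCEPTIONS_BY_WORD[upper]
    for prefix in PREFIXES:
        n = len(prefix)
        if upper.startswith(prefix.upper()) and len(upper) > n:
            return prefix + upper[n] + upper[n + 1:].lower()
    # Step 4: plain proper case.
    return upper[:1] + upper[1:].lower()

def proper_case(text, key=None):
    """Convert every blank-separated word in a string."""
    return " ".join(convert_word(word, key) for word in text.split())

print(proper_case("ANNE MACDONALD", key="E1001"))
print(proper_case("HR DEPT"))
```

Because each step returns as soon as it finds a match, the dictionary entries
always win over the prefix rules, which is what lets Macey and a keyed
Macdonald escape the Mac rule.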
Obviously there was a lot of processing involved as it was necessary to not
only process each word in a string, but also down to each letter. However
it was particularly important to 'get it right' and so we were prepared to
go to these lengths.
The result was very good, and this macro was used not only for mail outs,
but also for interfacing data to other systems which held data in
upper/lowercase.
HTH
Mark
+------------------------------------------+--------------------------+
| Mark Bodt | |
| Sunken Treasure Software Systems Ltd | SAS Institute(NZ) Ltd. |
| Specialising in SAS(R) Software | Quality Partner. |
| Consultancy in the Asia / Pacific Region | |
+------------------------------------------+--------------------------+
| PO Box 9472, Marion Square, Wellington, New Zealand |
| Ph (025) 725 386 Fax +64 4 385 8670 Email: markbodt@stss.co.nz |
+---------------------------------------------------------------------+