LISTSERV at the University of Georgia
Date:   Mon, 14 Mar 2011 14:09:56 -0700
Reply-To:   Wei Wang <weiwangum@YAHOO.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Wei Wang <weiwangum@YAHOO.COM>
Subject:   Re: Find Different Names
Comments:   To: Joe Matise <snoopy369@GMAIL.COM>
In-Reply-To:   <AANLkTi==kF=MJRYm9cwM4-Tohu--pCa1ShUamvjytdQA@mail.gmail.com>
Content-Type:   text/plain; charset=iso-8859-1

Thanks Joe. In this case, if any letters differ between two names, they are treated as different names. For instance, Danny and Dan are two different names.

Wei
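Under that exact-match rule, the flag can be computed in a single DATA step. The following is a minimal sketch, not a tested solution: it assumes the `have` data set from the original question, and the names `want`, `names`, `firstname`, and `i` are made up for illustration. It also assumes that the comparison should be case-sensitive and that leading blanks kept by `$char8.` should be ignored.

```sas
/* Sketch: flag = 1 when a row holds two or more distinct non-missing names. */
/* Assumes the HAVE data set from the original question.                     */
data want;
  set have;
  array names {3} name1-name3;        /* existing character variables        */
  length firstname $8;                /* first non-missing name on the row   */
  firstname = ' ';
  flag = 0;
  do i = 1 to dim(names);
    if not missing(names{i}) then do;
      /* remember the first non-missing name, compare the rest against it */
      if missing(firstname) then firstname = strip(names{i});
      else if strip(names{i}) ne firstname then flag = 1;
    end;
  end;
  drop i firstname;
run;
```

Comparing every later name with the first non-missing one is enough here, because any difference among the names implies that at least one of them differs from the first. Rows with no names or only one name keep flag = 0.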

--- On Mon, 3/14/11, Joe Matise <snoopy369@GMAIL.COM> wrote:

From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: Find Different Names
To: SAS-L@LISTSERV.UGA.EDU
Date: Monday, March 14, 2011, 3:53 PM

Wei, This is a fairly trivial task if you're just comparing identical names (Joe, Joe), and even if it's just shortenings (Joe, Joey; Dan, Danny). However, Sean-Shawn begins to add a significant layer of complexity to it: how would you tell a computer algorithmically that those are identical? The same goes for "Marge/Margaret", "John/Jack", "Chris/Krystal", etc., particularly when some pairs are sometimes valid and sometimes invalid (Chris/Krystal, for example). Some of the luminaries on SAS-L have addressed this sort of issue in the past, and hopefully they can assist you (or you can google the listserv's history for 'fuzzy matching'). In general, I would say that the easiest way to approach this is one of the SOUNDEX-type functions, but any of those will carry some risk of false positives and false negatives; the only truly reliable way to do this is to make the list by hand.

-Joe
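For readers who want to try the SOUNDEX-type approach mentioned above, here is a rough sketch. The data set `pairs` and the variables `a`, `b`, `same_soundex`, and `edit_cost` are hypothetical names chosen for illustration. `SOUNDEX` and `COMPGED` are standard SAS functions, but any cutoff applied to the `COMPGED` cost would be a judgment call, and neither function can be expected to pair nicknames such as John/Jack reliably.

```sas
/* Sketch: compare candidate name pairs phonetically and by edit cost. */
data pairs;
  length a b $8;
  input a $ b $;
  /* 1 when both names produce the same SOUNDEX code, else 0 */
  same_soundex = (soundex(a) = soundex(b));
  /* generalized edit distance; a lower cost means more similar spellings */
  edit_cost = compged(a, b);
  datalines;
joe joey
danny dan
sean shawn
john jack
;
run;

proc print data=pairs noobs;
run;
```

Reviewing the printed pairs by hand is still advisable, since the SOUNDEX algorithm is designed around English pronunciation and can both over-match and under-match.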

On Mon, Mar 14, 2011 at 3:39 PM, Wei Wang <weiwangum@yahoo.com> wrote:

> Hi guys,
>
> data have;
> infile datalines missover;
> input id name1 $char8. name2 $char8. name3 $char8.;
> datalines;
> 1 mike    mike    mike
> 2 joe     joe     joey
> 3         andy    andy
> 4
> 5         danny   dan
> 6 sean    shawn
> 7 aaron
> ;
> run;
>
> I want to create a flag variable indicating different non-missing
> names. Here is the data I need.
>
> id name1  name2   name3   flag
> 1 mike    mike    mike    0
> 2 joe     joe     joey    1
> 3         andy    andy    0
> 4                         0
> 5         danny   dan     1
> 6 sean    shawn           1
> 7 aaron                   0
>
> Thanks,
> Wei

