LISTSERV at the University of Georgia
Date:   Mon, 14 Mar 2011 16:26:06 -0500
Reply-To:   Yu Zhang <zhangyu05@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Yu Zhang <zhangyu05@GMAIL.COM>
Subject:   Re: Find Different Names
Comments:   To: Wei Wang <weiwangum@yahoo.com>
In-Reply-To:   <882042.95989.qm@web114719.mail.gq1.yahoo.com>
Content-Type:   text/plain; charset=ISO-8859-1

Given your data, I think the code below will do what you want. I changed the INFILE option from missover to truncover.

data have;
  infile datalines truncover;
  input id name1 $char8. name2 $char8. name3 $char8.;
  flag=(name1 max name2 max name3)=(name1 min name2 min name3);
datalines;
1 mike    mike    mike
2 joe     joe     joey
3 andy    andy
5 danny   dan
6 sean    shawn
7 aaron
;
run;

proc print;
run;
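If the flag should instead be 1 only when two non-missing names differ, as in the table in the original question quoted below, one possible variant is sketched here. It has not been run, so treat it as a starting point. It assumes the have data set from the step above; the output data set name want and the helper names nm, ref and i are my own choices.

```sas
data want;
  set have;
  array nm{3} $ name1-name3;     /* the three name columns */
  length ref $8;                 /* first non-missing name seen in this row */
  flag = 0;
  do i = 1 to dim(nm);
    if not missing(nm{i}) then do;
      if missing(ref) then ref = nm{i};        /* remember the first name */
      else if nm{i} ne ref then flag = 1;      /* a later name differs */
    end;
  end;
  drop i ref;
run;

proc print data=want;
run;
```

Blank names are skipped, so a row with a single name, or with no names at all, keeps flag=0.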

On Mon, Mar 14, 2011 at 4:09 PM, Wei Wang <weiwangum@yahoo.com> wrote:

> Thanks Joe. In this case, as long as there are any different
> letters between two names they are treated as different names. For
> instance, Danny and Dan are two different names.
>
> Wei
>
> --- On Mon, 3/14/11, Joe Matise <snoopy369@GMAIL.COM> wrote:
>
> From: Joe Matise <snoopy369@GMAIL.COM>
> Subject: Re: Find Different Names
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Monday, March 14, 2011, 3:53 PM
>
> Wei,
> This is a fairly trivial task if you're just comparing identical (Joe, Joe),
> and even if it's just shortenings (Joe, Joey; Dan, Danny). However,
> Sean-Shawn begins to add a significant layer of complexity to it - how would
> you tell a computer algorithmically those are identical? Same goes for
> "Marge/Margaret", "John/Jack", "Chris/Krystal", etc., particularly when you
> have some that are sometimes valid and sometimes invalid (Chris/Krystal for
> example). Some of the luminaries on SAS-L have addressed this sort of issue
> in the past, and hopefully they can assist you (or you can google the
> listserv's history for 'fuzzy matching'). In general, I would say that the
> easiest way to approach this is one of the SOUNDEX type functions, but any
> of those will have some risk of false positives and false negatives - the
> only true way to do this is to make the list by hand.
>
> -Joe
>
> On Mon, Mar 14, 2011 at 3:39 PM, Wei Wang <weiwangum@yahoo.com> wrote:
>
> > Hi guys,
> >
> > data have;
> > infile datalines missover;
> > input id name1 $char8. name2 $char8. name3 $char8.;
> > datalines;
> > 1 mike    mike    mike
> > 2 joe     joe     joey
> > 3 andy    andy
> > 4
> > 5 danny   dan
> > 6 sean    shawn
> > 7 aaron
> > ;
> > run;
> >
> > I want to create a flag variable indicating different non-missing
> > names. Here is the data I need.
> >
> > id name1 name2 name3 flag
> > 1  mike  mike  mike  0
> > 2  joe   joe   joey  1
> > 3  andy  andy        0
> > 4                    0
> > 5  danny dan         1
> > 6  sean  shawn       1
> > 7  aaron             0
> >
> > Thanks,
> > Wei

