| Date: | Mon, 14 Mar 2011 16:26:06 -0500 |
| Reply-To: | Yu Zhang <zhangyu05@GMAIL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Yu Zhang <zhangyu05@GMAIL.COM> |
| Subject: | Re: Find Different Names |
|
| In-Reply-To: | <882042.95989.qm@web114719.mail.gq1.yahoo.com> |
| Content-Type: | text/plain; charset=ISO-8859-1 |
|---|
Given your data, I think the code below would do what you want. I changed the
INFILE option from missover to truncover.
data have;
infile datalines truncover;
input id name1 $char8. name2 $char8. name3 $char8.;
flag=(name1 max name2 max name3)=(name1 min name2 min name3);
datalines;
1 mike mike mike
2 joe joe joey
3 andy andy
5 danny dan
6 sean shawn
7 aaron
;
run;
proc print;run;
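If blank names should be ignored, so that flag is 1 only when two non-missing names differ (as in the requested output), a loop over an array is one way to sketch it. Note that the one-line comparison above sets flag to 1 for id 1, where all three names agree, so its sense is the reverse of the requested flag. The data set name want, the array name nm, and the helper variables i and first_nm are made up for illustration. The comparison is case-sensitive, and strip() is used on the assumption that leading or trailing blanks should not count as a difference.

```sas
data want;
  set have;
  array nm {*} name1-name3;      /* the existing character name variables */
  length first_nm $8;            /* first non-missing name seen on this row */
  flag = 0;
  first_nm = ' ';
  do i = 1 to dim(nm);
    if not missing(nm{i}) then do;
      /* remember the first non-missing name, compare every later one to it */
      if missing(first_nm) then first_nm = strip(nm{i});
      else if strip(nm{i}) ne first_nm then flag = 1;
    end;
  end;
  drop i first_nm;
run;

proc print data=want; run;
```

A row whose names are all missing keeps flag=0, because the loop never finds a second name to compare against.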
On Mon, Mar 14, 2011 at 4:09 PM, Wei Wang <weiwangum@yahoo.com> wrote:
> Thanks Joe. In this case, as long as there are any different
> letters between two names they are treated as different names. For
> instance, Danny and Dan are two different names.
>
> Wei
>
> --- On Mon, 3/14/11, Joe Matise <snoopy369@GMAIL.COM> wrote:
>
>
> From: Joe Matise <snoopy369@GMAIL.COM>
> Subject: Re: Find Different Names
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Monday, March 14, 2011, 3:53 PM
>
>
> Wei,
> This is a fairly trivial task if you're just comparing identical (Joe,
> Joe),
> and even if it's just shortenings (Joe, Joey; Dan, Danny). However,
> Sean-Shawn begins to add a significant layer of complexity to it - how
> would
> you tell a computer algorithmically those are identical? Same goes for
> "Marge/Margaret", "John/Jack", "Chris/Krystal", etc., particularly when you
> have some that are sometimes valid and sometimes invalid (Chris/Krystal for
> example). Some of the luminaries on SAS-L have addressed this sort of
> issue
> in the past, and hopefully they can assist you (or you can google the
> listserv's history for 'fuzzy matching'). In general, I would say that the
> easiest way to approach this is one of the SOUNDEX type functions, but any
> of those will have some risk of false positives and false negatives - the
> only true way to do this is to make the list by hand.
>
> -Joe
>
> On Mon, Mar 14, 2011 at 3:39 PM, Wei Wang <weiwangum@yahoo.com> wrote:
>
> > Hi guys,
> >
> > data have;
> > infile datalines missover;
> > input id name1 $char8. name2 $char8. name3 $char8.;
> > datalines;
> > 1 mike mike mike
> > 2 joe joe joey
> > 3 andy andy
> > 4
> > 5 danny dan
> > 6 sean shawn
> > 7 aaron
> > ;
> > run;
> >
> > I want to create a flag variable indicating different non-missing
> > names. Here is the data I need.
> >
> > id name1 name2 name3 flag
> > 1 mike mike mike 0
> > 2 joe joe joey 1
> > 3 andy andy 0
> > 4 0
> > 5 danny dan 1
> > 6 sean shawn 1
> > 7 aaron 0
> >
> > Thanks,
> > Wei
> >
> >
> >
> >
>
>
>
>
>