Date: Fri, 13 Apr 2007 12:01:01 -0400
Reply-To: Chang Chung <chang_y_chung@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Chang Chung <chang_y_chung@HOTMAIL.COM>
Subject: Re: A DATASET QUESTION - UPDATED PROBLEM - NOT A CLEANED DATA
On Fri, 13 Apr 2007 11:27:38 -0400, data _null_; <datanull@GMAIL.COM> wrote:
>Seems like a simpler regex would suffice. But I am surely overlooking
>something.
>
>data work.one;
> retain rx;
> if _n_ then rx = prxparse('(\bONGOING|ONGO|O/G\b)');
> input subject var &$50.;
> OnGoing = (prxmatch(rx,var) gt 0);
...
Hi, data _null_:
Nice! I forgot about \b! The technical definition of the word boundary (\b)
in perl regex is that it matches between a \w and a \W, where the imaginary
positions before the beginning and after the end of the string count as \W.
(http://search.cpan.org/dist/perl/pod/perlre.pod)
Now since \w is equivalent to [_A-Za-z0-9], this is not exactly the same as
mine. For example, yours should not match, say, "_ONGO_". But if you try the
above, it does match!! This is because alternation (|) binds more loosely
than \b, so the bare ONGO in the middle is not anchored at all. The prx
should really be:
Yours as it is now: (\bONGOING|ONGO|O/G\b)
Probably your intention: \b(ONGOING|ONGO|O/G)\b
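To make the difference concrete, here is a minimal sketch in Python rather
than SAS, on the assumption that Python's re module treats \b and | the same
way as the perl regex for plain ASCII input like this. The names LOOSE, TIGHT
and matches are mine, for illustration only, and the sample strings are made
up.

```python
import re

# Pattern as posted: \b attaches only to the first and last alternatives,
# so the middle alternative (ONGO) is not anchored at all.
LOOSE = re.compile(r"(\bONGOING|ONGO|O/G\b)")

# Pattern as probably intended: \b applies to every alternative.
TIGHT = re.compile(r"\b(ONGOING|ONGO|O/G)\b")


def matches(rx, text):
    """Return True if rx matches anywhere in text (like prxmatch(...) gt 0)."""
    return rx.search(text) is not None


# Hypothetical sample values, not taken from the original data.
for text in ["_ONGO_", "ONGO", "STILL O/G", "ONGOINGS"]:
    print(f"{text!r:12} loose={matches(LOOSE, text)} tight={matches(TIGHT, text)}")
```

Both patterns agree on "ONGO" and "STILL O/G"; they differ on "_ONGO_" and
"ONGOINGS", which only the ungrouped version accepts.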
Well, while we are at it: I think you meant "if _n_ = 1" instead of "if _n_",
since the latter is true all the time (_n_ starts at 1 and is never zero).
Also, mine uses the i switch of the perl regex to ignore case, but yours
doesn't. :-)
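The effect of ignoring case can be sketched the same way. This is again
Python rather than SAS, with re.IGNORECASE standing in for the i switch
(which in prxparse goes after the closing delimiter, as in '/pattern/i').
The names CASE_SENSITIVE, IGNORE_CASE and found, and the sample strings, are
mine.

```python
import re

# Grouped pattern without and with case-insensitive matching.
CASE_SENSITIVE = re.compile(r"\b(ONGOING|ONGO|O/G)\b")
IGNORE_CASE = re.compile(r"\b(ONGOING|ONGO|O/G)\b", re.IGNORECASE)


def found(rx, text):
    """Return True if rx matches anywhere in text."""
    return rx.search(text) is not None


# Hypothetical mixed-case values, not taken from the original data.
for text in ["Ongoing at visit 3", "o/g", "ONGOING"]:
    print(f"{text!r:22} sensitive={found(CASE_SENSITIVE, text)} "
          f"ignore_case={found(IGNORE_CASE, text)}")
```

Only the all-caps value matches without the flag; with it, all three do.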
Cheers,
Chang