Date: Fri, 8 Dec 2006 12:03:14 -0800
Reply-To: jlgoldberg@BRICK.NET
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jonathan Goldberg <jlgoldberg@BRICK.NET>
Organization: http://groups.google.com
Subject: Re: finding a string
In-Reply-To: <1165574706.266949.219190@16g2000cwy.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"
Alves:
THIS is a job for REGULAR EXPRESSIONS!
SAS supports Perl regular expressions through the PRX (Perl Regular
EXpression) group of functions and call routines.
If you're not familiar with them you're probably working too hard.
In this case, a regular expression for a date might be:
\d{1,2}[-/]+\d{1,2}[-/]+\d{2,4}
which means:
\d = any digit
[-/] is a class containing the date separators - and /
+ = occurs one or more times
{number,number} = occurs at least the lower number and at most the higher number of times
so, the whole expression means
one or two digits, followed by one or more - or /, followed by one or two
digits, followed by one or more - or /, followed by two to four digits
which describes one of your dates closely enough.
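A quick way to try the pattern is to run it over the four sample rows from the question. This is only a sketch: the column positions in the input statement and the variable name pos are my assumptions, and the / inside the class is escaped because SAS also uses / to delimit the pattern.

```sas
/* Sketch: test the pattern against the sample rows from the question. */
data _null_;
   infile datalines truncover;
   input code $ 1-4 desc $ 6-60;   /* assumed fixed columns for the sample */
   /* prxmatch returns the start position of the first match, 0 if none */
   pos = prxmatch('/\d{1,2}[-\/]+\d{1,2}[-\/]+\d{2,4}/', desc);
   put code= pos=;
datalines;
A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
A102 AAAAAA2 STOP - 4-7-06
A103 AAAAAA3 04-7-06 PRODUCTION STOP
;
```

A value of pos greater than zero in the log means the row contains something that looks like a date.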
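Put it in parentheses to make a capture buffer, compile it with the
prxparse function, run prxmatch against the string, and then use the
prxposn function to return the matched text (free the compiled pattern
with the call prxfree routine when you're done). In SAS the pattern goes
between / delimiters, so escape the / inside the class as \/. Or, use
prxmatch to find the string location and other functions to extract and
manipulate it.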
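The steps might be sketched like this. The data set names have and want and the variables re and datestr are placeholders of mine, and I am assuming the text variable is called description as in the question.

```sas
/* Sketch only: HAVE, WANT, RE and DATESTR are assumed names. */
data want;
   set have end=last;
   length datestr $12;   /* long runs of separators would be truncated */
   retain re;
   /* Compile once; the parentheses create capture buffer 1. */
   if _n_ = 1 then
      re = prxparse('/(\d{1,2}[-\/]+\d{1,2}[-\/]+\d{2,4})/');
   /* prxposn reads the result of the most recent match, so match first. */
   if prxmatch(re, description) > 0 then
      datestr = prxposn(re, 1, description);
   /* Release the compiled pattern after the last observation. */
   if last then call prxfree(re);
   drop re;
run;
```

Rows with no date keep a blank datestr, so they are easy to pick out afterwards.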
Jonathan
alves wrote:
> Hi everyone.
>
> Quick question.
>
> I have a dataser of approx. 10.000.000 observations that has 2
> variables. CODE and DESCRIPTION. do not know why, but the person who
> created this dataset instead of creating a new variable to extra
> information, just added it to the DESCRIPTION. I have a list of the
> most common strings that were added and I manage to filter them out. My
> problem is with dates. An example.
>
> CODE DESC
> A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
> A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
> A102 AAAAAA2 STOP - 4-7-06
> A103 AAAAAA3 04-7-06 PRODUCTION STOP
>
> This is a simplified version, AAAAAA can be any string (a product
> name).
>
> the date can basically appear in any format and anywhere!! after
> anything... the only reference point I have is the "/" or "-" ...
>
> So after looking to this mess, I want to do two things. 1) Clean it; 2)
> Kill the person who did it, after a long torture!!!!
>
> Thanks in advance.