LISTSERV at the University of Georgia
Date:         Fri, 8 Dec 2006 12:03:14 -0800
Reply-To:     jlgoldberg@BRICK.NET
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jonathan Goldberg <jlgoldberg@BRICK.NET>
Organization: http://groups.google.com
Subject:      Re: finding a string
Comments: To: sas-l@uga.edu
In-Reply-To:  <1165574706.266949.219190@16g2000cwy.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"

Alves:

THIS is a job for REGULAR EXPRESSIONS!

SAS supports Perl regular expressions through the PRX (Perl Regular eXpression) group of functions and call routines, which is where their use is documented. If you're not familiar with them you're probably working too hard.

In this case, a regular expression for date might be:

\d{1,2}[-/]+\d{1,2}[-/]+\d{2,4}

which means:

\d = any digit
[-/] = a class containing the date separators - and /
+ = occurs one or more times
{number,number} = occurs at least the lower number and at most the higher number of times

so, the whole expression means one or two digits, followed by one or more - or /, followed by one or two digits, followed by one or more - or /, followed by two to four digits

which describes one of your dates closely enough.
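A quick way to try the pattern is a throwaway DATA step like the sketch below. It assumes the usual SAS convention of writing the pattern between / delimiters, which means the slash inside the class has to be escaped as \/. The third test string is made up for illustration and is not from your data.

```sas
data _null_;
   length text $60;
   /* Compile the pattern; the / inside the class is escaped because / delimits the pattern */
   rx = prxparse('/\d{1,2}[-\/]+\d{1,2}[-\/]+\d{2,4}/');
   /* Two descriptions from the question plus one hypothetical string with no date */
   do text = 'AAAAAA0 PRODUCTION - STOP 4/06/2006',
             'AAAAAA2 STOP - 4-7-06',
             'AAAAAA9 NO DATE HERE';
      start = prxmatch(rx, text);   /* position of the first match, 0 if none */
      put text= start=;
   end;
run;
```

The log shows the starting position of the date in each string, or 0 where nothing matched.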

Put it in parentheses, use the prxparse function to compile it, call prxmatch to run the match, and then use the prxposn function to return the string value (free it when you're done with the prxfree call routine). Or, use prxmatch to find the string location and other functions to extract and manipulate it.
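Here is a minimal sketch of that first approach, not a finished solution. The dataset names HAVE and WANT, the variable DATE_TEXT, and its length are my assumptions; the sample rows are the ones from your message. Note that prxmatch has to run before prxposn can return the captured text.

```sas
/* Sample data copied from the question; HAVE is a hypothetical dataset name */
data have;
   infile datalines truncover;
   input code $4. +1 desc $60.;
   datalines;
A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
A102 AAAAAA2 STOP - 4-7-06
A103 AAAAAA3 04-7-06 PRODUCTION STOP
;
run;

data want;
   set have end=last;
   length date_text $20;          /* assumed length, roomy enough for repeated separators */
   retain rx;
   /* Compile once; the parentheses make the whole date capture buffer 1 */
   if _n_ = 1 then rx = prxparse('/(\d{1,2}[-\/]+\d{1,2}[-\/]+\d{2,4})/');
   /* prxmatch runs the match, prxposn then returns the text of buffer 1 */
   if prxmatch(rx, desc) then date_text = prxposn(rx, 1, desc);
   /* Release the compiled pattern after the last observation */
   if last then call prxfree(rx);
   drop rx;
run;
```

DATE_TEXT stays blank for any description in which no date is found, so those rows are easy to pull out and inspect separately.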

Jonathan

alves wrote:
> Hi everyone.
>
> Quick question.
>
> I have a dataser of approx. 10.000.000 observations that has 2
> variables. CODE and DESCRIPTION. do not know why, but the person who
> created this dataset instead of creating a new variable to extra
> information, just added it to the DESCRIPTION. I have a list of the
> most common strings that were added and I manage to filter them out. My
> problem is with dates. An example.
>
> CODE DESC
> A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
> A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
> A102 AAAAAA2 STOP - 4-7-06
> A103 AAAAAA3 04-7-06 PRODUCTION STOP
>
> This is a simplified version, AAAAAA can be any string (a product
> name).
>
> the date can basically appear in any format and anywhere!! after
> anything... the only reference point I have is the "/" or "-" ...
>
> So after looking to this mess, I want to do two things. 1) Clean it; 2)
> Kill the person who did it, after a long torture!!!!
>
> Thanks in advance.

