LISTSERV at the University of Georgia
Date:         Fri, 8 Dec 2006 09:16:54 -0500
Reply-To:     "data _null_;" <datanull@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "data _null_;" <datanull@GMAIL.COM>
Subject:      Re: finding a string
Comments: To: alves <alves.paulo@gmail.com>
In-Reply-To:  <1165574706.266949.219190@16g2000cwy.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

While I like the ideas already proposed for fishing out the date, SAS does provide ways to locate strings of this type directly. I used the old RX functions because I'm using V8.2 and have the documentation handy.

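data work.parts;
   infile cards eof=eof;
   /* Compile the pattern once: 1-2 digits, a dash or slash, 1-2 digits, */
   /* a dash or slash, then a 2- or 4-digit year.                        */
   rx = rxparse("$d[$d] $'-/' $d[$d] $'-/' $d$d[$d$d]");
   do while(1);
      input CODE:$4. DESC&$40.;
      s = 0;
      l = 0;
      /* Reset DATE so a record without a date does not inherit the      */
      /* previous value (the loop runs inside one data step iteration).  */
      date = .;
      /* S receives the start position of the match, L its length. */
      call rxsubstr(rx, desc, s, l);
      if s then date = input(substr(desc, s, l), ddmmyy10.);
      output;
   end;
   return;
eof:
   /* Release the compiled pattern at end of input. */
   call rxfree(rx);
   stop;
   format date ddmmyy10.;
   cards;
A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
A102 AAAAAA2 STOP - 4-7-06
A103 AAAAAA3 04-7-06 PRODUCTION STOP
;;;;
   run;
proc print;
   run;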
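For anyone on SAS 9 or later, the same search can probably be written with the Perl regular expression (PRX) functions instead of the older RX ones. What follows is only a sketch and was not part of the run above: the data set name work.parts_prx and the Perl pattern are my own assumptions, meant to mirror the RX pattern (one or two digits, a "-" or "/", one or two digits, a "-" or "/", then a two- or four-digit year).

```sas
data work.parts_prx;                 /* hypothetical data set name */
   retain rx;
   /* Compile the Perl pattern once, on the first iteration only. */
   if _n_ = 1 then
      rx = prxparse('/\d{1,2}[-\/]\d{1,2}[-\/]\d{2}(\d{2})?/');
   input CODE:$4. DESC&$40.;
   /* S is the start of the first match (0 if none), L its length. */
   call prxsubstr(rx, desc, s, l);
   if s then date = input(substr(desc, s, l), ddmmyy10.);
   format date ddmmyy10.;
   drop rx s l;
   cards;
A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
A102 AAAAAA2 STOP - 4-7-06
A103 AAAAAA3 04-7-06 PRODUCTION STOP
;;;;
   run;
```

Because this version relies on the implicit data step loop rather than DO WHILE(1), DATE is reset to missing on every record, so a description without a date should come out with a missing DATE. As in the RX version, the DDMMYY10. informat assumes day-first dates; if the source is month-first, MMDDYY10. would be the one to use.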

On 12/8/06, alves <alves.paulo@gmail.com> wrote:
> Hi everyone.
>
> Quick question.
>
> I have a dataset of approx. 10.000.000 observations that has 2
> variables, CODE and DESCRIPTION. I do not know why, but the person who
> created this dataset, instead of creating a new variable for the extra
> information, just added it to the DESCRIPTION. I have a list of the
> most common strings that were added and I managed to filter them out. My
> problem is with dates. An example:
>
> CODE DESC
> A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
> A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
> A102 AAAAAA2 STOP - 4-7-06
> A103 AAAAAA3 04-7-06 PRODUCTION STOP
>
> This is a simplified version; AAAAAA can be any string (a product
> name).
>
> The date can basically appear in any format and anywhere, after
> anything... the only reference point I have is the "/" or "-" ...
>
> So after looking at this mess, I want to do two things. 1) Clean it; 2)
> Kill the person who did it, after a long torture!!!!
>
> Thanks in advance.
>

