Date: Tue, 23 Nov 2004 12:34:46 -0500
Reply-To: Venky Chakravarthy <venky.chakravarthy@PFIZER.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Venky Chakravarthy <venky.chakravarthy@PFIZER.COM>
Subject: Re: SAS Function
On Tue, 23 Nov 2004 10:29:38 -0500, Bosch, Jules <jules.bosch@SPCORP.COM>
wrote:
>SAS V8
>
>I have no documentation at the moment but need to read a string seeking
>date occurrences (format=ddmmmyy or ddmmmyyyy). Often there is more than one
>date in the string. I think the INDEXC function should work but don't know
>if it will locate each occurrence of an excerpt or just the first. Any
>suggestions would be greatly appreciated?
>
>TIA,
>
>Jules Bosch
>
Hi Jules,
If you don't already know it, the online doc is available for free from the
SAS site.
Your problem as stated is not as simple as it first appears. Advance
knowledge of the range of dates can make this problem easier to solve.
Another useful piece of prior information would be the maximum number of
date values in a single string.
One of the difficulties that I see is that you may have a string such
as "01jan2000" in date9 format, which is equivalent to "01jan00" in the
date7 format. However, the first seven characters of the date9 string,
"01jan20", are themselves a legitimate date7 value and might be double
counted. So this needs to be accounted for in any solution.
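To make the overlap concrete, here is a small sketch (the variable names
are made up for illustration) showing that INDEX finds the date7 value for
01jan2020 inside a string that only contains 01jan2000:

```sas
data _null_ ;
  s    = upcase ( "blah01jan2000blah" ) ;
  yyyy = put ( "01jan2000"d , date9. ) ;  /* 01JAN2000 */
  yy   = put ( "01jan2020"d , date7. ) ;  /* 01JAN20   */
  pos9 = index ( s , yyyy ) ;             /* position of the real date9 value    */
  pos7 = index ( s , yy ) ;               /* false hit on its first 7 characters */
  put pos9= pos7= ;                       /* writes pos9=5 pos7=5 to the log     */
run ;
```

Both searches report the same starting position, which is why a date9 match
has to be taken out of the string before any date7 search is allowed to see
it. In a loop that walks the dates in ascending order, the same overlap could
also affect a date9 value later than 2020 (for example 01jan2030), because
01JAN20 is tried first. Running all of the date9 searches before any of the
date7 searches would be one way around that.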
Here is a brute force approach that limits the search to dates from
01Jan1960 through 19Feb2042. These limits can easily be adjusted. I think
the solution can be sped up with some more thought, but this should get
you started. I have not bothered to drop the junk variables.
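data test ;
  array read ( 10 ) $9 ;          /* up to 10 dates found per string */
  length yy $7 yyyy $9 ;
  input string1 $1-35 ;
  string2 = string1 ;             /* working copy; matches get blanked out */
  found = 0 ;
  do i = "01jan1960"d to "19feb2042"d ;
    yy = put ( i , date7. ) ;     /* e.g. 01JAN00 */
    yyyy = put ( i , date9. ) ;   /* e.g. 01JAN2000 */
    /* look for the 9-character form first */
    c1 = index(upcase(string2),yyyy) ;
    check1 = c1>0 ;
    if check1 then do ;
      found = sum(found,check1) ;
      read(found) = yyyy ;
      substr(string2,c1,9)="" ;   /* blank out the match so it is not counted again */
    end ;
    /* then the 7-character form; c2 is the match position, 0 if none */
    c2 = index(upcase(string2),yy) ;
    check2 = c2>0 ;
    if check2 then do ;
      found = sum(found,check2) ;
      read(found) = yy ;
      substr(string2,c2,7)="" ;
    end ;
  end ;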
cards ;
blah01JAN2000blahblah01jan00
01dec2001helloKitty
je29feb2000ciwfj28feb2000
bl01JAN2000bla21mar1968hblah01jan00
;
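If you want to look at the results without the junk variables, something
along these lines might do. This is only a sketch: the data set name DATES
and the DT array are names I made up, and the two-digit years are resolved
by whatever YEARCUTOFF= setting is in effect at your site.

```sas
data dates ( keep = string1 found dt1-dt10 ) ;
  set test ;
  array read ( 10 ) read1-read10 ;  /* date strings found by the step above   */
  array dt ( 10 ) ;                 /* hypothetical numeric dates dt1-dt10    */
  do j = 1 to found ;
    /* the DATE9. informat reads both ddmmmyy and ddmmmyyyy */
    dt(j) = input ( read(j) , date9. ) ;
  end ;
  format dt1-dt10 date9. ;
run ;

proc print data = dates ;
run ;
```

Converting to numeric SAS dates also makes it easy to sort or compare the
values afterwards.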