Date: Fri, 8 Dec 2006 12:23:46 -0500
Reply-To: Venky Chakravarthy <swovcc@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Venky Chakravarthy <swovcc@HOTMAIL.COM>
Subject: Re: finding a string
On Fri, 8 Dec 2006 02:45:06 -0800, alves <alves.paulo@GMAIL.COM> wrote:
>Hi everyone.
>
>Quick question.
>
>I have a dataset of approx. 10.000.000 observations that has 2
>variables, CODE and DESCRIPTION. I do not know why, but the person who
>created this dataset, instead of creating a new variable for the extra
>information, just added it to the DESCRIPTION. I have a list of the
>most common strings that were added and I managed to filter them out. My
>problem is with dates. An example:
>
>CODE DESC
>A100 AAAAAA0 PRODUCTION - STOP 4/06/2006
>A101 AAAAAA1 NO LONGER PRODUCED 04/7/06
>A102 AAAAAA2 STOP - 4-7-06
>A103 AAAAAA3 04-7-06 PRODUCTION STOP
>
>This is a simplified version, AAAAAA can be any string (a product
>name).
>
>the date can basically appear in any format and anywhere!! after
>anything... the only reference point I have is the "/" or "-" ...
>
>So after looking at this mess, I want to do two things: 1) Clean it; 2)
>Kill the person who did it, after a long torture!!!!
>
>Thanks in advance.
I recommend (1) over (2). Some clarification is required on what you mean
by "clean". How would you like the DESCRIPTION field to be cleaned? Should
it be split into separate variables, so that it reads as:
DESCRIPTION1   DESCRIPTION2        DESCRIPTION3
AAAAAA0        PRODUCTION - STOP   4/06/2006
AAAAAA3        PRODUCTION STOP     04-7-06
I suspect that the PRX functions would be your best bet, and for that
solution I readily volunteer David :-) and Evil Petting Zoo :-). If you
clarify soon, I may even make my own crude attempt.
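
Pending that clarification, here is a rough sketch of the pattern-matching
idea. It is written in Python rather than SAS only so that it can be run
anywhere; the regular expression is Perl-style, so it should carry over to
the PRX functions with little change, though that is untested here. The
names DATE_RE and split_description are made up for illustration. It
assumes a date is one or two digits, a "/" or "-", one or two digits, the
same separator again, and a two- or four-digit year. Anything else (month
names, mixed separators, dots) is not handled.

```python
import re

# Assumption: day/month are 1-2 digits, both separators are the same
# character ("/" or "-"), and the year has 4 or 2 digits.
DATE_RE = re.compile(r"\b(\d{1,2})([/-])(\d{1,2})\2(\d{4}|\d{2})\b")


def split_description(desc):
    """Return (description_without_date, date_string_or_None)."""
    match = DATE_RE.search(desc)
    if match is None:
        # No date found: only normalize the whitespace.
        return " ".join(desc.split()), None
    # Cut the date out and keep whatever surrounds it.
    remainder = desc[:match.start()] + " " + desc[match.end():]
    # Drop a dangling "-" left at the end, e.g. "STOP - " -> "STOP".
    remainder = re.sub(r"\s*-\s*$", "", remainder)
    return " ".join(remainder.split()), match.group(0)


if __name__ == "__main__":
    samples = [
        "AAAAAA0 PRODUCTION - STOP 4/06/2006",
        "AAAAAA1 NO LONGER PRODUCED 04/7/06",
        "AAAAAA2 STOP - 4-7-06",
        "AAAAAA3 04-7-06 PRODUCTION STOP",
    ]
    for sample in samples:
        print(split_description(sample))
```

On the four sample rows this pulls the date out wherever it sits and
leaves the rest of the text in place. It does not try to decide whether
04/7/06 means April 7 or July 4; that still needs an answer from whoever
knows the data.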
Venky Chakravarthy