Date: Mon, 10 May 2004 20:39:22 -0400
Reply-To: Peter Crawford2 <peter.crawford@BLUEYONDER.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford2 <peter.crawford@BLUEYONDER.CO.UK>
Subject: Re: Can rxparse be useful for this address cleaning work??
If we could use a regular expression as an informat, almost
like an inpicture, it might simplify the implementation of this
solution!
Does anyone think it is worth escalating?
Would regular expression informats make regular expressions easier to use?
HTH
Peter Crawford
On Mon, 8 Mar 2004 22:34:31 -0500, Richard A. DeVenezia
<radevenz@IX.NETCOM.COM> wrote:
>Duck-Hye Yang wrote:
>> Hi,
>> My address data look like this:
>> data one;
>> length line $50;
>> line = "0S 810 SPRING GREEN"; output;
>> line = "0S0 42 PEARL ROAD"; output;
>> line = "0 S 336 EAST STREET"; output;
>> line = "0 SOUTH 531 JEFFERSON"; output;
>> line = "0 S 356 MADISON"; output;
>> line = "1 S 356 MADISON"; output;
>> line = "1 NORTH 356 MADISON"; output;
>> run;
>>
>> My goal is to first combine the three or two components into one
>> component so that the desired output is like the following:
>> "0S810 SPRING GREEN"
>> "0S042 PEARL ROAD"
>> "0S336 EAST STREET"
>> "0SOUTH531 JEFFERSON"
>> "0S356 MADISON"
>> "1S356 MADISON"
>> "1NORTH356 MADISON"
>>
>> For the last 4 days, I have been trying to do this daunting work using
>> rxparse function as shown by Chang Y. Chung.
>>
>> I gave up finally. Can anybody help me with this?
>>
>> Thanks
>> Duckhye
>
>I haven't followed the thread, but the output appears to indicate that you
>want to
>- remove all spaces prior to the last digit encountered
>
>This SAS regular expression does that (well, almost: it retains all
>characters A-z0-9 prior to the last digit found):
>
><sasl:code>
>data one;
> length line $50;
> line = "0S 810 SPRING GREEN"; output;
> line = "0S0 42 PEARL ROAD"; output;
> line = "0 S 336 EAST STREET"; output;
> line = "0 SOUTH 531 JEFFERSON"; output;
> line = "0 S 356 MADISON"; output;
> line = "1 S 356 MADISON"; output;
> line = "1 NORTH 356 MADISON"; output;
>run;
>
>* retain only letters and digits up to and including the last digit found;
>
>data foo;
> set one;
> if _n_ = 1 then do;
>   retain rx;
>   rx = rxparse (" ~'A-z0-9'*<$'A-z0-9'>*~'A-z0-9'*<$D> to =1=2 ");
>   * shorter slightly different alternative;
>   * rx = rxparse (" $W*<$C>*$W*<$D> to =1=2 ");
>   put rx=;
> end;
>
> length scrunch $50;
>
> call rxchange (rx,99,line,scrunch);
>
> put @1 line= @30 scrunch=;
>run;
></sasl:code>
>
>--
>Richard A. DeVenezia
>http://www.devenezia.com/downloads/sas/samples
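
For comparison, the same "remove all spaces prior to the last digit" idea can be sketched with the Perl-style PRX functions instead of rxparse. This is only a sketch, not Richard's method: it assumes a SAS 9 release that provides the PRXCHANGE function, and the names foo_prx and scrunch are chosen here just for illustration. Unlike the rxparse pattern, it deletes only blanks and leaves every other character alone.

```sas
/* Same sample data as in the quoted message. */
data one;
  length line $50;
  line = "0S 810 SPRING GREEN";   output;
  line = "0S0 42 PEARL ROAD";     output;
  line = "0 S 336 EAST STREET";   output;
  line = "0 SOUTH 531 JEFFERSON"; output;
  line = "0 S 356 MADISON";       output;
  line = "1 S 356 MADISON";       output;
  line = "1 NORTH 356 MADISON";   output;
run;

/* Assumption: PRXCHANGE is available (SAS 9 PRX functions).        */
/* The pattern matches a run of blanks only when a digit still      */
/* appears somewhere later in the string (lookahead), so blanks     */
/* after the last digit, including trailing padding, are kept.      */
data foo_prx;
  set one;
  length scrunch $50;
  /* -1 means: apply the substitution to every match in the string. */
  scrunch = prxchange('s/ +(?=.*\d)//', -1, line);
  put @1 line= @30 scrunch=;
run;
```

For the seven sample lines this should give the desired values, for example "0S042 PEARL ROAD" for "0S0 42 PEARL ROAD". Like the rxparse version, it keys on the last digit in the line, so a street name that itself contains a digit would also have its earlier blanks removed.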