LISTSERV at the University of Georgia
Date:         Tue, 11 May 2004 19:17:47 +0100
Reply-To:     peter.crawford@BLUEYONDER.CO.UK
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Crawford2 <peter.crawford@BLUEYONDER.CO.UK>
Subject:      Re: Can rxparse be useful for this address cleaning work??
Comments: cc: jfh@dcn.org
Content-Type: text/plain; charset="utf-8"

It seemed too good a chance to follow up ... here in Montreal, at the Futures Forum. Perhaps there might be some interest in an informat to support regular expressions ...

Meanwhile, if you use, or think you would use, regular expressions (if simpler), it might be worth forwarding the idea to suggest@sas.com, indicating the kind of business benefit you see in supporting regular expression informats.

Model (non-functional ... yet):

proc format ;
   invalue $inpicture '<complex regular expression string>'(regExp) = _same_ ;
run;
data discovery;
   length my_data $30. ;
   infile '<loads of text data file>' ls=32000 ;
   input my_data $inpicture32000. ;
run;
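
Until something like that informat exists, a rough approximation is to read each raw line and pull out the matching text with the Perl-style PRX functions. The sketch below assumes a SAS 9 release that has PRXPARSE and CALL PRXSUBSTR; the pattern, the sample lines and the variable names re, pos and len are made up purely for illustration.

```sas
data discovery;
   length my_data $30;
   retain re;
   /* compile the pattern once; this pattern is only an illustration */
   if _n_ = 1 then re = prxparse('/\d+ [A-Z]+/');
   infile datalines truncover;
   input;                                   /* load the raw line into _infile_ */
   call prxsubstr(re, _infile_, pos, len);  /* pos = 0 when nothing matches */
   if pos > 0 then do;
      my_data = substr(_infile_, pos, len); /* keep only the matched text */
      output;
   end;
   drop re pos len;
datalines;
ship to 810 SPRING GREEN today
no street number on this line
;
run;
```

Only lines that contain a match are written out, which is roughly what an informat returning _same_ for matching input would be expected to give.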

Regards
Peter Crawford

-----Original Message-----
From: Jack Hamilton [mailto:jfh@dcn.org]
Sent: Tue 5/11/2004 2:30 AM
To: peter.crawford@blueyonder.co.uk
Cc:
Subject: RE: [SAS-L] Can rxparse be useful for this address cleaning work??

I had asked Rick Langston for regex capabilities in formats at the last SUGI, and he told me today that he was still thinking about it (maybe has some code written, but not released). So yes, I think it's worth escalating.

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU]On Behalf Of Peter Crawford2
Sent: Monday, May 10, 2004 8:39 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: [SAS-L] Can rxparse be useful for this address cleaning work??

If we could use a regular expression as an informat - almost like an inpicture, it might simplify the implementation of this solution ! Does anyone think it is worth escalating ?

Would regular expression informats make regular expressions easier to use ?

HTH
Peter Crawford

On Mon, 8 Mar 2004 22:34:31 -0500, Richard A. DeVenezia <radevenz@IX.NETCOM.COM> wrote:

>Duck-Hye Yang wrote:
>> Hi,
>> My address data look like this:
>> data one;
>>   length line $50;
>>   line = "0S 810 SPRING GREEN"; output;
>>   line = "0S0 42 PEARL ROAD"; output;
>>   line = "0 S 336 EAST STREET"; output;
>>   line = "0 SOUTH 531 JEFFERSON"; output;
>>   line = "0 S 356 MADISON"; output;
>>   line = "1 S 356 MADISON"; output;
>>   line = "1 NORTH 356 MADISON"; output;
>> run;
>>
>> My goal is to first combine the three or two components into one
>> component so that the desired output is like the following:
>> "0S810 SPRING GREEN"
>> "0S042 PEARL ROAD"
>> "0S336 EAST STREET"
>> "0SOUTH531 JEFFERSON"
>> "0S356 MADISON"
>> "1S356 MADISON"
>> "1NORTH356 MADISON"
>>
>> For the last 4 days, I have been trying to do this daunting work using
>> rxparse function as shown by Chang Y. Chung.
>>
>> I gave up finally. Can anybody help me with this?
>>
>> Thanks
>> Duckhye
>
>I haven't followed the thread, but the output appears to indicate you want
>to
>- remove all spaces prior to last digit encountered
>
>This SAS regular expression does that (well almost, it retains all
>characters A-z0-9 prior to last digit found) :
>
><sasl:code>
>data one;
>  length line $50;
>  line = "0S 810 SPRING GREEN"; output;
>  line = "0S0 42 PEARL ROAD"; output;
>  line = "0 S 336 EAST STREET"; output;
>  line = "0 SOUTH 531 JEFFERSON"; output;
>  line = "0 S 356 MADISON"; output;
>  line = "1 S 356 MADISON"; output;
>  line = "1 NORTH 356 MADISON"; output;
>run;
>
>* retain only letters and digits upto and including last digit found;
>
>data foo;
>  set one;
>  if _n_ = 1 then do;
>    retain rx;
>    rx = rxparse (" ~'A-z0-9'*<$'A-z0-9'>*~'A-z0-9'*<$D> to =1=2 ");
>    * shorter slightly different alternative;
>    * rx = rxparse (" $W*<$C>*$W*<$D> to =1=2 ");
>    put rx=;
>  end;
>
>  length scrunch $50;
>
>  call rxchange (rx,99,line,scrunch);
>
>  put @1 line= @30 scrunch=;
>run;
></sasl:code>
>
>--
>Richard A. DeVenezia
>http://www.devenezia.com/downloads/sas/samples
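
For comparison, the stated rule ("remove all spaces prior to last digit encountered") can also be sketched with the Perl-style PRX functions, assuming a SAS release that provides the PRXCHANGE function (9.1 or later, as far as I know). This is not Richard's method: it deletes only whitespace, whereas the RX version keeps only letters and digits before the last digit. The data set name foo_prx is made up for the illustration.

```sas
data one;
   length line $50;
   line = "0S 810 SPRING GREEN";   output;
   line = "0S0 42 PEARL ROAD";     output;
   line = "0 SOUTH 531 JEFFERSON"; output;
run;

data foo_prx;   /* hypothetical data set name */
   set one;
   length scrunch $50;
   /* delete every run of whitespace that still has a digit somewhere after it; */
   /* -1 asks PRXCHANGE to replace all occurrences                              */
   scrunch = prxchange('s/\s+(?=.*\d)//', -1, line);
   put @1 line= @30 scrunch=;
run;
```

On the sample rows this should give the desired strings, but a street name that itself contains a digit (for example "5TH") would lose the spaces in front of it as well, so the pattern would need tightening for real address data.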

