LISTSERV at the University of Georgia
Date:         Tue, 31 May 2005 07:12:03 -0700
Reply-To:     chris <fast_rabbit@GMX.CH>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         chris <fast_rabbit@GMX.CH>
Organization: http://groups.google.com
Subject:      Re: documentation of Perl regex directly in the code
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset="iso-8859-1"

Richard A. DeVenezia wrote:

> Rune:
>
> One perlish way to comment uses trailing #mycomment and the /x option. This
> does not work in SAS. Your best bet is probably to use CATS function with
> SAS comments. Note: CATS trims spaces, so blending in a space only can be
> frustrating. Use \x20 to match a space when build a regex pattern using
> CATS.
>
> In this sample I cheat a little. I use a capture group (\x20) to store a
> space and then use it in the replacement. This was only done because CATS
> trims out leading and trailing spaces and the original replacement part
> would have been reduced to $2$1 without a space.
>
> data namechange;
>   set staff;
>   re = prxparse (
>     CATS
>     ('s'       /* Perform a substitution */
>     ,'/'       /* Start regular expression*/
>     ,'('       /* Start capture buffer # 1 to store the last name*/
>     ,'[^,]+'   /* match one or more non-comma characters */
>     ,')'       /* end capture buffer # 1 */
>     ,','       /* match a comma*/
>     ,'(\x20)'  /* match a space*/
>     ,'('       /* start capture buffer # 2 to store the first name*/
>     ,'\w+'     /* match a word character one or more times */
>     ,'('       /* Start capture buffer # 3. It is part of buffer # 2*/
>     ,'\s+'     /* match a white space*/
>     ,'\w+'     /* match a word character one or more times*/
>     ,')'       /* end capture buffer # 3, hold first name and middle name*/
>     ,'?'       /* match zero or one time*/
>     ,')'       /* end capture buffer # 2*/
>     ,'/'       /* end regular expression and start replacement text*/
>     ,'$3'      /* insert captured buffer # 2*/
>     ,'$2'      /* insert a space*/
>     ,'$1'      /* insert capture buffer # 1 */
>     ,'/'       /* end replacement text*/
>     )
>   );
>   NewName = prxchange(re,1,Name);
> run;
>
> --
> Richard A. DeVenezia
> http://www.devenezia.com/

This is a good way to do this, but does it really help readability? I would stick with the initial regex and document it separately in a comment right after the statement:

re = prxparse( 's/([^,]+), (\w+(\s+\w+)?) /$2 $1/');
/* regex to reverse a string up to a comma with one or two words
   following the comma

   ([^,]+)          capture one or more non-comma characters into $1
   ,                match a comma and a blank
   (\w+(\s+\w+)?)   capture one or two blocks of characters
                    separated by blanks into $2
*/
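
For comparison, here is roughly what the trailing-comment /x style mentioned above looks like in a language that supports it. This is only a sketch in Python, not SAS: re.VERBOSE plays the role of Perl's /x, the function name reverse_name and the sample names are made up for illustration, and the blank before the closing slash of the pattern above is left out on the assumption that it only serves to match trailing padding.

```python
import re

# Verbose mode ignores unescaped whitespace in the pattern and allows
# trailing "#" comments, which is the Perl /x behaviour.
NAME_PATTERN = re.compile(
    r"""
    ([^,]+)        # group 1: last name, one or more non-comma characters
    ,              # a literal comma
    \x20           # one space, escaped because verbose mode skips bare blanks
    (              # group 2: first name, optionally followed by a middle name
        \w+        #   first name
        (\s+\w+)?  #   group 3: optional whitespace plus middle name
    )
    """,
    re.VERBOSE,
)


def reverse_name(name: str) -> str:
    """Turn 'Last, First [Middle]' into 'First [Middle] Last' (first match only)."""
    return NAME_PATTERN.sub(r"\2 \1", name, count=1)


if __name__ == "__main__":
    for sample in ("Smith, John", "Smith, John Paul"):
        print(repr(sample), "->", repr(reverse_name(sample)))
```

Note that the space has to be written as \x20 here for the same reason as in the CATS version: a bare blank would be dropped from the pattern.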

chris

