```
Date: Thu, 8 Jul 2010 16:29:46 +0000
Reply-To: toby dunn
Sender: "SAS(r) Discussion"
From: toby dunn
Subject: Re: How to compress muliple decimal points
Comments: To: mkeintz@wharton.upenn.edu
In-Reply-To:
Content-Type: text/plain; charset="iso-8859-1"

Well, at the moment I can't find a way around the fact that Perl's RegEx
lookbehind can't handle a variable-length lookbehind. To that end, one
could use a hybrid approach: a couple of reverses and a strip function
call, keeping only the first occurrence of a '.'.

Text = Reverse( Strip( PrxChange( 's/((?=.+\.)(\.))//o' , -1 , Reverse(Text) ) ) ) ;

Toby Dunn

"Don't bail. The best gold is at the bottom of barrels of crap." Randy Pausch
"Be prepared. Luck is where preparation meets opportunity." Randy Pausch

> Date: Thu, 8 Jul 2010 13:27:02 +0000
> From: mkeintz@WHARTON.UPENN.EDU
> Subject: Re: How to compress muliple decimal points
> To: SAS-L@LISTSERV.UGA.EDU
>
> How about this approach, based on the following logic?
>
> 1. Find position of the first decimal point.
> 2. Compress out all subsequent decimal points
>    (i.e. both trailing and internal)
>
> This requires no distinct treatment of trailing decimal points
> versus excess internal decimal points. The "trick" here is to
> utilize the ability of the SUBSTR function (unique among
> functions, no?) to be used on the left side of an assignment
> statement.
>
>
> data _null_;
>   input text $;
>   put "Before " text= @;
>   ix=index(text,'.');
>   if ix>0 then substr(text,ix+1)=compress(substr(text,ix+1),'.');
>   put @22 "After " text= ;
> datalines;
> 100
> 100.2
> 100..3
> 100.4.
> 100..5.
> 100...6
> 100.7.0
> 100.8.0.
> run;
>
>
> Regards,
> Mark
>
>
> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> > Arthur Tabachneck
> > Sent: Wednesday, July 07, 2010 9:42 AM
> > To: SAS-L@LISTSERV.UGA.EDU
> > Subject: Re: How to compress muliple decimal points
> >
> > Dave,
> >
> > Tom's suggestion is definitely a lot less brute looking than the
> > following, but doesn't correct for one of your examples. There is
> > probably a much easier way, but the following will at least accomplish
> > the task:
> >
> > data have;
> >   input (x y) ($);
> >   got_dot=0;
> >   do i=1 to length(x);
> >     if substr(x,i,1) eq '.' then do;
> >       if not(got_dot) then do;
> >         x=substr(x,1);
> >         got_dot=1;
> >       end;
> >       else do;
> >         x=catt(substr(x,1,i-1),substr(x,i+1));
> >       end;
> >     end;
> >   end;
> >   want=input(x,best12.);
> > cards;
> > 37..2 s/b
> > 37.2 but
> > 56.2. s/b
> > 56.2 s/b
> > ;
> >
> > Art
> > ----------
> > On Wed, 7 Jul 2010 08:57:40 -0400, Dave Brewer wrote:
> >
> > >Tom,
> > >
> > >Thanks for your solution. I was thinking of TRANWRD, but for some dumb
> > >reason, I didn't think I could put in two decimal points.
> > >
> > >Thanks again.
> > >Dave
> > >
> > >On Wed, 7 Jul 2010 08:36:29 -0400, Tom Abernathy wrote:
> > >
> > >>Dave -
> > >>  A simple thing is to use TRANWRD function.
> > >>  clean = tranwrd( old , '..' , '.' );
> > >>
> > >>  Another trick I have used to eliminate dups is to switch the character
> > >>with the space character and use the COMPBL function, then switch back.
> > >>This will handle 2, 3 or more characters in a row in one pass.
> > >>
> > >>  new = translate( old , ' .' , '. ' );
> > >>  new = compbl( new );
> > >>  new = translate( new , ' .' , '. ' );
> > >>
> > >>- Tom
> > >>
> > >>On Wed, 7 Jul 2010 07:07:23 -0400, Dave Brewer wrote:
> > >>
> > >>>Hi All,
> > >>>
> > >>>I'm sure there is a simple answer to my problem, but I can't see the
> > >>>forest through the trees!
> > >>>
> > >>>How do I replace multiple decimal points with one when the decimals are
> > >>>back-to-back? Ex) 37..2 s/b 37.2 but 56.2. s/b 56.2
> > >>>
> > >>>I am trying to clean up dirty data and keep as many values as possible
> > >>>without making them missing.
> > >>>
> > >>>Thanks.
> > >>>Dave
```
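
For readers who want to check the expected results outside SAS, the "keep the first decimal point, drop every later one" logic from the thread can be sketched in a few lines of Python. This is only an illustrative cross-check, not code from any of the posters; the function name `keep_first_decimal_point` is hypothetical, and the sample values are the ones posted in the thread.

```python
def keep_first_decimal_point(text: str) -> str:
    """Keep the first '.' in text and remove every later one.

    Illustrative sketch only (hypothetical helper, not from the thread):
    split at the first '.', then strip all remaining '.' from the tail.
    """
    head, dot, tail = text.partition(".")
    return head + dot + tail.replace(".", "")


# Sample values taken from the datalines and examples in the thread.
samples = ["100", "100.2", "100..3", "100.4.", "100..5.",
           "100...6", "100.7.0", "100.8.0.", "37..2", "56.2."]

for raw in samples:
    print(f"Before {raw:<10} After {keep_first_decimal_point(raw)}")
```

Note that a value such as 100.7.0 becomes 100.70 under this rule, since every decimal point after the first is simply removed rather than treated as a sign of bad data.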
