LISTSERV at the University of Georgia
Date:         Wed, 6 Jul 2005 16:17:40 -0400
Reply-To:     "Dorfman, Paul" <paul.dorfman@FCSO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Dorfman, Paul" <paul.dorfman@FCSO.COM>
Subject:      Re: delete duplicate string within a single variable
Comments: To: Sigurd Hermansen <HERMANS1@WESTAT.COM>

Generalissimo Sig,

In my tests, not quite really... even when I have eliminated the extraneous put(i,1.), and after killing the I/O (_null_) (it is much easier to measure hairs cut from an elephant's posterior in a lab than right whence they have come):

data _null_ ;
   input a: $char20. ;
   x = a ;
   do j = 1 to 1e6 ;
      x = '' ;
      do c = '0','1','2','3','4','5','6','7','8','9' ;
         if indexc (a, c) then x = trimn (x) || c ;
      end ;
   end ;
cards ;
000000000
000110102
420333000
9800019800001987
9a8a7a6vvvvvvv0000
;
run ;

NOTE: DATA statement used (Total process time):
      real time           18.23 seconds
      cpu time            18.15 seconds

data _null_ ;
   input a: $char20. ;
   do j = 1 to 1e6 ;
      x = compress ('0123456789', translate ('0123456789', '', a)) ;
   end ;
cards ;
000000000
000110102
420333000
9800019800001987
9a8a7a6vvvvvvv0000
;
run ;

NOTE: DATA statement used (Total process time):
      real time           5.89 seconds
      cpu time            5.89 seconds

That's on an XP desktop running 9.1.3. Under AIX (same SAS), the ratio was 23:5.

Kind regards
----------------
Paul M. Dorfman
Jacksonville, FL
----------------

On Wed, 6 Jul 2005 15:08:46 -0400, Sigurd Hermansen <HERMANS1@WESTAT.COM> wrote:

>Not so fast, Marshals Dorfman and Schreier....
>
>A variant of Richard of Venice's solution combined with Toby of Texas'
>string list,
>
>data test1 (keep=a x);
>  retain _digits '0123456789';
>  input a: $char20. ;
>  do i=1 to 1000000;
>    x = compress(_digits,translate(_digits,' ',a));
>    output;
>  end;
>cards;
>000000000
>000110102
>420333000
>9800019800001987
>9a8a7a6vvvvvvv0000
>;
>run;
>data test2 (keep=a x);
>  length x $ 10;
>  input a: $char20. ;
>  do j=1 to 1000000;
>    do i = '0','1','2','3','4','5','6','7','8','9';
>      if index(a,put(i,1.)) then x=trim(x)||i;
>    end;
>    output;
>    x='';
>  end;
>cards;
>000000000
>000110102
>420333000
>9800019800001987
>9a8a7a6vvvvvvv0000
>;
>run;
>
>
>competes in my tests on even terms with Guido's clever use of string
>functions. It also transports fairly easily to other programming
>environments as well as to other computing platforms.
>Sig
>
>
>-----Original Message-----
>From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
>On Behalf Of Howard Schreier <hs AT dc-sug DOT org>
>Sent: Wednesday, July 06, 2005 12:21 PM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Re: delete duplicate string within a single variable
>
>
>No loop, no wallpaper, works in V. 8.
>
>I think we have a winner here.
>
>The string of all digits does not have to be stored in a variable, so it
>can even be adapted for SQL, thusly:
>
>   select compress('0123456789',translate('0123456789',' ',a)) as x
>   from xx;
>
>On Wed, 6 Jul 2005 07:54:52 +0000, Guido T <cymraeg_erict@HOTMAIL.COM>
>wrote:
>
>>I was a bit too quick with my initial "solution". The middle compress isn't
>>needed, compressing out extra spaces isn't a problem. Also TRANSLATE
>>only needs a single space to translate to. How many years have I been
>>using TRANSLATE function? Sigh...
>>
>>What it does is compress out of the _DIGITS string (containing the
>>valid character, in the correct order) the characters from *A* that
>>aren't in the _DIGITS string.
>>
>>++ Guido
>>
>>295  data test(drop=_digits);
>>296    set xx;
>>297    retain _digits '0123456789';
>>298    x = compress(_digits,translate(_digits,' ',a));
>>299    put a= x=;
>>300  run;
>>
>>a=000000000 x=0
>>a=000110102 x=012
>>a=420333000 x=0234
>>a=9800019800001987 x=01789
>>a=9a8a7a6vvvvvvv0000 x=06789
>>
>>NOTE: There were 5 observations read from the data set WORK.XX.
>>NOTE: The data set WORK.TEST has 5 observations and 3 variables.
>>NOTE: DATA statement used:
>>      real time           0.01 seconds
>>      cpu time            0.01 seconds
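
For readers outside SAS, the two techniques compared in this thread can be sketched in Python. This is only an illustrative port, not code from any of the posters: the function names and the DIGITS constant are hypothetical, and Python's str.translate stands in for the SAS TRANSLATE call. The test values and the results they are checked against are the ones in Guido's log above. Nothing here says anything about the relative timings, which apply to SAS only.

```python
DIGITS = "0123456789"  # hypothetical name, plays the role of _digits


def distinct_digits_loop(a: str) -> str:
    """Loop variant: append each digit that occurs anywhere in `a`."""
    x = ""
    for c in DIGITS:
        if c in a:  # stands in for indexc(a, c) / index(a, c)
            x += c
    return x


def distinct_digits_translate(a: str) -> str:
    """Mirror of compress(_digits, translate(_digits, ' ', a))."""
    # Step 1 (TRANSLATE): blank out every character of DIGITS that occurs
    # in `a`; what survives are the digits that are absent from `a`.
    absent = DIGITS.translate({ord(ch): " " for ch in a})
    # Step 2 (COMPRESS): remove those absent digits from DIGITS, which
    # leaves the digits that are present in `a`, already in order.
    return "".join(ch for ch in DIGITS if ch not in absent)


if __name__ == "__main__":
    samples = (
        "000000000",
        "000110102",
        "420333000",
        "9800019800001987",
        "9a8a7a6vvvvvvv0000",
    )
    for a in samples:
        print(a, distinct_digits_loop(a), distinct_digits_translate(a))
```

Both functions walk the fixed ten-character digit string rather than the input, which is why the result comes out deduplicated and sorted without any explicit sort step.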

