LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (September 2001, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 26 Sep 2001 10:02:59 -0400
Reply-To:     "Diskin, Dennis" <Dennis.Diskin@PHARMA.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Diskin, Dennis" <Dennis.Diskin@PHARMA.COM>
Subject:      Re: Speed of =: vs. SUBSTR()...
Comments: To: Laurel Copeland <lacop@UMICH.EDU>
Content-Type: text/plain

Laurel,

This is a though one to give a set answer to because the SQL process optimization keeps improving in each release. I think the only way you could answer it is to run a test for both approaches. Off hand, I'd say this looks about as good as you're going to get since the problem is pretty straightforward.

fwiw, Dennis Diskin

> -----Original Message----- > From: Laurel Copeland [SMTP:lacop@UMICH.EDU] > Sent: Wednesday, September 26, 2001 9:49 AM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Speed of =: vs. SUBSTR()... > > What would be fastest for a set of substring values? E.g., still with the > 5-character diagnostic codes, I want to pull off cases with any diagnosis > beginning with, say, 290, 295, 297, or 298. Running through 10 fields on > a tape containing a few million records, I could use (SAS 6.09, MVS): > > Proc SQL; > CREATE table work.REG88a > AS SELECT PM.SCRSSN, > PM.DXLSF,PM.DXF2,PM.DXF3,PM.DXF4,PM.DXF5, > PM.DXF6,PM.DXF7,PM.DXF8,PM.DXF9,PM.DXF10 > FROM pm88.pm88 AS pm > WHERE substr(PM.DXLSF,1,3) IN('290','295','297','298') > OR substr(PM.DXF2,1,3) IN('290','295','297','298') > OR substr(PM.DXF3,1,3) IN('290','295','297','298') > OR substr(PM.DXF4,1,3) IN('290','295','297','298') > OR substr(PM.DXF5,1,3) IN('290','295','297','298') > OR substr(PM.DXF6,1,3) IN('290','295','297','298') > OR substr(PM.DXF7,1,3) IN('290','295','297','298') > OR substr(PM.DXF8,1,3) IN('290','295','297','298') > OR substr(PM.DXF9,1,3) IN('290','295','297','298') > OR substr(PM.DXF10,1,3) IN('290','295','297','298'); > > Is this my best bet? > Laurel Copeland > > At 09:24 AM 9/26/01 -0400, kviel wrote: > > > Wim, > > I apologize. I would like to modify my previous suggestion. > If, > indeed, we are > interested in this effect: > > if substr(c(i),1,1)='8' then d=1; > > Then this should really interest us: > > 87 data _null_; > 88 z01='80000'; > 89 do i=1 to 1e7; > 90 if substr(z01,1,1)='8' then d=1; > 91 end; > 92 run; > > NOTE: DATA statement used: > real time 3.98 seconds > cpu time 3.98 seconds > > > 93 > 94 data _null_; > 95 z01='80000'; > 96 do i=1 to 1e7; > 97 if z01=:'8' then d=1; > 98 end; > 99 run; > > NOTE: DATA statement used: > real time 0.27 seconds > cpu time 0.27 seconds > > In other words, on a Windows NT machine running v8.1, the substr() > function > was not as fast as the colon. >


Back to: Top of message | Previous page | Main SAS-L page