LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (June 2005, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 27 Jun 2005 23:59:44 +0000
Reply-To:     toby dunn <tobydunn@HOTMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         toby dunn <tobydunn@HOTMAIL.COM>
Subject:      Re: Optimization question
Comments: To: mjm33@MSM1.BYU.EDU
In-Reply-To:  <200506272247.j5RMl1xg024859@listserv.cc.uga.edu>
Content-Type: text/plain; format=flowed

Mike,

Proc format ; value Subset 1 = '1' 2 = '2' 3 = '3' 4 = '4' 5 = '5' Other = 'Other' ; run ;

data huge ; do permno = 1 to 100 ; output ; end ; run ;

Data Subset ; set huge ; where (put(permno,Subset.) ne 'Other' ) ; run ;

proc print ; run ;

Toby Dunn

From: Michael Murff <mjm33@MSM1.BYU.EDU> Reply-To: Michael Murff <mjm33@MSM1.BYU.EDU> To: SAS-L@LISTSERV.UGA.EDU Subject: Re: Optimization question Date: Mon, 27 Jun 2005 16:46:47 -0600 Several persons of great wisdom have suggested that a format might be profitably used for table lookup in huge. Would someone be so kind as to demonstrate this technique for this situation? I think I could create a custom format on my list of 2k ids, but it's very unclear to me how this can be used when the format in Huge is different. Wouldn't the format have to be applied after the preprocessing where statement in the dstep?

Thanks all,

MM

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Dennis Diskin Sent: Monday, June 27, 2005 12:52 PM To: SAS-L@LISTSERV.UGA.EDU Subject: Re: Optimization question

Roy, I would personnaly prefer to put a long list of selections into a format. Formats are pretty well optimized and can be used in where statements. The BY statement and associated first.s will cetainly add some processing time and at least in the code you've shown, accomplishes nothing since var1_i and var2_i are not kept in the dataset. HTH, Dennis Diskin

On 6/27/05, Pardee, Roy <pardee.r@ghc.org> wrote: > > I eagerly await the responses to this question--I've got a very similar > set of sql queries I'd like to optimize. > > In the meantime--is your dset indexed & if not, is creating one > feasible? If so, what do you have indexed? > > -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of > Michael Murff > Sent: Monday, June 27, 2005 11:25 AM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Optimization question > > > Hi SAS-L, > > > > I'm accessing a very large dataset (6 gigs) with the following code: > > > > data subset; > > set huge(keep=date id var1-var5); > > where "01Jan1970"d <= date <= "31DEC2003"d; > > by id date; > > year=year(date); > > if permno in(&list); > > if first.date then > > do; > > var1_l = lag(var1); > > var2_l = lag(var2); > > end; > > if first.id <http://first.id> then > > do; > > var1_l = .; > > var2_l = .; > > end; > > run; > > > > &list contains a list of 2000 ids (sorted) that I care about. Each id > will have a daily entry between the given dates. Huge dataset is already > sorted by ID and DATE. I need a more efficient way to run this datastep > as it takes several hours on our server. I have access to 8.2 and 9.1.3 > SAS versions in Unix environments. > > > > I tried putting &list in a compound where statement but I reach the 8.2 > where byte limit discussed recently on the -l (haven't tried this on > 9.1.3 yet). Does the by statement slow this down? And what about the > subsetting if statement. The final dataset "subset" should a few hundred > MBs. I can write a gig with our SCSI drives in about 15 minutes? so it > seems like this little dstep could be written to go faster. > > > > Thanks, > > > > Michael Murff > > Provo, UT >


Back to: Top of message | Previous page | Main SAS-L page