LISTSERV at the University of Georgia
Date:         Wed, 6 Aug 2003 03:16:00 +0000
Reply-To:     sashole@bellsouth.net
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject:      Re: Proc Format performance question
Comments: To: rich0850@AOL.COM
Content-Type: text/plain; format=flowed

Richard,

You hit it right on the head. However, why would you think that the number of ranges in the format (I assume this is what you meant by "members") would *not* affect performance?

The underlying format structure is an AVL tree, whose search cost grows as O(log2(N)), so in your case you should expect each lookup to need about log2(300000/20000) ~ 4 more comparisons (roughly 18 instead of 14). The resulting search time difference will not be stark, but it will still be appreciable.
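The logarithmic estimate can be checked with a short Data step. This is only a sketch: it assumes a perfectly balanced tree and uses the two entry counts quoted in this thread; the variable names are made up for illustration. It prints the approximate search depth for each table size, plus the difference and the ratio of the two depths.

```sas
/* Rough check of the O(log2(N)) estimate.                     */
/* Assumes an idealized balanced tree; names are illustrative. */
data _null_;
  n_small = 20000;                      /* old number of entries   */
  n_large = 300000;                     /* new number of entries   */
  depth_small = log2(n_small);          /* about 14.3 comparisons  */
  depth_large = log2(n_large);          /* about 18.2 comparisons  */
  extra = depth_large - depth_small;    /* equals log2(15), about 3.9 */
  ratio = depth_large / depth_small;    /* about 1.27              */
  put depth_small= depth_large= extra= ratio=;
run;
```

The depth difference equals log2(300000/20000), while the ratio of the depths is what a pure comparison count would predict for the relative slowdown. Real timings also depend on memory and caching effects that this arithmetic ignores.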

(In)formats were actually *not* originally designed to serve as huge lookup tables. It is the resourcefulness of SAS programmers that has pushed formats to this extreme. Nor is the AVL tree the fastest lookup structure; I suspect it was chosen because it scales very well and guarantees O(log2(N)) searching behavior *regardless* of the input key distribution. Hence the advent of the hash object in the V9 Data step. If you are still running V8 and performance is a paramount consideration, try hand-coded hash schemes. You can find plenty of them already SAS-implemented and tested in the SUGI 26-28 papers by Gregg Snell and yours truly; the one from SUGI 26 gets the most nitty-gritty with the details.
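For those on V9, a lookup through the Data step hash object might look roughly like the sketch below. The data sets LOOKUP and HAVE, their contents, and the variable lengths are all hypothetical stand-ins; only the variable names JANE and BOB come from the original question. Note that the hash object matches exact key values only, so this fits a format made of single values rather than true ranges.

```sas
/* Hypothetical lookup table standing in for the $ARRDECO entries */
data lookup;
  length jane $8 bob $20;
  input jane $ bob $;
  datalines;
A01 Alpha
B02 Bravo
;
run;

/* Hypothetical input data carrying the key to be looked up */
data have;
  length jane $8;
  input jane $;
  datalines;
B02
Z99
A01
;
run;

data want;
  length bob $20;
  if _n_ = 1 then do;
    /* Load the lookup table into memory once */
    declare hash h(dataset: 'lookup');
    h.defineKey('jane');
    h.defineData('bob');
    h.defineDone();
    call missing(bob);
  end;
  set have;
  /* find() returns 0 on a match and fills BOB; otherwise blank it */
  if h.find() ne 0 then call missing(bob);
run;
```

Unlike PUT with a character format, which returns the input value when nothing matches and no OTHER range is defined, this sketch leaves BOB blank for unmatched keys; adjust that branch if you need the format's behavior.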

Kind regards,
==================
Paul M. Dorfman
Jacksonville, FL
==================

>From: RICH0850 <rich0850@AOL.COM>
>
>Dear Group:
>
>Poor performance question.
>
>I've got a format library. It's got a single member "$ARRDECO". It used to be
>small (less than 20k entries). Now it's 300k plus entries.
>
>Now, referencing this format takes a lot of time (both CPU and real time).
>
>Such as BOB=PUT(JANE,$ARRDECO.);
>
>Prior to this it was fast, now it's slow.
>
>Is the number of members in a format a problem... or what?
>
>--Richard
