LISTSERV at the University of Georgia
Date:   Tue, 10 Jan 2006 11:46:41 -0800
Reply-To:   David L Cassell <davidlcassell@MSN.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   David L Cassell <davidlcassell@MSN.COM>
Subject:   Re: Hash Table Memory Limitations
In-Reply-To:   <200601101450.k0ABkDgS009578@mailgw.cc.uga.edu>
Content-Type:   text/plain; format=flowed

sezzy@IHCIS.COM wrote back:
>Thank you everyone for your comments on this. I need to apologize for
>the inadequacy of my example to model the real-life application. The
>example is deficient in two aspects:
>
>1. I didn't make the size of the data table (med_test1) nearly large
>enough in my example. The real data set has about 5,000,000 keys to
>look up, not 5,000.

Perhaps you could explain the full size of your problem. If you have 5M keys to look up, then how big is your base data set? If you have a trillion-record base data set and 5 million keys to look up, then you may benefit from re-designing your process. For now, if that is closer to your real problem, then try something simple, like split-and-combine.

Split your 'keys' data set into 5 pieces, each of a million keys, assuming your system will handle this. Run the code you have already designed, assigning keys and satellite data to the huge data set each time, until you have assigned all 5M keys using hashes. Now you have 5 data steps instead of one, but no more hideous crashes.
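
The split-and-combine idea above can be sketched outside SAS. This is a minimal Python illustration, not the poster's code: a dict stands in for the hash object, and the names `chunked`, `split_and_combine`, and `chunk_size` are hypothetical. Each pass over the base data plays the role of one data step run against one piece of the keys table.

```python
from itertools import islice


def chunked(pairs, size):
    """Yield successive dicts holding at most `size` key/value pairs."""
    it = iter(pairs)
    while True:
        chunk = dict(islice(it, size))
        if not chunk:
            return
        yield chunk


def split_and_combine(base_keys, lookup_pairs, chunk_size):
    """Attach satellite data to base_keys using one small hash table per pass.

    Only one chunk of the lookup table is held in memory at a time.
    Base records whose key is never found keep the value None.
    """
    results = [None] * len(base_keys)
    for table in chunked(lookup_pairs, chunk_size):
        # One full pass over the base data per chunk of keys.
        for i, key in enumerate(base_keys):
            if key in table:
                results[i] = table[key]
    return results
```

The trade-off is the one described above: the base data is read once per piece instead of once overall, in exchange for a hash table that never grows past `chunk_size` entries.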

>2. Although the lookup table is sorted for this lookup, we actually
>have two more look-ups to do with the data set (with fields other than
>member & dtsc_cd); in order to use Paul's technique, these lookups
>would require the data set to be sorted two additional times. This
>might be the way to go, depending on how long it would take to do the
>sorting and look up 5M keys.
>
>The way the program is written, it would be difficult to switch the
>tables, but, we are considering scaling-down the hash table.

Well, try the above idea and see if it saves you any grief. If not, then throw it away.

>FYI, I just learned from the folks at SAS tech support that "... when a
>large amount of memory is requested by the hash object, the hash object
>structure which requests the memory can be overloaded. This cause[s]
>the premature out of memory error to occur." They said that there will
>be a fix in SAS 9.1.3 service pack 4 (should be available in March)
>which prevents the overload. The fix will include a new error message
>which tells how many keys were loaded into the hash table before the
>out of memory condition occurred.

Great. Then I can know how big a hash I got before the hideous crash.

No, actually that could be helpful. As with my weird suggestion above, you would know how big the pieces of your lookup table could be while still keeping the system (close to) stable.
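
As a rough illustration of that sizing arithmetic, here is a small Python sketch. Everything in it is an assumption for the example: the function name `pieces_needed`, the 80% safety margin, and especially the figure of 1,500,000 keys loaded before failure, which is a made-up number and not something reported in this thread.

```python
def pieces_needed(total_keys, keys_loaded_before_failure, safety_pct=80):
    """Return how many pieces to split the lookup table into.

    keys_loaded_before_failure is the count the new error message would
    report; safety_pct (an assumed margin) keeps each piece below it.
    """
    usable = keys_loaded_before_failure * safety_pct // 100
    if usable < 1:
        raise ValueError("no usable hash capacity")
    # Ceiling division using integers only.
    return (total_keys + usable - 1) // usable


# Hypothetical numbers: 5M keys, failure reported after 1.5M keys loaded.
print(pieces_needed(5_000_000, 1_500_000))
```

With those hypothetical inputs, each piece is capped at 1,200,000 keys, so the 5M keys would be split into 5 pieces.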

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
