LISTSERV at the University of Georgia
Date:         Wed, 16 Dec 1998 21:39:06 +0000
Reply-To:     Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Subject:      Re: XMAS SASTip: Quick Table Lookup by Hashing
In-Reply-To:  <913776318.1123129.0@vm121.akh-wien.ac.at>

In article <913776318.1123129.0@vm121.akh-wien.ac.at>, pdorfma@FL6612MAI LEX4.UCS.ATT.COM writes
>Dear SAS-Lers,
>
>It is about time to exchange SAS gifts. Here is mine to you.
>
>Consider the following (toooo very common) problem:
>
>One file, SMALL, contains a variable SKEY. Another file, LARGE, contains a
>variable LKEY and maybe other variables, for instance, SMTHELSE. Within the
>limits of SAS, what is the most efficient way to match SMALL and LARGE by
>SKEY and LKEY?
>
>For certainty, assume that the number of records in LARGE is &N_LARGE =
>10,000,000 and that the number of records in SMALL, &N_SMALL, may vary from
>1,000 to 2,000,000. Let us also limit the case to KEYs being non-negative

(snip)

>
>Conclusion: A fifty or so lines of DATA step code seems like a pretty cheap
>price for being able to subset 10 million records by 2 million in about a
>minute, in all.
>
>
>Happy Holiday, everyone!
>
>Kind regards,
>Paul
>
>++++++++++++++++++++++++++++++++
>Paul M. Dorfman
>Citibank Universal Card Services
>Decision Support Systems
>Jacksonville, FL
>++++++++++++++++++++++++++++++++

Thanks Paul, that's a serious piece of work.

Interesting to see your approach of avoiding naming clashes with a random name generator. Others have suggested this may be a weakness. In this area, would you consider imposing a standard approach that tries to avoid the "sod's law" risk: "if something can go wrong, it will, and at the worst possible time"?

alternative design for global unique naming - to avoid "sod's law" risk

Generate a global macro variable to act as the pool counter, providing the next free number in the global name space.

rules for using the global name pool counter macro variable

rules:
1 if the name doesn't exist already, create it as a global with value 2, and use 1 as your returned value
2 if the name exists and is numeric, return that value & add 1 to the pool counter
3 if the name exists and is not numeric then use the next fall-back substitute and apply through rules 1 and 2
4 rules to limit fall-back substitutes ==> shoot the cause
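
The rules above can be sketched roughly as follows. This is only an illustration in Python rather than SAS macro language: the dict stands in for the global macro symbol table, and the counter names (`_NAMEPOOL`, `_NAMEPOOL2`, `_NAMEPOOL3`) and the function `next_pool_number` are hypothetical, not names proposed in this thread. Rule 4 is read here as "stop with an error once the fall-back list is exhausted".

```python
# Hypothetical counter names: the first is the preferred pool counter,
# the rest are the fall-back substitutes from rule 3.
FALLBACK_NAMES = ["_NAMEPOOL", "_NAMEPOOL2", "_NAMEPOOL3"]


def next_pool_number(global_symbols, names=FALLBACK_NAMES):
    """Return the next free number from the global name pool.

    global_symbols is a dict modelling the global macro symbol table;
    values are kept as text, as macro variable values would be.
    """
    for name in names:
        value = global_symbols.get(name)
        if value is None:
            # Rule 1: counter does not exist yet, so create it with
            # value 2 and hand out 1.
            global_symbols[name] = "2"
            return 1
        if value.strip().isdigit():
            # Rule 2: counter exists and is numeric, so hand out its
            # value and add 1 to the pool counter.
            number = int(value)
            global_symbols[name] = str(number + 1)
            return number
        # Rule 3: name exists but is not numeric (someone else owns it),
        # so move on to the next fall-back substitute.
    # Rule 4: fall-back substitutes are limited; give up loudly.
    raise RuntimeError("no usable pool counter; fix the cause of the clash")


# Usage: two callers draw distinct numbers for their temporary names.
symbols = {}
first_name = "_TMP%d" % next_pool_number(symbols)
second_name = "_TMP%d" % next_pool_number(symbols)
```

Because every caller draws from the same counter, two callers cannot receive the same number unless something outside the scheme resets or overwrites the counter, which is the case rules 3 and 4 are meant to catch.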

How would you name the "global name pool counter macro variable"?

seasonal greetings
--
Peter Crawford

