Date: Thu, 13 Dec 2001 20:56:08 GMT
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
Organization: ntl Cablemodem News Service
Subject: Re: Using Key= to perform a table lookup
Having found this thread rather late (3 weeks+ late) I was tempted to just
let it go, but since Ian makes such a good point and since some of the
responses got so complicated, I felt I just had to put my ore (or is it
Firstly, Ian's assertion that the "default" behaviour seems ridiculous,
well, why not? Whenever did the SAS developers start using sensible
defaults? I sometimes think that when they do come up with a sensible
default, it is more by accident than by design. :-)
Secondly, Key= is a facility which I tend to use quite heavily and it is
very powerful, if you understand it's foibles.
Scenario 1 - direct access to a single record (sorry row) in a lookup table.
If the data you want to access is more than a single variable (column),
(when, of course, you'd use Format) use Key= with the /UNIQUE option. That's
what it's for.
Scenario 2 - direct access to multiple records for the same key value for a
one to many join and the main file can be guaranteed unique (PROC SORT ...
NODUPKEY; should do it.) Use KEY= taking the default behaviour to loop
through all records until none are left.
Scenario 3 - now it's getting tricky because we want a many to many join and
yes, it happens in my work a lot and it's all down to my users' preferences.
"Users are OK, but would you let your daughter marry one?".
Anyway, we must now understand that each SET KEY= statement stands on it's
own, thus if I code two identical set key= statements one after the other,
they will both return the same keyed record. One statement is not affected
by the other. So how do I guarantee I start with the first record of a set
and read through to the last, well, quite simply by cheating and it works
Data Fred; /* Where would we be without Fred? */
OnFirst = 1; Drop OnFirst;
Do until(_error_); * (what's with _iorc_ anyway?);
If OnFirst > 0 then /* Force an error to start with */
do; /* Next time around same instruction */
/* will find first record in set */
StoredKey = KeyValue; drop StoredKey;
KeyValue = .; * or something not on the keyed file;
KeyValue = StoredKey;
Set KeyedFile Key=KeyValue;
If OnFirst > 0 then
_error_ = 0;
OnFirst = -1;
If OnFirst < 0 then
OnFirst = 0;
If not _error_ then
_error_ = 0;
The important thing is to create a pseudo SET KEY= ... /UNIQUE; statement by
setting the key value to something not in the file. (actually anything other
than the current key value will work as well). We ignore the data which
comes from this iteration and reset the key back to it's proper value and
now the first real read will always return the first row from a set and
subsequent reads will return additional rows, until all rows with that key
value have been read. Simple eh?
So to answer Ian's original question, yes, I can think of a scenario where
the "default" behaviour is useful, but it takes a bit of effort to make it
work as you want it to.
My most extreme use of this technique required ten keyed files with multiple
values and a number of different keys. The master file also potentially had
multiple values for each key.
On 28-Nov-2001, WHITLOI1@WESTAT.COM (Ian Whitlock) wrote:
> From: WHITLOI1@WESTAT.COM (Ian Whitlock)
> Newsgroups: comp.soft-sys.sas
> Subject: Re: Using Key= to perform a table lookup
> Date: 28 Nov 01 20:37:03 GMT
> Sender: saslmnt@LISTSERV.UGA.EDU
> MIME-Version: 1.0
> Content-Type: text/plain; charset="iso-8859-1"
> Reply-To: Ian Whitlock <WHITLOI1@WESTAT.COM>
> Lines: 59
> Xref: news5-gui.server.ntli.net comp.soft-sys.sas:76122
> X-Received-Date: Wed, 28 Nov 2001 20:49:36 GMT
> Subject: Using Key= to perform a table lookup
> Summary: Is the default a 10 year old design bug?
> Respondent: IanWhitlock@westat.com
> Alistair Gordon [Alistair_Gordon@STANDARDLIFE.COM] has fallen into a
> common SASTrap. He wants to look up FUND+SEDOL combinations from
> ALLSPAR multiple times in DEALSEDS. Here is his code.
> data spardate;
> set allspar(keep=fund sedol date);
> set dealseds key=dealkey;
> if _iorc_=%sysrc(_dsenom) then
> There are two mistakes. The simple one, misspelling of _ERROR_, is
> his mistake. The other, the failure to use the UNIQUE option of the
> SET statement, is more questionable and the one I would label SASTrap.
> As the documentation states, by default, when adjacent key values are
> encountered SAS will continue looking for more records with the same
> key. In Alistair's case there aren't any and thus his problem.
> Note the funny situation here. If adjacent records have the same key
> SAS goes onto to look for a second record with that key. If they are
> not adjacent, SAS starts over with the first one. This means the
> result of the lookup must depend on the physical order of the find
> file, ALLSPAR in this case. Moreover, Suppose Alistair actually wanted
> this feature, so that the second record with the key should match the
> second record with the key in the lookup file, DEALSEDS. So what is
> the second one? Now we find the lookup result also depends on the
> physical order of the lookup file. Thus for consistency the order of
> both files must be predetermined in order to use this KEY tool. But
> the whole idea of direct access is to make free one from the physical
> Add to this that you must have enough records of the same key value in
> the lookup file to not run out. This combination of ideas, scared me
> to the point that I resolved to never to use the default, i.e. always
> use /UNIQUE. Thus, for me, the choice of defaults very poorly chosen.
> My question goes further. Has anyone ever found a realistic example
> where they dared to use the default in a meaningful manner? If so, I
> would like to see it discussed.
> Note that it is not fair to say you use it when it is known that there
> cannot be any duplicates. In this case the choice is irrelevant and
> have simply chosen not to write /UNIQUE. It is not a case where you
> actually need the default behavior.
> I have seen cases where one might consider the default option. Say I
> have a unique set of keys to lookup and I want to find all occurrences
> of these keys in the lookup file. Even here I would consider it more
> stable and therefore more correct code to sequentially number the
> original keys and make a new unique composite key.
> Ian Whitlock