LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2002, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 21 Oct 2002 16:46:58 -0400
Reply-To:     Howard_Schreier@ITA.DOC.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Howard_Schreier@ITA.DOC.GOV
Subject:      Re: "safe" merging?

I think the distinction Brad makes is reasonable, but that the observed behavior may be a matter of software engineering.

Here's a minimal illustration. Start with two datasets:

data a; retain key 1; run;

data b; retain key 1; run;

Now join them:

proc sql; select * from a, b where a.key=b.key; quit;

There is no warning message. The SELECT is perfectly happy to present two columns with the same name.

Now create an output dataset instead of a listing:

proc sql; create table c as select * from a, b where a.key=b.key; quit;

The log includes the following

WARNING: Variable key already exists on file WORK.C.

So it's the CREATE TABLE which objects to the name collision.

Brad suggests that it should be smart enough to see that the two columns named KEY were invoked in an equality condition in an inner join, and thus must have identical values, so that losing one is not a cause for concern.

My theory is that the code which does the CREATE TABLE business cannot see that deeply into the details of the SELECT.

On Mon, 21 Oct 2002 14:13:12 -0400, Goldman, Brad (AT-Atlanta) <Brad.Goldman@AUTOTRADER.COM> wrote:

>> From: Ian Whitlock [mailto:WHITLOI1@WESTAT.com] >> Brad, >> I suspect that the real problem is the weak way SAS handles >> error messaging. >> If you ask for >> >> select a.* , b.* >> >> This means you want all of the variables from A and B. >> >> Thus it seems reasonable to inform you that your request will not be >> fulfilled. On the other hand, WHEN YOU KNOW WHAT YOU ARE >> DOING, it would be >> nice to have a method of saying to SAS, "I am in control and >> everything is >> fine, don't report your little problem", or more generally "I >> am in control >> and I think the following action should be taken". In this >> case you happen >> to know information that SAS cannot presently know. Hence it >> is impossible >> to be right by insisting on the message or by insisting it >> not be there. > >Ian, in your example, I think it is appropriate to have the warning. >However, there is a difference in our examples that is detectable by SAS >without resorting to telepathy. In my example, the variable being warned >about is in the where statement, "where a.key=b.key". I imagine someone >will dream up some counterexamples, but to me it seems that if the variable >is being used on both sides of an equals sign in the where clause, then it >is OK and expected and fine that it exists on both datasets, and I can't see >how it could do any harm. If that condition is not true, then the warning >is appropriate. > >Brad


Back to: Top of message | Previous page | Main SAS-L page