|Date: ||Mon, 25 Aug 2008 17:07:21 -0400|
|Reply-To: ||Sigurd Hermansen <SigurdHermansen@WESTAT.COM>|
|Sender: ||"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>|
|From: ||Sigurd Hermansen <SigurdHermansen@WESTAT.COM>|
|Subject: ||Re: Join Order question|
|Content-Type: ||text/plain; charset="us-ascii"|
An ordinary mortal obviously wrote the note to which you replied. I note with regret a spelling mistake that I can't blame on spellcheck.
I haven't found documentation of the SQL standards particularly useful since every vendor departs in one way or another from one or more of the standards. Just think of relations that SQL combines into a relation as sets, and the set of names of attributes as, well, another set. Elements in sets have no ordering property other than collating sequences of values. SQL solutions don't depend on ordering properties of physical storage devices or naming conventions. Forcing data architects to embed order values in data turns out to make SQL (and other set-logic) solutions simpler and more robust.
While I wouldn't have stuck a "natural join" with the extended meaning of the word "natural", I prefer conventions that appear at least to have a purpose to those that appear arbitrary. In this case I see no logic in the observed ordering of join results, whether order of tuples or order of attributes. I'd prefer in fact to see the SQL compiler vary ordering of tuples and attributes in yields of queries simply to alert programmers to the fact that they shouldn't count on what appears to be a consistent pattern.
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Jack Hamilton
Sent: Monday, August 25, 2008 1:52 PM
Subject: Re: Join Order question
On Mon, 25 Aug 2008 13:07:24 -0400, "Sigurd Hermansen" <SigurdHermansen@WESTAT.COM> said:
> Speaking for the original SAS SQLHead, I'd say that the SAS System's
> implementation of the natural join behaves the way it behaves. SQL
> does not specify ordering of attributes by default. A SELECT clause
> may specify a projection of attributes, but, if left undefined in the
> program, the SQL compiler can chose whatever order it finds
> convenient. The SQL compiler doesn't even have to be consistent in one
> situation versus another. I am sure that you know that, but I'll state
> it for the record.
I suppose I should just break down and buy a copy of the SQL standard. I wonder if it can be understood by normal mortals.
Interestingly, the Oracle documentation says that the column order of "SELECT table.*" will be the same as that in the underlying table, but there is no corresponding claim for "SELECT *" with no table specified.
> As for why SAS appears to use one order consistently for INNER JOIN's
> and another for (INNER) NATURAL JOIN's, it seems natural to me that
> the key attributes would appear first in an unspecified projection.
> Putting the RHS non-key attributes first doesn't appear to have any
> purpose. Perhaps it's somehow related to the equally arbitrary
> behavior of a SAS Data step MERGE that overwrites LHS variables with
> same name as RHS variables.
After we become accustomed to an arbitrary behavior, it feels natural.
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> Sent: Sunday, August 24, 2008 8:46 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Join Order question
> Here's one for the SQL heads - does anyone know why a natural join
> reverses the data set order that the variables are entered by into the
> data a;
> do i=1 to 10;
> do j=1 to 10;
> end; end;
> data b;
> do j=1 to 10;
> do k=1 to 10;
> end; end;
> proc sql;
> create TABLE innerjoin as
> select * from a, b where a.j=b.j;
> create TABLE naturaljoin as
> select * from a natural join b;
> Table innerjoin - I J K
> Table naturaljoin - K J I
> tia - Paul
firstname.lastname@example.org <== Use this, not email@example.com