Date: Thu, 30 Oct 2008 13:28:47 -0700
Reply-To: "Pardee, Roy" <pardee.r@GHC.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Pardee, Roy" <pardee.r@GHC.ORG>
Subject: Re: bug in proc sql?
In-Reply-To: <b7a7fa630810301257x2bec6ee5n70e3167079ba472f@mail.gmail.com>
Content-Type: text/plain; charset="us-ascii"
Give that man a cigar! According to a "little birdie":
Name in the inner query is a correlated reference to name in the outer query's
sashelp.class.
It is parsed and processed like
select *
from sashelp.class as outerclass
where outerclass.name in (select outerclass.name from class_girls)
;
The inner query runs for each row in the outerclass; tragically the list of
names returned by the innerquery always contains the very name you are running
it for... ie true by inspection, and the results are as if you had no where
clause at all (or an always-true where clause)
So it's not just returning unexpected results w/out warning, it's taking a really roundabout way to do it!
I wonder if there's a sql engine out there that would catch this query & issue a warning (at least). I'm having a hard time imagining a situation where you'd really want this to occur. Seems like a rule like "warn if a subselect only uses columns from the outer tables" would be useful.
Anyhoo--thanks for the time & attention.
Cheers,
-Roy
________________________________
From: Joe Matise [mailto:snoopy369@gmail.com]
Sent: Thursday, October 30, 2008 12:57 PM
To: Pardee, Roy
Cc: SAS-L@listserv.uga.edu
Subject: Re: bug in proc sql?
When I change the query to be correctly demarcated:
proc sql ;
create table class_girls as
select age
from sashelp.class
where sex = 'F'
;
create table test as
select *
from sashelp.class a
where a.name in (select b.name from class_girls b)
;
quit ;
it gives the expected error. name is not adequately defined, so when it finds no NAME on class_girls it moves back and sees name on sashelp.class...
-Joe
On Thu, Oct 30, 2008 at 2:42 PM, Pardee, Roy <pardee.r@ghc.org> wrote:
Holy! MSSQL does it too! And so does MySQL.
You only get the error I would expect if you give a full table_name.field_name spec in the subselect.
So--are others at least as *surprised* as I am that we don't get errors here?
I would like to think of the parens on the subselect as having something like the same effect they do in arithmetic expressions. But it seems not...
________________________________
From: Mary [mailto:mlhoward@avalon.net]
Sent: Thursday, October 30, 2008 12:32 PM
To: Pardee, Roy; SAS-L@LISTSERV.UGA.EDU
Subject: Re: bug in proc sql?
I can replicate it, but I can also replicate it in Oracle! Oracle returns all the boys too.
So I'm not sure if there is a standard to SQL that is missing here.
-Mary
----- Original Message -----
From: Pardee, Roy <mailto:pardee.r@GHC.ORG>
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, October 30, 2008 2:24 PM
Subject: bug in proc sql?
Running SAS 9.1.3 Service Pack 4 (TS1M3) on the NET_SRV platform, the second sql statement below returns the entire sashelp.class dataset, instead of an error about the subselect not being valid.
proc sql ;
* Create a table with just the ages of the girls. ;
create table class_girls as
select age
from sashelp.class
where sex = 'F'
;
* Why doesn't this complain about the lack of a name field in the class_girls dset? ;
* Instead it just returns everyone in sashelp.class. ;
select *
from sashelp.class
where name in (select name from class_girls)
;
quit ;
That's a bug, isn't it? Can others replicate?
Thanks!
-Roy
Roy Pardee
Research Analyst/Programmer
Group Health Center For Health Studies (Cancer Research Network)
(206) 287-2078
Google Talk: rpardee