Date: Fri, 22 Oct 2010 11:07:48 -0500
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: When does PROC SQL consistently leave incoming data order
Content-Type: text/plain; charset=ISO-8859-1
What you found is consistent with what I've generally observed. The warning
that it cannot be guaranteed is more of a 'if it really matters, use order
by, or you might not realize it's reordering things' warning.
On Fri, Oct 22, 2010 at 10:40 AM, Keintz, H. Mark <email@example.com
> The SAS 9.2 SQL Procedure User's Guide (page 24) states:
> "Unless an ORDER BY clause is included in the SELECT statement,
> then a particular order to the output rows, such as the order
> in which the rows are encountered in the queried table, cannot
> be guaranteed, even if an index is present."
> Yet we have found that SAS DOES CONSISTENTLY PRESERVE QUERIED
> TABLE ORDER for this very simple SELECT example, under certain
> conditions. We generated dataset WANT as:
> PROC SQL noprint;
> create table WANT as
> select * from HAVE
> where qvar in ( list of values ) ;
> Then we compared WANT to dataset CONTROL, defined as:
> data CONTROL;
> set HAVE;
> if qvar in ( list of values ) ;
> (Yes, we used IF intentionally). The conditions in which we
> found WANT to always match CONTROL (i.e. no data reordering)
> 1. HAVE is not indexed or SAS chooses not to use an index.
> After dozens of tests on several datasets with (in the
> aggregate) over 2 billion records, we have not produced a
> single instance of re-ordering.
> 2. SAS does use an index, (i.e. HAVE is indexed by QVAR), but
> HAVE is also sorted by QVAR. I.e. the index variable is
> also the primary sort key.
> My question is:
> Have any of you found PROC SQL to re-order records in either
> condition 1 or 2 above?
> I should add that
> (a) we are using a LINUX operating system,
> (b) the THREADS option is enabled,
> (c) CPUCOUNT is 4,
> (d) the queried tables are SAS version 9 datasets, and
> (e) our data is not stored using the SPDE engine (although
> it is on a RAID array).
> BTW, it is possible that the ORDER BY will reorder the data,
> even if it specifies the same sort keys as the queried table.
> Specifically, if HAVE is sorted by QVAR X Y, but has records
> with duplicate keys, then using ORDER BY QVAR,X,Y can re-order
> the tied records (it did so in 20 out of 20 tests we performed).