Date: Thu, 11 May 2006 15:26:43 -0400
Reply-To: Talbot Michael Katz <topkatz@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Talbot Michael Katz <topkatz@MSN.COM>
Subject: Re: Sorted-by option Catch 22
Hi Art.

That is an interesting gotcha. However, by my reading, it does match what
the SAS PROC SORT documentation says will happen, and the FORCE option
gives you redress:
"FORCE
sorts and replaces an indexed data set when the OUT= option is not
specified. Without the FORCE option, PROC SORT does not sort and replace
an indexed data set because sorting destroys user-created indexes for the
data set. When you specify FORCE, PROC SORT sorts and replaces the data
set and destroys all user-created indexes for the data set. Indexes that
were created or required by integrity constraints are preserved.
"Tip: PROC SORT checks for the sort information before it sorts a data set
so that data is not re-sorted unnecessarily. By default, PROC SORT does
not sort a data set if the sort information matches the requested sort.
You can use FORCE to override this behavior. You might need to use FORCE
if SAS cannot verify the sort specification in the data set option
SORTEDBY=. For more information about SORTEDBY=, see the chapter on SAS
data set options in SAS Language Reference: Dictionary.
"Restriction: If you use PROC SORT with the FORCE option on data sets that
were created with the Version 5 compatibility engine or with a sequential
engine such as a tape format engine, you must also specify the OUT=
option."
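
As a minimal, untested sketch of what that looks like with your CHK data set (assuming it was built exactly as in your DATA step quoted below, with the unverified SORTEDBY= assertion): FORCE should make PROC SORT ignore the sort flag and actually sort, after which the BY processing in PROC MEANS should run without the error.

```sas
/* FORCE overrides the "already sorted" shortcut, so the data set is */
/* physically re-sorted even though SORTEDBY= claims it is in order. */
proc sort data=chk force;
  by visid patid;
run;

/* With the rows now truly in visid/patid order, BY processing works. */
proc means data=chk mean;
  var var1;
  by visid patid;
run;
```

The trade-off is that FORCE always pays for a full sort, even when the data really are in the requested order.
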
What do you think the preferable default behavior should be for SORTEDBY?

-- TMK --
"The Macro Klutz"

On Thu, 11 May 2006 14:20:26 -0400, Arthur Tabachneck
<art297@NETSCAPE.NET> wrote:
>Given the interesting discussion, yesterday, about how to identify an
>unknown sort order, someone had suggested the "sorted by" data step
>option.
>
>That didn't correspond with what I had remembered reading about the
>option, thus I tried the following test (with the noted results):
>
>data chk(sortedby=visid patid);
>  do patid=1 to 20;
>    do visid=1 to 3;
>      var1=ranuni(0);
>      var2=ranuni(5);
>      output;
>    end;
>  end;
>run;
>
>proc sort data=chk;
> by visid patid;
>run;
>
>NOTE: Input data set is already sorted, no sorting done.
>
>proc means data=chk mean;
> var var1;
> by visid patid;
>run;
>
>ERROR: Data set WORK.CHK is not sorted in ascending sequence.
>The current by-group has visid = 3 and the next by-group has visid = 1.
>
>Fascinating!
>
>Art