LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 1998, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 27 Oct 1998 08:32:39 -0500
Reply-To:     "Fehd, Ronald J." <rjf2@CDC.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         "Fehd, Ronald J." <rjf2@CDC.GOV>
Subject:      Re: Sorting
Comments: To: "King, Raymond" <raymondk@MOCR.OAPI.COM>
Content-Type: text/plain

/From: King, Raymond [SMTP:raymondk@MOCR.OAPI.COM] /Does anyone know what algorithm is used as the default with PROC SORT.

No, but you can check out http://www.syncsort.com/ for another good alternative.

/The recent discussion of SORTing efficiency jump-started my curiosity. Paul /Dorfman's 'Insertion Sort' solution to a recent question was well received /and proved to be "three times as fast" as other methods. Yet, Insertion /Sort is on the Order of N**2 in complexity, and there are more efficient /sorting algorithms.

Yeah, but, ... :-q After having spent a week reading and rereading the internal sorting chapter of Knuth's Vol 3, there are some other good sorting algorithms and I got three running: two counting algorithms, and the insertion sort. I may write a paper on this, next year. ;-)

The obvious best parts of the insertion sort are: A. it sorts in place, meaning you have little overhead except the temp variable, B. you can read and comprehend the algorithm ... maybe that ought to be the first item! LOL

/It would be interesting to see how much better things can get using an /algorithm that's on the Order of N complexity.

Don't hold your breath on that one unless N=2. :-\ Bit-flippers -- and I use the term reverently, because they write in machine-language or assembler -- have been working on this problem since the 1930's. The best algorithms that Knuth discusses are on the order of N log N. We're talking programming complexity and overhead now, in order to cut down on the number of comparisons and moves. And of course, our best algorithm will handle character as well as numeric variables.

While I have the floor: I was thinking further of the <sorting 3000 vars, many empty> problem and was wondering if a better solution might not be: A. different data structure: A.1 separate variables for vendors A.2 a linked list which would retain the original data structure, but add overhead, and be simpler to update. or B. an insertion algorithm that runs with the update of each day's data.

Ron Fehd the macro maven CDC Atlanta GA


Back to: Top of message | Previous page | Main SAS-L page