| Date: | Tue, 20 Nov 2007 06:07:49 -0500 |
| Reply-To: | awasas@COX.NET |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Andy Arnold <awasas@COX.NET> |
| Subject: | Re: any way to sort large table faster than proc sort? |
| In-Reply-To: | <200711200201.lAJNCRCK004745@mailgw.cc.uga.edu> |
| Content-Type: | text/plain; charset=utf-8 |
Consider creating an index for the dataset. Proc Sort moves entire records, which can be time-consuming when the record count or record size is large. An index creates a parallel file holding only the sort key and a pointer to the original record; that's much less data to sort. Once the index exists, DATA steps can use a BY statement to read the records in key sequence without sorting first.
I recently inherited a SAS program that sorted a file 15 times; the file had 100k+ records with 150-175 fields per record. I replaced the first sort with a Proc Datasets/Index and removed the other sorts. That single change reduced runtime from over an hour to several minutes.
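A minimal sketch of that kind of change, in case it helps. The names here are made up for illustration (dataset WORK.BIG, key variable CUST_ID, output dataset COUNTS); substitute your own. Whether it beats a sort will depend on your data and how the records are read afterward.

```sas
/* Hypothetical names: WORK.BIG is the large dataset, CUST_ID the key. */
proc datasets library=work nolist;
  modify big;
  index create cust_id;      /* simple index on a single variable */
quit;

/* BIG is not sorted, so SAS uses the index to honor the BY statement */
/* and returns the records in CUST_ID order.                          */
data counts;
  set big;
  by cust_id;
  if first.cust_id then n = 0;
  n + 1;                     /* sum statement: n is retained across rows */
  if last.cust_id then output;
  keep cust_id n;
run;
```

The index only needs to be built once; later steps that use BY CUST_ID on the same dataset can reuse it instead of re-sorting.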
Hope this helps.
--Andy Arnold
---- "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM> wrote:
> On Mon, 19 Nov 2007 11:00:25 -0500, Wensui Liu <liuwensui@GMAIL.COM> wrote:
>
> >the way i am doing is
> >proc sort data = xxx sortsize = max;.....
> >
> >but i am not very happy with the performance with large data. is there
> >a way to do a faster sort than proc sort?
> >
> >appreciate your insight!
>
> People here have often recommended Syncsort.