Date: Mon, 13 Nov 2006 17:28:22 -0800
Reply-To: Sekhar <ckalisetty@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sekhar <ckalisetty@GMAIL.COM>
Subject: Re: Sorting a huge huge dataset
Content-Type: text/plain; charset="us-ascii"
Here is the info.
I am running SAS 9.1 and yes, these are SAS datasets.
There are other people using the same box.
I don't know about any limits on RAM usage. I will try to find out
tomorrow and let you know.
We have multiple CPUs, there is no SyncSort installed, and these
datasets are local to this box.
David L Cassell wrote:
> ckalisetty@GMAIL.COM wrote back:
> >On Nov 13, 3:05 pm, sseg...@gmail.com wrote:
> > > > I disagree with myself. I was sent an email telling me this was not
> > > > the case. I have now compared proc sort and proc sql run times on a
> > > > randomly generated 1.5 GB data set, and proc sort was faster. I never
> > > > should have believed the person who told me this.
> > >
> > >
> > >
> > > sseg...@gmail.com wrote:
> > > > > Using proc sql with an order by clause should be considerably faster.
> > > > > Proc sort is extremely inefficient in the data step. It also requires
> > > > > a lot of free disk space, the same amount as the data set itself. Use
> > > > > code like this.
> > > > > proc sql;
> > > > >   /* table_name is a placeholder for your data set */
> > > > >   create table table_name as
> > > > >   select *
> > > > >   from table_name
> > > > >   /* to sort by several columns, separate them with commas */
> > > > >   order by acct_number;
> > > > > quit;
> > >
> > > > > There are other suggestions that might work, but I think that if you
> > > > > know proc sql, you should never use proc sort. You could also do a
> > > > > join and not have to sort at all, but a lot of people don't like the sql
> > > > > join in SAS. Other things might work, but this should work as well.
> > > > Sekhar wrote:
> > > > > Hi
> > > > > > I am trying to merge two datasets, one having 1.7 billion records and
> > > > > > the other one having 35 million records. The first dataset has the
> > > > > > account number and the second dataset has the opening date. The merge
> > > > > > is on account number. So far so good. But sorting the first file on
> > > > > > account number is taking almost 8 hours. Are there any alternatives for
> > > > > > doing this, or any other methods for getting the opening date into the
> > > > > > first file? Tag sort, Proc SQL, indexing? Any suggestions? Should I
> > > > > > change the system options like memsize, sortsize, etc.? I am more
> > > > > > concerned about cutting down the time than about space issues.
> >Sorry, I forgot to mention that I am running this under AIX.
> You also forgot to tell us the version of SAS, and whether these
> are SAS data sets or external data sets that get pulled into SAS,
> and a bunch of other stuff. :-)
> My advice still stands. But let me ask a couple *more* questions
> in addition to the above ones (which would really help!):
> Are you sharing this AIX box with other people?
> What limits have the IT people put on RAM and time-slicing
> for any one app running on the OS?
> Do you have 1 CPU or multiple CPUs?
> Are the data sets local, or are they on other machines so you
> have to pull stuff across a network to work with them?
> Do you have SyncSort on this machine?
> David L. Cassell
> mathematical statistician
> Design Pathways
> 3115 NW Norwood Pl.
> Corvallis OR 97330