Date: Mon, 25 Jan 2010 10:09:51 -0800
Reply-To: "Richard A. DeVenezia" <rdevenezia@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Richard A. DeVenezia" <rdevenezia@GMAIL.COM>
Organization: http://groups.google.com
Subject: Re: Multiple data sets based on a variable
On Dec 23 2009, 12:31 pm, dynamicpa...@YAHOO.COM (oloolo) wrote:
> I thought this issue over again and came up with the following solution:
> HoH means we do not need to know a priori how many output datasets to
> name in the DATA statement, so we do not have to read through the WHOLE
> file at once if it is too large to fit into memory (say >2GB on a
> 32-bit machine).
>
> Then we can split the data by reading, say, the first 25% and outputting all
> necessary sub data sets, then process the remaining 75% in 25% increments.
>
> At the final step, we use PROC APPEND to glue these small data sets together.
> The complexity is still linear in the number of observations.
See "Hash based data splitter for limited resources"
http://groups.google.com/group/comp.soft-sys.sas/search?group=comp.soft-sys.sas&q=Hash+based+data+splitter+for+limited+resources
Should probably put all this HoH stuff in sascommunity and just point
to it when needed.
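
A rough sketch of the chunked approach quoted above, for illustration only;
it is not the code from the thread referenced above. The macro names
(split_chunk, split_all), the work._c<id>_<value> and work.out_<value> naming,
and the use of sashelp.class are all assumptions. It also assumes SAS 9.2 or
later (for multidata), a base SAS data set as input, and a character BY
variable whose values are valid as data set name suffixes.

```sas
/* Hypothetical helper: split one slice of &data by the character variable &by. */
%macro split_chunk(data=, by=, first=, last=, id=, out=work.out);
  data _null_;
    if 0 then set &data;               /* put the input variables in the PDV */
    declare hash h;                    /* inner hash: rows for one &by value  */
    declare hash hoh();                /* outer hash: one inner hash per value */
    hoh.defineKey("&by");
    hoh.defineData("&by", 'h');
    hoh.defineDone();
    declare hiter hi('hoh');

    /* Read only this slice of the input. */
    do until (eof);
      set &data (firstobs=&first obs=&last) end=eof;
      if hoh.find() ne 0 then do;      /* first time this value is seen */
        h = _new_ hash(dataset: "&data (obs=0)", multidata: 'y');
        h.defineKey("&by");
        h.defineData(all: 'y');
        h.defineDone();
        hoh.add();
      end;
      h.add();
    end;

    /* Write one part per value and queue a PROC APPEND for it. */
    do while (hi.next() = 0);
      h.output(dataset: cats('work._c', "&id", '_', &by));
      call execute(catx(' ', 'proc append base=', cats("&out", '_', &by),
                        'data=', cats('work._c', "&id", '_', &by), '; run;'));
    end;
    stop;
  run;
%mend split_chunk;

/* Hypothetical driver: process the input in &chunks roughly equal slices. */
%macro split_all(data=, by=, chunks=4, out=work.out);
  %local dsid nobs rc size i first last;
  %let dsid = %sysfunc(open(&data));
  %let nobs = %sysfunc(attrn(&dsid, nlobs));
  %let rc   = %sysfunc(close(&dsid));
  %let size = %sysevalf(&nobs / &chunks, ceil);
  %do i = 1 %to &chunks;
    %let first = %eval((&i - 1) * &size + 1);
    %let last  = %eval(&i * &size);
    %if &first <= &nobs %then %do;
      %split_chunk(data=&data, by=&by, first=&first, last=&last, id=&i, out=&out)
    %end;
  %end;
%mend split_all;

%split_all(data=sashelp.class, by=sex, chunks=4)
```

Each part gets a name unique to its chunk, so the queued PROC APPEND steps do
not depend on exactly when they run relative to the next chunk. Only one slice
is held in memory at a time. The base data sets (work.out_<value> here) should
not exist before the run, or the new rows will be added to what is already
there.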
--
Richard A. DeVenezia
http://www.devenezia.com