Date: Sun, 31 May 1998 15:45:38 -0400
Reply-To: Li Quan <lquan@GARNET.ACNS.FSU.EDU>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Li Quan <lquan@GARNET.ACNS.FSU.EDU>
Subject: Re: Sample, Thanks and Others
In-Reply-To: <199805311640.RAA03422@smtp-relay.power.net.uk>
Content-Type: TEXT/PLAIN; charset=US-ASCII

John:
I ran your program on my data and it took about 2 seconds; Jingren's
program took about 20 seconds. I think this is a case where different
routes lead to the same destination, but one of them is probably more
efficient.
Thanks to both of you for making this rather interesting!
Best,
Quan
***********************************************************************
On Sun, 31 May 1998, John Whittington wrote:
> At 00:17 31/05/98 -0400, Li Quan wrote:
>
> >Jingren's new program works great. The estimates for each model are
> >stacked in the results dataset, which makes the second stage analysis much
> >easier.
> >John's suggestion sounds very plausible (first generate the datasets,
> >then estimate the models with a BY statement), but I haven't had a chance
> >to test it. It would be very interesting to compare the performance of
> >the two in terms of CPU time, etc.
>
> Quan, as I would have expected, there is a dramatic difference in
> performance - although with small datasets it wouldn't be enough to fuss
> about. 'My' method (code below) which also stacks all the results in a
> single dataset (with just one run of PROC REG), can deal with 1000
> observations in about half the time that Jingren's takes to do 60 (all times
> in seconds):
>
> 'My' Method - 60 observations:
> STARTED=61479.03 FINISH=61481.12 ELAPSED=2.09
> 'My' Method - 200 observations:
> STARTED=62415.68 FINISH=62420.68 ELAPSED=5.00
> 'My' Method - 1000 observations:
> STARTED=61657.43 FINISH=61678.85 ELAPSED=21.42
>
> Jingren's Method - 60 observations:
> STARTED=61764.87 FINISH=61802.93 ELAPSED=38.06
> Jingren's Method - 200 observations:
> STARTED=61867.3 FINISH=62022.8 ELAPSED=155.5
> Jingren's Method - 1000 observations:
> ** programme crashes after about 238 iterations because
> of inability to create more output dataset 'handles'
>
> This also illustrates another problem with the 'macro %do loop' approach
> when dealing with many iterations - if one creates a separate output dataset
> for each iteration, and then combines them all at the end, one can run into
> problems - as you can see, my SAS installation got upset after about 238
> iterations. One can avoid that problem by using a PROC APPEND within the
> macro %do loop, thereby just building up a single results dataset one
> observation at a time - but that would make the %do loop considerably slower
> still.
>
> CREATE TEST DATASET:
>
> data test ;
>   do month = 1 to 1000 ;
>     a = ranuni (459274) ;
>     b = ranuni (134563) ;
>     output ;
>   end ;
> run ;
>
> MY METHOD .....
>
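> data _null_ ; t = time() ; call symput('start',t) ; run ; /* for timing */
>
> /* build one stacked dataset holding every 12-observation window */
> data samples (drop = i) ;
>   row = _n_ ;
>   set test nobs = num end = eof ;
>   /* hold the whole input in memory; 10000 must be >= the number of observations */
>   array ar(10000, 3) _temporary_ ;
>   ar(row, 1) = month ; ar(row, 2) = a ; ar(row, 3) = b ;
>   if eof then do ;
>     do run = 1 to num - 11 ;        /* one BY group per window */
>       do i = run to (run + 11) ;    /* the 12 observations in this window */
>         month = ar(i, 1) ; a = ar(i, 2) ; b = ar(i, 3) ;
>         output ;
>       end ;
>     end ;
>   end ;
> run ;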
>
> /* a single PROC REG call fits every window, one BY group at a time */
> proc reg data = samples noprint outest = results ;
>   model month = a b ;
>   by run ;
> run ;
> quit ;
>
> data _null_ ; /* for timing */
> started = &start ;
> finish = time() ;
> elapsed = finish - started ;
> put started= finish= elapsed= ;
> run ;
>
> JINGREN'S METHOD ....
>
> data test ;
>   do month = 1 to 60 ;
>     a = ranuni (459274) ;
>     b = ranuni (134563) ;
>     output ;
>   end ;
> run ;
>
> options nosymbolgen;
> %let n=60; /* this is the total number of obs */
> %let datasets=;
> %let i=1;
>
> %macro doreg;
>   %do %until(&i>&n-12+1);
>
>     /* fit the model on one 12-month window; one output dataset per window */
>     proc reg noprint data=test(where=(&i<=month<=&i+12-1)) outest=res&i;
>       model month = a b ;
>     run;
>     quit;
>
>     %let datasets= &datasets res&i;
>     %let i=%eval(&i+1);
>   %end;
>
>   /* stack all the per-window estimates into one dataset */
>   data results;
>     set &datasets;
>   run;
>
> %mend;
>
> data _null_ ; t = time() ; call symput('start',t) ; /* for timing */
> %doreg;
> data _null_ ; /* for timing */
> started = &start ;
> finish = time() ;
> elapsed = finish - started ;
> put started= finish= elapsed= ;
> run ;
>
>
> Regards,
>
> John
>
> ----------------------------------------------------------------
> Dr John Whittington, Voice: +44 (0) 1296 730225
> Mediscience Services Fax: +44 (0) 1296 738893
> Twyford Manor, Twyford, E-mail: medisci@powernet.com
> Buckingham MK18 4EL, UK mediscience@compuserve.com
> ----------------------------------------------------------------
>
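
John's remark above about using PROC APPEND inside the macro %do loop could
be sketched roughly as follows. This is an untested illustration, not code
from either John or Jingren: the macro name doreg_append, the parameter
width and the scratch dataset onefit are hypothetical, and it assumes the
60-observation test dataset created for Jingren's method. Each window
overwrites the same scratch dataset, so only two output datasets ever
exist, at the cost of an extra procedure call per iteration.

```sas
/* Sketch only: rolling 12-month regressions that append each window's
   estimates to a single dataset instead of keeping one dataset per window.
   The names doreg_append, width and onefit are hypothetical. */
%macro doreg_append(n=60, width=12);
  %local i;

  /* start from a clean slate so repeated runs do not accumulate rows */
  proc datasets library=work nolist nowarn;
    delete results onefit;
  quit;

  %do i = 1 %to %eval(&n - &width + 1);

    /* fit the model on one window; onefit is overwritten on every pass */
    proc reg noprint
             data=test(where=(&i <= month <= %eval(&i + &width - 1)))
             outest=onefit;
      model month = a b;
    run;
    quit;

    /* add this window's row of estimates to the results dataset */
    proc append base=results data=onefit;
    run;

  %end;
%mend doreg_append;

%doreg_append(n=60, width=12)
```

As John expected, this trades the dataset limit for additional per-iteration
overhead, so it would likely be slower than the single BY-group run.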