Date: Tue, 19 Feb 2008 08:13:26 -0500
Reply-To: Arthur Tabachneck <art297@NETSCAPE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Arthur Tabachneck <art297@NETSCAPE.NET>
Subject: Why does retain work faster conditionally?
Content-Type: text/plain; charset=ISO-8859-1
One of our most respected list members wrote me off-line, asking why in
the world I would have suggested wrapping a retain statement within a
condition.
That is, given the following data:
data have;
input lname$ fname$;
do i=1 to 1000000;output;end;
cards;
lname1 fname1
lname2 fname2
;
why write:
data want;
if _n_ eq 1 then do;
retain fname;
end;
set have;
run;
instead of:
data want;
retain fname;
set have;
run;
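As a quick sanity check, here is a minimal sketch (not part of the original test) confirming that both forms put fname first in the output data set. The names want_cond and want_plain are made up for illustration, and it assumes the data set have created above.

```sas
/* Sketch only: want_cond and want_plain are hypothetical names. */
data want_cond;
  if _n_ eq 1 then do;
    retain fname;
  end;
  set have;
run;

data want_plain;
  retain fname;
  set have;
run;

/* VARNUM lists the variables by their position in the data set. */
proc contents data=want_cond varnum;
run;

proc contents data=want_plain varnum;
run;
```

If the two listings show the same variable order, the only thing left to compare between the two steps is run time.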
I know why I provided that solution: it performed better. But I could sure
use some feedback explaining why that would be so.
I initially wrote it correctly and, upon seeing that it ran slower than
Jiann's SQL solution, tried to see whether I could bypass reading the data
(i.e., when _n_ eq 0).
Once I realized that wouldn't be possible, I ran the step as presented.
Someone please explain to me why:
60 data want;
61 if _n_ eq 1 then do;
62 retain fname;
63 end;
64 set a;
65 run;
NOTE: There were 2000000 observations read from the data set WORK.A.
NOTE: The data set WORK.WANT has 2000000 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 1.12 seconds
cpu time 1.12 seconds
takes roughly 22% less time (1.12 vs. 1.43 seconds) than:
56 data want;
57 retain fname;
58 set a;
59 run;
NOTE: There were 2000000 observations read from the data set WORK.A.
NOTE: The data set WORK.WANT has 2000000 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 1.43 seconds
cpu time 1.43 seconds
I ran the tests on a 4-processor Windows 2003 system with 12 GB of RAM
and SAS 9.1.3. It was a holiday, so I was the only one using the
computer, and I re-ran the tests 3 times with the same results.
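For anyone who wants to repeat the comparison, the following is a rough sketch of how the two steps could be re-run back to back. The macro name timeit and its reps parameter are hypothetical, and it assumes the data set have from the top of this post. Alternating the two steps inside one loop may help separate a real difference from caching or other run-to-run noise.

```sas
options fullstimer;  /* request more detailed timing notes in the log */

%macro timeit(reps=3);  /* hypothetical helper, for illustration only */
  %local r;
  %do r = 1 %to &reps;

    /* conditional form */
    data want;
      if _n_ eq 1 then do;
        retain fname;
      end;
      set have;
    run;

    /* unconditional form */
    data want;
      retain fname;
      set have;
    run;

  %end;
%mend timeit;

%timeit(reps=3)
```

The timings still have to be read from the log notes after each step; the sketch only automates the repetition.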
Art
--------
On Mon, 18 Feb 2008 23:21:23 -0500, Arthur Tabachneck
<art297@NETSCAPE.NET> wrote:
>Miguel,
>
>As Jiann indicated, you can do what you want with proc sql. However, you
>can also accomplish the same thing in a data step. For example,
>
>data have;
> input lname$ fname$;
> do i=1 to 1000000;output;end;
> cards;
> lname1 fname1
> lname2 fname2
> ;
>
>data want;
> if _n_ eq 1 then do;
> retain fname;
> end;
> set have;
>run;
>
>HTH,
>Art
>---------
>On Tue, 19 Feb 2008 02:55:04 +0000, Miguel de la Hoz <miguel_hoz@YAHOO.ES>
>wrote:
>
>>My dataset currently has its variables in the following order:
>
># variable
>1 lname
>2 fname
>
>I am trying to export it to Excel, but it keeps that order. I would
>like to be able to write
>
># variable
>1 fname
>2 lname
>
>This is only an example; my dataset contains around 20 fields.
>
>Thanks.
>
>MDH.