Date: Wed, 15 Mar 2006 17:48:35 +0000
Reply-To: toby dunn <tobydunn@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: toby dunn <tobydunn@HOTMAIL.COM>
Subject: Re: SQL vs Data Step (was Sum a column)
In-Reply-To: <200603151651.k2FGmB8m020418@mailgw.cc.uga.edu>
Content-Type: text/plain; format=flowed
Art ,
Your welcome. Its a Howard interleaving a data set back on itself thingy.
As for the SQL taking longer to run, well youd have to ask Howard or Sig as
the mysteries of the SQL optimizer still confuse me. Perhaps there is some
way to rewrite the SQL to make it run faster, but I'll be danged if I know
what it is.
Toby Dunn
From: Arthur Tabachneck <art297@NETSCAPE.NET>
Reply-To: Arthur Tabachneck <art297@NETSCAPE.NET>
To: SAS-L@LISTSERV.UGA.EDU
Subject: SQL vs Data Step (was Sum a column)
Date: Wed, 15 Mar 2006 11:51:54 -0500
Toby,
First, thanks for teaching me something new! After seeing that your data
step solution actually worked, I modified your example, slightly, and
increased the number of records to 5,000,000.
To my surprise, the data step approach had real times of between 3.15 and
3.17 seconds, while the SQL approach had times ranging between 5.48 and
15.37 seconds.
The code I tested was:
data temp;
input id $ empl;
do i=1 to 1000000;
output;
end;
datalines;
1 1
2 1
3 2
4 2
5 1
;
run;
proc sql;
create table Result as
select *, sum(empl) as sum
from temp;
quit;
data Result;
set temp (in=a) temp (in=b);
if a then sum+empl;
else output;
run;
Art
-------
On Wed, 15 Mar 2006 15:56:46 +0000, toby dunn <tobydunn@HOTMAIL.COM> wrote:
>RasmusJo ,
>
>Well SQL would be the easiest solution Jiann I think did that one and for
>those who are SQL adverse there us always a data step solution:
>
>data one ;
>infile cards ;
>input id empl ;
>cards ;
>1 1
>2 1
>3 2
>4 2
>5 1
>run ;
>
>data two ;
>set one ( in = First )
> one ( in = Second ) ;
>
>if First then do ;
> Sum + Empl ;
>end ;
>
>if Second then output ;
>
>run ;
>
>
>proc print
>data = two ;
>run ;
>
>
>
>Toby Dunn
>
>
>
>
>
>From: RasmusJo <rjo.cebr@CBS.DK>
>Reply-To: RasmusJo <rjo.cebr@CBS.DK>
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Sum a column
>Date: Wed, 15 Mar 2006 07:22:42 -0800
>
>Hi comp.soft-sys.sas,
>
>I'm having a brain melt-down as I try to sum a column/variable.
>
>My problem is that I would like to sum the variables empl, see below:
>
>id empl
>1 1
>2 1
>3 2
>4 2
>5 1
>
>such that I get the variable 'sum'
>
>id empl sum
>1 1 7
>2 1 7
>3 2 7
>4 2 7
>5 1 7
>
>So far I only managed to create an accumulating variable whose value is
>7 for the last observation.However, I would like to create a variable
>that is 7 for all observations.
>
>Can anyone help`?
>
>Thanks,
>Rasmus
|