Date: Tue, 23 Feb 2010 21:47:18 -0600
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: an efficiency issue in proc sql
In-Reply-To: <201002240214.o1NNEmkb028153@malibu.cc.uga.edu>
Content-Type: text/plain; charset=ISO-8859-1
Don't use SQL to do this; SAS will do it well in the data step:
data want;
set mydata;
by date;
totalret+return;
run;
In a more complicated situation, do whatever calculation you're doing in a
similar fashion, using RETAIN and whatever conditions you need. You aren't
going to get the best answer without posting to the list the specifics,
though.
-Joe
On Tue, Feb 23, 2010 at 8:14 PM, Tony Wang <wangtong@usc.edu> wrote:
> Hi,all
> I am looking for a more efficient method to do the following:
>
> I have a data set (called mydata) with columns date and return
> and I would like to create a new column that contains the sum of past k
> returns
> for the simplest case (k=2):
> date return
> 2000/1/1 1
> 2000/1/2 1.1
> 2000/1/3 .9
> then the new data set should be
> date return totalret
> 2000/1/2 1.1 2.1 (namely 1+1.1)
> 2000/1/3 .9 2
> (note that the sum operation and the range of the data in sum are only
> provided as an simple example, in my work, I actually have to perform more
> complicated computation and over some range that satisfies more conditions,
> thus a method only works for this particular case is not quite helpful)
>
> I used the following sql to do this
>
> proc sql;
> create table newdata
> as select distinct a.date, a.return, sum(b.return) as totalret
> from mydata as a , mydata as b
> where 0<=intck('day',b.date,a.date)<=1
> having count(b.return)=2;
> quit;
>
>
> This does the job, but I realized that it probably does it twice (k times
> in
> general) !
> Because if I drop 'distinct', I see two copies of the desired result.
> I am guessing that ,with 'distinct' ,the program still compute everything
> twice, then delete redundent copies.
>
> As you can image, this is really wasteful when k is large.
>
>
> Is there anyway to get around this ? Any help would be appreciated. thanks
> .
>
|