| Date: | Wed, 3 Feb 2010 11:20:38 -0600 |
| Reply-To: | Joe Matise <snoopy369@GMAIL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Joe Matise <snoopy369@GMAIL.COM> |
| Subject: | Re: lag calculation macro |
|
| In-Reply-To: | <201002031647.o13GaGnq005735@malibu.cc.uga.edu> |
| Content-Type: | text/plain; charset=ISO-8859-1 |
|---|
Art's suggestion is the right one IMO for your specific problem.
As a general critique, most of the time when a "utility" macro contains a
full datastep, it's the wrong answer. Utility macros should generally be
less than one datastep; ie, called inside a datastep. Otherwise you're just
using the macro to do data step looping, which is much more cheaply handled
with data step functions and statements.
For example, you could do
data want;
set test;
n=_n_;
if missing(x) then do;
do until (not missing(x));
n=n-1;
set test(keep=x) point=n;
end;
end;
run;
which is nearly identical to your macro in its approach, but uses data step
looping instead of the macro, and will be substantially more efficient.
A utility macro to accomplish this would be something like this:
%macro rollback(var);
%do i = 1 %to 8;
&var=ifn(&var,&var,0,lag&i(&var));
%end;
%mend rollback;
data want;
set test;
%rollback(x);
run;
Note how it is called from inside a data step. I don't recommend this
particular solution as there are better ways [in particular, this executes
many statements needlessly and has to be hardcoded with the maximum possible
number of lag values] but it illustrates my point.
-Joe
On Wed, Feb 3, 2010 at 10:47 AM, Robert Feyerharm <robertf@health.ok.gov>wrote:
> Has anyone else struggled with the lag function in SAS? The lag function
> often doesn't work in an intuitive manner in conditional statements, or if
> missing values are involved.
>
> Anyways, I wrote a short macro code to handle lag calculations which I'd
> like to share. It works fairly well, although I may be reinventing the
> wheel. A colleague was working with a large dataset containing missing
> values, and wanted to impute the last known value into the missing values
> for a particular field. For example, given the following test dataset, the
> macro fills in the missing values of x with the last known value of x
> (1,2, or 3):
>
> data test;
> input x y;
> datalines;
> 1 1
> . 2
> . 3
> . 4
> . 5
> 2 6
> . 7
> . 8
> 3 9
> . 10
> . 11
> . 12
> . 13
> . 14
> ;
> run;
>
> /* Let num = no. of records in target dataset. */
> %let num=14;
>
> %macro lagvar(num);
> %do i = 1 %to #
> data test;
> set test;
> z=lag(x);
> if x=. then x=z;
> run;
> %end;
>
> data test;
> set test;
> drop z;
> run;
> %mend;
>
> %lagvar(&num)
>
> I'm curious if there is a shorter program that can handle the same
> operations without resorting to a macro?
>
> Robert Feyerharm
> Oklahoma State Department of Health
>
|