Date: Fri, 21 Mar 2008 08:30:39 -0500
Reply-To: sas 9 bi user <sas9bi@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: sas 9 bi user <sas9bi@GMAIL.COM>
Subject: Re: Confused about lag1()
In-Reply-To: <941871A13165C2418EC144ACB212BDB04E1542@dshsmxoly1504g.dshs.wa.lcl>
Content-Type: text/plain; charset=ISO-8859-1
Looking at Dan's solution to the lag question below, is the retain statement
necessary?
I am new to SAS, and I ran the code below with and without the retain; it worked
fine both ways. Is the retain used more as a best practice, or am I missing something?
Thanks, Andy
data b(drop=lag_month);
  set a;
  by id;
  if first.id then timewithkids = 0;
  retain timewithkids;
  lag_month = lag(month);
  if lag(gotkids) EQ 1 and not first.id
    then timewithkids + (month - lag_month);
run;
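
The with/without comparison can be sketched roughly as follows. This assumes data set a from Jeremy's post below is already sorted by id; the data set names b_retain and b_noretain are made up for this test. As far as I can tell, a sum statement (variable + expression) retains its variable automatically, which would explain why the two versions agree.

```sas
/* Version with the explicit RETAIN, as in Dan's code */
data b_retain(drop=lag_month);
  set a;
  by id;
  if first.id then timewithkids = 0;
  retain timewithkids;
  lag_month = lag(month);
  if lag(gotkids) EQ 1 and not first.id
    then timewithkids + (month - lag_month);
run;

/* Same step without RETAIN; the sum statement below is the only
   thing carrying timewithkids from one row to the next */
data b_noretain(drop=lag_month);
  set a;
  by id;
  if first.id then timewithkids = 0;
  lag_month = lag(month);
  if lag(gotkids) EQ 1 and not first.id
    then timewithkids + (month - lag_month);
run;

/* Report any differences between the two results */
proc compare base=b_retain compare=b_noretain;
run;
```

If the accumulation were written as an ordinary assignment (timewithkids = timewithkids + ...) instead of a sum statement, the retain would presumably matter.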
On 3/20/08, Nordlund, Dan (DSHS/RDA) <NordlDJ@dshs.wa.gov> wrote:
>
> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On
> > Behalf Of Jeremy Miles
> > Sent: Thursday, March 20, 2008 5:39 PM
> > To: SAS-L@LISTSERV.UGA.EDU
> > Subject: Confused about lag1()
> >
> > Hi All,
> >
> > I'm trying to use the lag1 (or lag) function, I'm getting in a tangle,
> > and I don't really understand why.
> >
> > My data are long (i.e. a person occupies several rows). People are
> > assessed periodically, and asked, on each occasion, if they have
> > children.
> >
> > I want to know how long they've had children for. However, I'm not
> > getting the results I expect from lag()
> >
> >
> > data a;
> > input id month gotkids ;
> > cards;
> > 1 0 0
> > 1 3 1
> > 1 6 1
> > 1 9 1
> > 1 12 1
> > 2 0 0
> > 2 6 0
> > 2 9 1
> > 2 12 1
> > 3 0 0
> > 3 3 1
> > 3 6 1
> > 3 18 1
> > 3 21 1
> > 3 24 1
> > 4 0 0
> > 4 12 0
> > 4 15 0
> > 4 18 1
> > ;
> > run;
> > proc print data=a;
> > run;
> >
> > data b; set a;
> > timewithkids = 0;
> > if lag1(gotkids) EQ 1 and lag1(id) EQ id then
> > timewithkids = month -
> > lag1(month) + lag1(timewithkids);
> > /*reasoning: month is now, lag1(month) is previous measurement
> > occasion, lag1(timewithkids) is how long they'd already spent*/
> > run;
> > proc print data=b;
> > run;
> >
> > Which gives:
> >
> >
> > Obs  id  month  gotkids  timewithkids
> >
> >   1   1      0        0             0
> >   2   1      3        1             0
> >   3   1      6        1             .
> >   4   1      9        1             3
> >   5   1     12        1             3
> >   6   2      0        0             0
> >   7   2      6        0             0
> >   8   2      9        1             0
> >   9   2     12        1             0
> >  10   3      0        0             0
> >  11   3      3        1             0
> >  12   3      6        1            -6
> >  13   3     18        1            12
> >  14   3     21        1             3
> >  15   3     24        1             3
> >  16   4      0        0             0
> >  17   4     12        0             0
> >  18   4     15        0             0
> >  19   4     18        1             0
> >
> > This doesn't look anything like I expect. But I don't understand why.
> >
> > Thanks,
> >
> > Jeremy
> >
>
> Jeremy,
>
> You are probably confused about how lag() works. Lag() doesn't look at
> the previous record to get the value to return. The lag1() function causes
> SAS to set up a first-in/first-out queue at compile time. Then at run time,
> when the lag1() function is executed, the value at the head of the queue is
> returned and the current value of the variable in the lag() function is
> pushed on to the tail of the queue.
>
> So each time lag1() is executed it returns the value that was pushed on to
> the queue the last time it was executed. If you execute lag() every time
> through the data step, it acts as if you are getting the value from the
> previous record.
>
> However, in your program the statement
>
> if lag1(gotkids) EQ 1 and lag1(id) EQ id then
>     timewithkids = month - lag1(month) + lag1(timewithkids);
>
> executes lag1(month) and lag1(timewithkids) conditionally, because both sit
> in the THEN clause. So the value used in the computation is not the value of
> timewithkids from the previous record; it is the value of timewithkids the
> last time the IF condition was true. It is usually not a good idea to
> execute lag() conditionally.
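>
> To illustrate the point, here is a small sketch. The data set demo and the
> variables x, lag_always, and lag_cond are made up for this example and are
> not part of your program.
>
> ```sas
> data demo;
>   input x;
>   /* executed on every iteration: behaves like "previous record" */
>   lag_always = lag(x);
>   /* executed only on even iterations: this call has its own queue,
>      which is updated only when the call actually runs */
>   if mod(_n_, 2) = 0 then lag_cond = lag(x);
>   cards;
> 1
> 2
> 3
> 4
> 5
> 6
> ;
> run;
>
> proc print data=demo;
> run;
> ```
>
> On the fourth row, lag_always should be 3, while lag_cond should be 2, the
> value pushed the last time that particular lag() call ran.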
>
> In addition to your confusion about lag(), you are resetting timewithkids
> to 0 every time through the data step loop.
>
> I would code something like this:
>
>
> data b(drop=lag_month);
>   set a;
>   by id;
>
>   if first.id then timewithkids = 0;
>   retain timewithkids;
>
>   lag_month = lag(month);
>
>   if lag(gotkids) EQ 1 and not first.id
>     then timewithkids + (month - lag_month);
> run;
>
> Hope this is helpful,
>
> Dan
>
>
> Daniel J. Nordlund
> Research and Data Analysis
> Washington State Department of Social and Health Services
> Olympia, WA 98504-5204
>