Date: Sun, 3 May 2009 20:10:31 -0700
Reply-To: barry.brian.barrios@GMAIL.COM
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: barry.brian.barrios@GMAIL.COM
Organization: http://groups.google.com
Subject: Re: Running Code, Trying to understand how it works.
Content-Type: text/plain; charset=ISO-8859-1
On May 3, 5:45 pm, djnordl...@VERIZON.NET (Daniel Nordlund) wrote:
> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SA...@LISTSERV.UGA.EDU] On
> > Behalf Of Arthur Tabachneck
> > Sent: Sunday, May 03, 2009 1:11 PM
> > To: SA...@LISTSERV.UGA.EDU
> > Subject: Re: Running Code, Trying to understand how it works.
>
> > Priyanka,
>
> > As shown in the following code, I think that each call of the lag
> > function simply establishes a queue with the first value being missing,
> > and each subsequent value being whatever value the variable had in the
> > PDV at the time the call was made.
>
> > If there are two uses of the lag function, there will be two independent
> > queues, each starting with a missing value and then containing the value
> > from each subsequent time the call is encountered.
>
> > data test5;
> >   do i=1234 to 1240;
> >   end;
> >   do i=1 to 10;
> >     l_i=lag(i);
> >     holdi=i;
> >     i=30-i;
> >     l_from30_i=lag(i);
> >     i=holdi;
> >     output;
> >   end;
> > run;
>
> > results in the following file:
>
> > Obs    i    l_i    holdi    l_from30_i
> >
> >   1    1      .        1             .
> >   2    2      1        2            29
> >   3    3      2        3            28
> >   4    4      3        4            27
> >   5    5      4        5            26
> >   6    6      5        6            25
> >   7    7      6        7            24
> >   8    8      7        8            23
> >   9    9      8        9            22
> >  10   10      9       10            21
>
> > Art
>
> Art is exactly correct here. As he points out, for each separate mention of
> the lag() function in a data step, a separate queue is set up at compile
> time. Lag() or lag1() sets up a queue of length 1, lag2() sets up a queue
> of length 2, etc. Lag is maybe an unfortunate name because it causes most
> people to think of a value from the immediately previous record in a data
> step. In the following data step, lag() will work the way people might
> think it should.
>
> Data want;
> set have;
> lag_x = lag(x);
> Run;
>
> Here lag_x will always have the value of x from the previous record, but it
> gets it from the queue, not by looking at the previous record.
>
> In Art's code above, 2 different queues are set up, one for each mention of
> lag(), and initialized to missing. When the first lag() function is
> executed,
>
> l_i=lag(i);
>
> the current value of the first queue is returned (this will be missing the very
> first time) and then the current value of i is pushed onto the queue. The
> next time this particular lag() is executed, the value returned will be
> whatever was pushed onto the queue the last time it was executed. Then the
> current value of I is pushed onto the queue, overwriting what was there
> previously. Notice, the value is returned from the queue first, then the
> new value is pushed onto the queue. This is why if one uses lag()
> conditionally, it doesn't return the value of a variable from the
> immediately preceding record. It is not looking at the previous record, it
> is returning whatever value is currently in the queue. The queue which is
> set up is a first-in/first-out queue.
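>
> To make the conditional case concrete, here is a minimal sketch (the data
> set name cond_lag and the variable lag_even are hypothetical names chosen
> for illustration). Because lag(i) is executed only when i is even, its
> queue only ever sees even values of i:
>
> ```sas
> data cond_lag;
>   do i = 1 to 6;
>     lag_even = .;   /* reset so rows where lag() is skipped stay missing */
>     /* lag(i) executes only for even i, so only even values enter its queue */
>     if mod(i, 2) = 0 then lag_even = lag(i);
>     output;
>   end;
> run;
> ```
>
> At i=4 this should return 2 (the value pushed the last time the function
> was executed), not 3, and at i=6 it should return 4. At i=2 the result is
> still missing, because that is the first time the queue is read.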
>
> If one uses, say, lag2() in a program, a queue is set up with length 2, both
> locations initialized to missing.
>
> Data test;
>   do i = 1 to 10;
>     lag2_i = lag2(i);
>     output;
>   end;
> Run;
>
> The first time lag2(i) is executed a missing value is returned from the head
> of the queue, and 1 is pushed onto the tail of the queue. The second time,
> another missing value is returned (remember, the length-2 queue was set up
> with both locations initialized to missing) and 2 is pushed onto the tail of
> the queue. The third time through the loop, lag2(i) will return the value 1
> from the head of the queue and push the value 3 onto the tail of the queue.
> So lag always returns a value from the head of the queue and pushes the
> current value of the argument to lag() onto the tail of the queue.
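>
> The same return-then-push behavior shows up when a single lag() is executed
> more than once for the same value. A minimal sketch (the data set name
> twice and the variable pass are hypothetical): the inner loop executes one
> lag(i) twice per value of i, so the second execution returns what the first
> one just pushed.
>
> ```sas
> data twice;
>   do i = 1 to 3;
>     do pass = 1 to 2;
>       /* one lag() in the code means one queue, advanced on every execution */
>       lag_i = lag(i);
>       output;
>     end;
>   end;
> run;
> ```
>
> For i=1 this should give lag_i = . on pass 1 and 1 on pass 2; for i=2 it
> should give 1 and then 2; for i=3, 2 and then 3. The queue advances once
> per execution of the function, not once per record or per value of i.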
>
> Hope this is helpful,
>
> Dan
>
> Daniel Nordlund
> Bothell, WA USA
Thanks for the great explanation. Now I was trying to apply your
explanation to my problem.
libname home '/home/mit/bcubeb3';
data home.one;
input date permno x y z w;
datalines;
. 4 . . . .
1 4 . . . .
2 4 10 3 3990 .
3 4 . . . 3680
4 4 . . . .
5 4 . . . .
6 4 . . . .
7 4 . . . .
8 4 . . . .
9 4 . . . .
10 4 . . . 3680
11 4 . . . .
12 4 . . . .
13 4 . . . .
14 4 . . . .
15 4 . . . .
16 4 . . . 3793
17 4 . . . .
18 4 . . . .
19 4 . . . .
20 5 . . . 3843
21 5 . . . .
22 5 20 2 4000 .
23 5 . . . .
24 5 . . . .
;
run;
data home.two;
  set home.one;
  by permno date;
  i=0;
  do while(i<1);
    lx=lag(x);
    ly=lag(y);
    lz=lag(z);
    lw=lag(w);
    if permno=lag(permno) and x=. then x=lx;
    if permno=lag(permno) and y=. then y=ly;
    if permno=lag(permno) and z=. then z=lz;
    if permno=lag(permno) and w=. then w=lw;
    i+1;
  end;
run;
data home.three;
  set home.one;
  by permno date;
  i=0;
  do while(i<2);
    lx=lag(x);
    ly=lag(y);
    lz=lag(z);
    lw=lag(w);
    if permno=lag(permno) and x=. then x=lx;
    if permno=lag(permno) and y=. then y=ly;
    if permno=lag(permno) and z=. then z=lz;
    if permno=lag(permno) and w=. then w=lw;
    i+1;
  end;
run;
How do you explain how data set Three was created? I looked at the
adjacent values, but I am still not sure how the explanation of queues
relates to what I see in data set Three.
-Barry