Date: Mon, 15 Aug 2011 17:17:59 -0500
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: Timing of lag() function
In-Reply-To: <B90B817A9BBB904AAAD7EC321C01596745B740@USFCH-MAIL1.lewin.com>
Content-Type: text/plain; charset=ISO-8859-1
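I think it's because lag(new_eff_date) never executes at a point where
new_eff_date has a value. LAG just maintains a queue: each call returns the
value stored by the previous call and then stores the current one. In your
code, new_eff_date is not retained, so it is reset to missing at the top of
every iteration, and the LAG call comes before any assignment in the ELSE
block; so it only ever puts missing values into said queue.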
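
To make the queue behaviour concrete, here is a small sketch that is not taken from your code: the dataset lag_demo and the variables v, every and cond are made up for illustration. It compares a LAG that runs on every pass with one that runs only on some passes.

```sas
/* Illustrative only: lag_demo, v, every and cond are hypothetical names. */
data lag_demo;
    do i = 1 to 5;
        v = i * 10;
        every = lag(v);            /* executes on every pass of the loop  */
        if mod(i, 2) = 0 then
            cond = lag(v);         /* executes only when i is even        */
        else
            cond = .;              /* LAG is skipped, so nothing is queued */
        output;
    end;
run;

proc print data=lag_demo noobs;
run;
```

Here every holds the value of v from the previous pass, while cond holds the value of v from the last pass on which that particular LAG call ran: at i = 4 it is 20 (stored at i = 2), not 30, and at i = 2 it is missing because that was the first call.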
In this case I think you should use RETAIN instead of LAG; that's really
what you're trying to do (keep the value of new_eff_date from observation to
observation). You don't need LAG alongside RETAIN at all - just use RETAIN
and you will have the last record's new_eff_date immediately at hand (and
can then go to work adjusting it as needed). The reason that, with RETAIN
added, W is nonmissing on #3 and #4 but missing on #2 is that #2 is the
first time the LAG statement executes (it was skipped on record #1), so
there is nothing in the queue for it to return yet. To get a value on #2,
the LAG would have to execute on every record, after new_eff_date has been
assigned; but again, I think just RETAIN with no LAG at all for
new_eff_date is fine.
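
A minimal sketch of the RETAIN-only approach, under two assumptions: coverage3 is already sorted by individual_id and eff_date, and count_index numbers the records within each individual_id starting at 1 (so the two "count_index" branches in your code both amount to "keep the previous record's new_eff_date"). The names prev_end and prev_cust are made up here; they stand in for your y and z.

```sas
data coverage3_eff;
    set coverage3;
    by individual_id;

    /* new_eff_date keeps its value from the previous record */
    retain new_eff_date;
    format new_eff_date prev_end date9.;

    /* These LAG calls run on every record, so each queue always
       holds the previous record's value. */
    prev_end  = lag(end_date);
    prev_cust = lag(cust_id);

    if first.individual_id then new_eff_date = eff_date;
    else if eff_date - prev_end >= 90 or cust_id ne prev_cust then
        new_eff_date = eff_date;
    /* Otherwise do nothing: the retained new_eff_date carries forward. */
run;
```

With your sample data this should give 01Jan2009 as new_eff_date on all four records of 39030981 and 10Oct2008 on the single record of 39121557.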
-Joe
On Mon, Aug 15, 2011 at 5:10 PM, Kirby, Ted <ted.kirby@lewin.com> wrote:
> With the following dataset:
>
> data coverage3;
>     input individual_id :$8. Eff_Date :date9. end_date :date9. Cust_ID :$9.
>           count_index;
>     format Eff_Date end_date date9.;
>     datalines;
> 39030981 01Jan2009 30Apr2009 000192961 1
> 39030981 01May2009 31May2009 000192961 2
> 39030981 01Jun2009 30Sep2009 000192961 3
> 39030981 01Oct2009 31Dec2009 000192961 4
> 39121557 10Oct2008 30Nov2008 000189496 1
> ;
> run;
>
> and the following code:
>
> proc sort data=coverage3; by individual_id Eff_date; run;
>
> /* The data are sorted in the INPUT data, but run the PROC SORT so that
> SAS knows it is sorted and we can use the BY statement below. */
>
> data coverage3_eff;
>     set coverage3;
>     by individual_id;
>
>     x = lag(eff_date);
>     y = lag(end_date);
>     z = lag(cust_id);
>
>     if first.individual_id then new_eff_date = eff_date;
>     else do;
>         w = lag(new_eff_date);
>         if eff_date - y >= 90 then new_eff_date = eff_date;
>         if eff_date - y < 90 and cust_id ^= z then new_eff_date = eff_date;
>         if eff_date - y < 90 and cust_id = z and count_index <= 2 then
>             new_eff_date = x;
>         if eff_date - y < 90 and cust_id = z and count_index > 2 then
>             new_eff_date = w;
>     end;
>
>     format new_eff_date x y w date9.;
> run;
>
>
>
> Why is the variable "w" missing for all observations? The
> "new_eff_date" variable was assigned a value on the first pass through
> the data step (by the "if first.individual_id ..." statement), so I
> would have thought that subsequent observations would have had a value
> for "w" (especially the 2nd observation).
>
>
>
> This happens even if "w" is defined outside of the conditional IF in the
> same block of code as the variables "x" "y" and "z" are defined.
>
>
>
> If I add a "RETAIN new_eff_date;" statement to the code above then "w"
> has a value for the 3rd and 4th observations, but not the 2nd or 5th
> observation. This is fine for the 5th observation, since it is the
> beginning of the new "individual_id" block within the data. However, I
> want there to be a value for "w" in the 2nd observation.
>
>
>
> In all the variations of the code above, all of the "lag" variables "x"
> "y" and "z" have non-missing values. Only "w" has missing values. How
> can I get the "w" in the 2nd observation to have the value of the
> "new_eff_date" from the first observation?
>
>