LISTSERV at the University of Georgia
Date:         Mon, 15 Aug 2011 17:17:59 -0500
Reply-To:     Joe Matise <snoopy369@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Joe Matise <snoopy369@GMAIL.COM>
Subject:      Re: Timing of lag() function
Comments: To: "Kirby, Ted" <ted.kirby@lewin.com>
In-Reply-To:  <B90B817A9BBB904AAAD7EC321C01596745B740@USFCH-MAIL1.lewin.com>
Content-Type: text/plain; charset=ISO-8859-1

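I think it's because lag(new_eff_date) never executes at a point where new_eff_date has a value. LAG just maintains a queue: each call pushes the argument's current value onto the queue and returns the value stored by the previous call. In your code, new_eff_date is still missing every time the LAG call runs, so only missing values ever go into that queue.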
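To make the queue behavior concrete, here is a minimal sketch on a made-up dataset (lag_demo, n, prev_even and prev_any are illustrative names, not from your code). It compares a LAG call that runs only on some records with one that runs on every record.

```sas
/* Hypothetical toy data: n = 1..6 */
data lag_demo;
  input n;

  /* This LAG executes on even rows only, so its queue only ever sees
     even values of n. On odd rows prev_even is simply left missing. */
  if mod(n, 2) = 0 then prev_even = lag(n);

  /* This LAG executes on every row, so it returns the previous row's n. */
  prev_any = lag(n);

  datalines;
1
2
3
4
5
6
;
run;

proc print data=lag_demo; run;
```

For n=2 the conditional call runs for the first time and returns missing; for n=4 it returns 2, and for n=6 it returns 4. prev_any is missing on the first row and n-1 afterwards. The same thing happens to "w" in your step: the value it returns depends on when the call last ran, not on the previous observation.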

I think in this case you should use RETAIN instead of LAG; that's really what you're trying to do (keep the value of new_eff_date from observation to observation). You don't need LAG alongside RETAIN at all: with RETAIN alone, the last record's new_eff_date is immediately at hand, and you can then adjust it as needed.

The reason w is missing on #2 but nonmissing on #3 and #4 (once you add RETAIN) is that #2 is the first time execution reaches the LAG statement, so the queue has no value for new_eff_date yet; it didn't hit the LAG statement on record #1. For LAG to see every record, you'd need to put it outside the IF statements entirely (where 'x' is defined, for example), and even then it returns whatever new_eff_date held at the previous call, not what was assigned later in that step. So again, I think just RETAIN with no LAG at all for new_eff_date is fine.
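Something along these lines is what I have in mind. Treat it as an untested sketch built on your dataset and variable names, not a drop-in fix: it assumes new_eff_date does not exist in coverage3 (otherwise SET would overwrite the retained value), and I have folded your four IF statements into an IF/ELSE IF chain, which should be equivalent as long as y is nonmissing inside the ELSE block.

```sas
data coverage3_eff;
  set coverage3;
  by individual_id;

  /* Keep new_eff_date from one observation to the next */
  retain new_eff_date;

  /* These LAGs run on every record, so each returns the previous record's value */
  x = lag(eff_date);
  y = lag(end_date);
  z = lag(cust_id);

  /* Before any assignment in this iteration, new_eff_date still holds
     the value computed for the previous observation */
  w = new_eff_date;

  if first.individual_id then do;
    w = .;                      /* no previous record within this individual */
    new_eff_date = eff_date;
  end;
  else do;
    if eff_date - y >= 90 then new_eff_date = eff_date;
    else if cust_id ^= z then new_eff_date = eff_date;
    else if count_index <= 2 then new_eff_date = x;
    else new_eff_date = w;
  end;

  format new_eff_date x y w date9.;
run;
```

With this layout "w" on the 2nd observation carries the new_eff_date from the 1st, and it is reset to missing at the start of each individual_id so nothing leaks across BY groups.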

-Joe

On Mon, Aug 15, 2011 at 5:10 PM, Kirby, Ted <ted.kirby@lewin.com> wrote:

> With the following dataset:
>
> data coverage3;
> input individual_id :$8. Eff_Date :date9. end_date :date9. Cust_ID :$9. count_index;
> format Eff_Date end_date date9.;
> datalines;
> 39030981 01Jan2009 30Apr2009 000192961 1
> 39030981 01May2009 31May2009 000192961 2
> 39030981 01Jun2009 30Sep2009 000192961 3
> 39030981 01Oct2009 31Dec2009 000192961 4
> 39121557 10Oct2008 30Nov2008 000189496 1
> ;
> run;
>
> and the following code:
>
> proc sort data=coverage3; by individual_id Eff_date; run;
> /* The data are sorted in the INPUT data, but run the PROC SORT so that SAS knows it is sorted and we can use the BY statement below. */
> data coverage3_eff;
> set coverage3;
> by individual_id;
>
> x = lag(eff_date);
> y = lag(end_date);
> z = lag(cust_id);
> if first.individual_id then new_eff_date = eff_date;
> else do;
> w = lag(new_eff_date);
> if eff_date - y >= 90 then new_eff_date = eff_date;
> if eff_date - y < 90 and cust_id ^= z then new_eff_date = eff_date;
> if eff_date - y < 90 and cust_id = z and count_index <= 2 then new_eff_date = x;
> if eff_date - y < 90 and cust_id = z and count_index > 2 then new_eff_date = w;
> end;
> format new_eff_date x y w date9.;
> run;
>
> Why is the variable "w" missing for all observations? The
> "new_eff_date" variable was assigned a value with the first run through
> the data statement (with the "if first. Individual_id . . . "
> statement), so I would have thought that subsequent observations would
> have had a value for "w" (especially the 2nd observation).
>
> This happens even if "w" is defined outside of the conditional IF in the
> same block of code as the variables "x" "y" and "z" are defined.
>
> If I add a "RETAIN new_eff_date;" statement to the code above then "w"
> has a value for the 3rd and 4th observations, but not the 2nd or 5th
> observation. This is fine for the 5th observation, since it is the
> beginning of the new "individual_id" block within the data. However, I
> want there to be a value for "w" in the 2nd observation.
>
> In all the variations of the code above, all of the "lag" variables "x"
> "y" and "z" have non-missing values. Only "w" has missing values. How
> can I get the "w" in the 2nd observation to have the value of the
> "new_eff_date" from the first observation?

