Date: Thu, 23 Jun 2005 16:30:18 -0400
Reply-To: diskin@alum.rpi.edu
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dennis Diskin <ddiskin@GMAIL.COM>
Subject: Re: Request for kibbitzing: time period gap detection code
In-Reply-To: <8AD8F86B3312F24CB432CEDDA71889F21D56E5@ex06.GHCMASTER.GHC.ORG>
Content-Type: text/plain; charset=ISO-8859-1
Roy,
A well-posed question with good sample data. Here's an approach using an
array. I'm not sure how you want to treat the zero month (the index month
itself). I'm ignoring it, but you could adjust the code to count it.
HTH, Dennis Diskin
data insufficientlyenrolled(keep=id indexdate);
  set enroll_periods;
  by ID INDEXDATE ENROLLDATE;

  /* one flag per month offset from the index date; */
  /* _temporary_ array elements are retained across records */
  array flags (-&PreIndexEnrolledMonths:&PostIndexEnrolledMonths) _temporary_;

  /* month offset of this enrollment record relative to the index date */
  i = intck("MONTH", indexdate, EnrollDate);
  if -&PreIndexEnrolledMonths <= i <= &PostIndexEnrolledMonths then
    flags(i) = 1;

  /* evaluate gaps only after the last record for this ID/index date */
  if last.indexdate;

  gap = 0;
  /* check for pre gap */
  do i = -&PreIndexEnrolledMonths to -1 until(gap gt &PreIndexGapTolerance);
    if flags(i) eq 1 then
      gap = 0;
    else
      gap = gap + 1;
  end;

  if gap le &PreIndexGapTolerance then
  do;
    /* check for post gap */
    gap = 0;
    do i = 1 to &PostIndexEnrolledMonths until(gap gt &PostIndexGapTolerance);
      if flags(i) eq 1 then
        gap = 0;
      else
        gap = gap + 1;
    end;
    if gap le &PostIndexGapTolerance then gap = 0;
  end;

  /* a nonzero gap here means an intolerable gap on one side */
  if gap ne 0 then output;

  /* reset the flags for the next ID/index date */
  do i = -&PreIndexEnrolledMonths to &PostIndexEnrolledMonths;
    flags(i) = 0;
  end;
run;
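
To try the step above, the four macro variables and a sorted enroll_periods dataset have to exist first. Here is a minimal setup sketch. The parameter values (12 months on each side, 2-month tolerance) are my assumptions, picked only for illustration; the rows are the sample data from your message.

```sas
/* Hypothetical parameter values, for illustration only */
%let PreIndexEnrolledMonths  = 12;
%let PostIndexEnrolledMonths = 12;
%let PreIndexGapTolerance    = 2;
%let PostIndexGapTolerance   = 2;

/* Sample rows from the original question, read as SAS dates */
data enroll_periods;
  input ID $ EnrollDate :mmddyy10. IndexDate :mmddyy10.;
  format EnrollDate IndexDate mmddyy10.;
  datalines;
001 11/01/2000 11/15/2002
001 12/01/2000 11/15/2002
001 01/01/2001 11/15/2002
002 11/01/2001 08/15/2002
002 12/01/2001 08/15/2002
;
run;

/* The BY statement in the gap-detection step needs this sort order */
proc sort data=enroll_periods;
  by ID IndexDate EnrollDate;
run;
```

By my reading, with these assumed values both sample IDs should come out as insufficiently enrolled: 001 has no enrolled months inside the 12-month window, and 002 is missing the first three months of the pre-index window, which exceeds the tolerance of 2.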
On 6/22/05, Pardee, Roy <pardee.r@ghc.org> wrote:
>
> Greetings all,
>
> Hopefully this isn't too much like asking you all to do unpaid
> consulting. I'm confident that it will be ignored if it is. 8^)
>
> I'm trying to write a macro that will evaluate HMO enrollment data, and
> spit out the IDs of people who are insufficiently enrolled during a
> particular time period. I'd love to have feedback on how I might
> improve the validity/readability/efficiency of the code below.
>
> For my purposes, the 'sufficiency' of enrollment is defined like so.
>
> Each person has an IndexDate--a date on which something interesting
> happened to them (usually it's the date they were first diagnosed with
> some condition of interest).
>
> We need for each person to be enrolled for some user-specified number of
> months prior to their index date, and a different, user-specified number
> of months after their index date. The user will pass these numbers into
> my macro in parameters called 'PreIndexEnrolledMonths' and
> 'PostIndexEnrolledMonths'.
>
> On either side of their index date, we want to ignore gaps of up to a
> user-specified number of months. The ignorable gap sizes are passed
> into the macro in parameters 'PreIndexGapTolerance' and
> 'PostIndexGapTolerance'.
>
> The data are arranged like so:
>
> ID EnrollDate IndexDate
> ---------------------------
> 001 11/01/2000 11/15/2002
> 001 12/01/2000 11/15/2002
> 001 01/01/2001 11/15/2002
> 002 11/01/2001 08/15/2002
> 002 12/01/2001 08/15/2002
>
> So--there's one record per enrolled month, and each person's IndexDate
> is repeated on each of their records. (Please note that I'm pretty much
> stuck w/this structure, unfortunately.)
>
> One efficiency I'd like to achieve, if possible, is to somehow stop
> processing a given ID's input records as soon as they are determined to
> have an intolerable gap. Right now the code touches every record, even
> if we know right from the first that this is a person whose ID belongs
> in the output dataset.
>
> All I really want in the output dataset (dset 'insufficiently_enrolled'
> in the below) is unique IDs. I'm keeping the other vars for now just
> for validity testing purposes.
>
> A thousand thanks in advance!
>
> -Roy
>
>