LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (March 2011, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 4 Mar 2011 14:40:25 -0500
Reply-To:     bbser 2009 <bbser2009@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         bbser 2009 <bbser2009@GMAIL.COM>
Subject:      Re: Confusing Dedup code
Comments: To: Irin later <irinfigvam@yahoo.com>
In-Reply-To:  <760417.91091.qm@web39407.mail.mud.yahoo.com>
Content-Type: text/plain; charset="utf-8"

Irin

Sorry for the misunderstanding. Below I made a little modification to my original code. Hopefully it is ok now.

Thanks. -Max

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

data have; infile cards; input id $3. srv_date : mmddyy10. disch_date : mmddyy10. timeframe tag1 tag2; cards; 12a 07/15/2010 07/05/2010 10 1 1 12a 08/31/2010 08/01/2010 30 0 1 12a 03/10/2010 02/16/2010 24 0 1 12a 06/24/2010 06/10/2010 14 1 1 12a 05/30/2010 03/30/2010 60 0 0 12a 09/25/2010 09/05/2010 20 0 1 12a 08/31/2010 08/01/2010 30 0 1 12a 03/10/2010 02/16/2010 24 0 1 12a 05/30/2010 03/30/2010 60 0 0 12a 09/25/2010 09/05/2010 20 0 1 12a 09/30/2010 08/01/2010 60 0 0 12a 03/14/2010 02/10/2010 34 0 0 12a 05/30/2010 03/25/2010 55 0 0 12a 10/25/2010 09/15/2010 40 0 0 13a 09/25/2010 09/05/2010 20 0 1 13a 08/31/2010 08/01/2010 30 0 1 13a 03/10/2010 02/16/2010 24 0 1 13a 05/30/2010 03/30/2010 60 0 0 13a 09/25/2010 09/05/2010 20 0 1 13a 09/30/2010 08/01/2010 60 0 0 13a 03/14/2010 02/10/2010 34 0 0 13a 05/30/2010 03/25/2010 55 0 0 13a 10/25/2010 09/15/2010 40 0 0 14a 09/30/2010 08/01/2010 60 0 0 14a 03/14/2010 02/10/2010 34 0 0 14a 05/30/2010 03/25/2010 55 0 0 14a 10/25/2010 09/15/2010 40 0 0 ; data new; set have; if tag1=1 then indicator=0; else if tag2=1 then indicator=1; else indicator=2; run;

proc sort data=new out=new_sort; by id indicator timeframe; run;

data need; set new_sort; by id indicator timeframe; if first.id and first.indicator then output; run;

proc print; run;

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From: Irin later [mailto:irinfigvam@yahoo.com] Sent: March-04-11 1:59 PM To: SAS-L@LISTSERV.UGA.EDU; bbser 2009 Subject: Re: Confusing Dedup code

Max,

The output actually contains the records for each scenario while I need just one record. If tag1=1 and smallest timeframe met than we just get rid of others within the ID group.

However if it doesn't meet then tag2=1 +smallest takes it's turn. If meets then get rid of the rest

If both tags=1 then just smallest timeframe criteria takes place and we get rid of the rest.

Final result should contain just one record ("the best")

Sorry for the confusion.

--- On Fri, 3/4/11, bbser 2009 <bbser2009@GMAIL.COM> wrote:

From: bbser 2009 <bbser2009@GMAIL.COM> Subject: Re: Confusing Dedup code To: SAS-L@LISTSERV.UGA.EDU Date: Friday, March 4, 2011, 12:56 PM Irin

Below is the whole verified code based on the idea in my last email. Wish to be what you want.

-Max

data have; infile cards; input id $3. srv_date : mmddyy10. disch_date : mmddyy10. timeframe tag1 tag2; cards; 12a 07/15/2010 07/05/2010 10 1 1 12a 08/31/2010 08/01/2010 30 0 1 12a 03/10/2010 02/16/2010 24 0 1 12a 06/24/2010 06/10/2010 14 1 1 12a 05/30/2010 03/30/2010 60 0 0 12a 09/25/2010 09/05/2010 20 0 1 12a 08/31/2010 08/01/2010 30 0 1 12a 03/10/2010 02/16/2010 24 0 1 12a 05/30/2010 03/30/2010 60 0 0 12a 09/25/2010 09/05/2010 20 0 1 12a 09/30/2010 08/01/2010 60 0 0 12a 03/14/2010 02/10/2010 34 0 0 12a 05/30/2010 03/25/2010 55 0 0 12a 10/25/2010 09/15/2010 40 0 0 ; data new; set have; indicator=-1; if tag1=1 then indicator=1; else if tag2=1 then indicator=0; run;

proc sort data=new out=new_sort; by id indicator timeframe; run;

data need; set new_sort; by id indicator timeframe; if first.indicator then output; run;

proc print; run;

-----Original Message----- From: bbser 2009 [mailto:bbser2009@gmail.com] Sent: March-04-11 12:33 PM To: 'Irin later'; 'SAS-L@LISTSERV.UGA.EDU' Subject: RE: Confusing Dedup code

I see. I was thinking of doing it like this:

In first data step, make a new variable to differentiate the three cases:

.... indicator=-1; if tag1=1 then indicator=1; else if tag2=1 then indicator=0; ...

In second data step, just use last. and first. technique like this:

... by id indicator timeframe; ... In this data step, you only output the observation where first.indicator=1.

I guess this should work.

Thanks. -Max

From: Irin later [mailto:irinfigvam@yahoo.com] Sent: March-04-11 11:32 AM To: SAS-L@LISTSERV.UGA.EDU; bbser 2009 Subject: Re: Confusing Dedup code

Max, actually I do concern about values of tag1 and tag2

If tag1=1 I need this record but if there are couple of records w tag1=1 within the same ID group, I do need just those with smallest timeframe.

If tag1=0 I ignore this record and my concern is tag2=1 . within the same ID group, I do need just those with smallest timeframe.

If tag1 and tag2 both are 0 then I have nothing to do but to chose a record with o . However in such a case my concern still would me the smallest timeframe among those records with tag1=0 and tag2=0

--- On Fri, 3/4/11, bbser 2009 <bbser2009@GMAIL.COM> wrote:

From: bbser 2009 <bbser2009@GMAIL.COM> Subject: Re: Confusing Dedup code To: SAS-L@LISTSERV.UGA.EDU Date: Friday, March 4, 2011, 11:23 AM Irin

It seems to me that you always want the smallest timeframe for the same ID regardless of values of tag1 and tag2. Did I misunderstand anything?

Thanks. -Max

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Irin later Sent: March-04-11 10:57 AM To: SAS-L@LISTSERV.UGA.EDU Subject: [SAS-L] Confusing Dedup code

I am confused with a dedup code….. -Among the patients with the same ID and criteria tag1=1, one needs first to keep from records with tag1=1 and with the shortest timeframe between Disch_date and srv_date that follows the hospital discharge. This is the 1st I priority for dedup. -In case tag1=0, the priority would be tag2=1 with the shortest timeframe -In case both tag1=0 and tag2=0 one needs to keep the record with the shortest timeframe

for example: Scenarion #1: id srv_date disch_date timeframe tag1 tag2 12a 07/15/2010 07/05/2010 10 1 1 12a 08/31/2010 08/01/2010 30 0 1 12a 03/10/2010 02/16/2010 24 0 1 12a 06/24/2010 06/10/2010 14 1 1 12a 05/30/2010 03/30/2010 60 0 0 12a 09/25/2010 09/05/2010 20 0 1 I need the following record with tag1=1 and with the shortest timeframe=10 (rather than timeframe=14) id SRV_date disch_date timeFrame tag1 tag2 12a 07/15/2010 07/05/2010 10 1 1

Scenario #2: id srv_date disch_date timeFrame tag1 tag2 12a 08/31/2010 08/01/2010 30 0 1 12a 03/10/2010 02/16/2010 24 0 1 12a 05/30/2010 03/30/2010 60 0 0 12a 09/25/2010 09/05/2010 20 0 1 I need the following record with tag2=1 and the shortest timeframe=20 (rather than timeframe=24 or 30) id srv_date disch_date timeFrame tag1 tag2 12a 09/25/2010 09/05/2010 20 0 1

Scenario #3 (both tag1 and tag2 =0): id srv_date disch_date timeFrame tag1 tag2 12a 09/31/2010 08/01/2010 60 0 0 12a 03/14/2010 02/10/2010 34 0 0 12a 05/30/2010 03/25/2010 55 0 0 12a 10/25/2010 09/15/2010 40 0 0 I need the following record the shortest timeframe=34 id srv_date disch_date timeFrame tag1 tag2 12a 03/14/2010 02/10/2010 34 0 0

Could you, please, give me a hand?

Thank you for advance,

Irin


Back to: Top of message | Previous page | Main SAS-L page