LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2010, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 20 Oct 2010 02:10:49 +0000
Reply-To:     "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU>
Subject:      Re: Hash or No-Hash solution to my problem?
Comments: To: Raghav Adimulam <raghavadimulam@WESTAT.COM>
In-Reply-To:  <201010192242.o9JH5kER012853@willow.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"

Raghav:

I first thought of the array approach, but because you mention that you have a number of variables in addition to MRN and DATEDIFF, I suggest instead the double DOW approach, as here.

data want (drop=cutoff); do until (last.mrn); set have; by mrn; end; cutoff=datediff+365; ** Last DATEDIFF is always the minimum **; do until (last.mrn); set have; by mrn; if datediff< cutoff then output; end; run;

As to hash, I don't see any special value to it in this case (your data is already grouped by MRN, and fast lookup is not the issue here), and I personally prefer to avoid the extra coding the hash technique would require.

Although the PROC SQL suggested by Art would work, remember that SQL does not guarantee the output table has records in the same order as the input table. Yes Art used "group by MRN", which implies an "ORDER by MRN", the records within MRN can be re-ordered (and were when I ran the PROC SQL code). So in your case, add an "ORDER by MRN, datediff desc". But even that is insufficient in the case of records with duplicate sort keys. AFAIK, there is no "order by" equivalent to the "equals" option in proc sort.

Regards, Mark

Regards, Mark

> -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of > Raghav Adimulam > Sent: Tuesday, October 19, 2010 6:43 PM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Hash or No-Hash solution to my problem? > > Hello SAS-Lers, > > Here is the simplified version of my problem, i.e. without the > unnecessary > fields and misc data. I have a dataset with 2 fields that is sorted on > MRN > and DateDiff as shown below. > > MRN DateDiff > 001 500 > 001 400 > 001 300 > 001 200 > 001 100 > 001 0 > > 002 500 > 002 400 > 002 300 > 002 200 > 002 100 > > 003 700 > 003 300 > 003 200 > 003 50 > > How can I accomplish a task so that I can keep only the records that > fall > within 365 days of the lowest DateDiff for that MRN?? > > In other words, MRN 001 has qualifying records <=0+365 > MRN 002 has qualifying records <=100+365 (=465) > MRN 003 has qualifying records <=50+365 (=415) > > Thus the final dataset should be.. > > MRN DateDiff > 001 300 > 001 200 > 001 100 > 001 0 > > 002 400 > 002 300 > 002 200 > 002 100 > > 003 300 > 003 200 > 003 50 > > Thanks for your answers in advance! > > -Raghav


Back to: Top of message | Previous page | Main SAS-L page