LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (January 2009, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 19 Jan 2009 20:29:47 -0600
Reply-To:     Joe Matise <snoopy369@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Joe Matise <snoopy369@GMAIL.COM>
Subject:      Re: collapse data while retain certain value: seek a better
              solution than mine
Comments: To: Jerry <i89rt5@gmail.com>
In-Reply-To:  <200901200212.n0JBkfth027956@malibu.cc.uga.edu>
Content-Type: text/plain; charset=ISO-8859-1

Use the MIN option in proc summary to get the first admdate per patientid. like so.

data sample; input patientID:$1. lengthStay:3. admDate:mmddyy10. ; format admDate mmddyy10.; datalines; A 3 08/18/2007 A 5 02/27/2005 A 1 09/08/2006 B 2 01/05/2005 B 3 05/31/2003 C 56 03/19/2006 C 8 05/26/2005 C 2 12/25/2004 D 9 05/15/2005 ; run;

proc summary data=sample nway; class patientID; var lengthStay admdate; output out=stay(keep=patientid lengthstay firststay) sum(lengthstay)= min(admdate)=firststay; run;

-Joe

On Mon, Jan 19, 2009 at 8:12 PM, Jerry <i89rt5@gmail.com> wrote:

> /* > Hi, > > I have a sample data (see below), each record of which is a hospital > admission with patient ID (patientID) and hospitalization days(lengthStay). > Each patient could have multiple admissions, differentiated by admission > dates(admDate). My purpose is to collapse the sample data by patient ID to > get the total hospitalization days for each patient, while retain the > earliest admission date for each patient. In other words, the resulting > data > is at unique patient level, with that each unique patient has the total > hospitalization days from his/her multiple admissions, and the earliest > admission date among his/her multiple admissions, if s/he does have > multiple > admissions. > > It took me 3 steps (see below) to achieve this purpose. I know they look > stupid and are inefficient but I can't help! :-) > > I bet there must be many MUCH BETTER ways to get the same result, probably > all 3 steps can be combined/reduced to 1 step. Could anyone please show me > how to do so with succinct codes and/or better running time efficiency? > > Your input and time is greatly appreciated! > > Jerry > */ > > /*create sample data*/ > data sample; > input patientID:$1. lengthStay:3. admDate:mmddyy10. ; > format admDate mmddyy10.; > datalines; > A 3 08/18/2007 > A 5 02/27/2005 > A 1 09/08/2006 > B 2 01/05/2005 > B 3 05/31/2003 > C 56 03/19/2006 > C 8 05/26/2005 > C 2 12/25/2004 > D 9 05/15/2005 > ; > run; > > /*step 1: > summerize length of stay by unique patient id*/ > > proc summary data=sample nway; > class patientID; > output > out=stay(drop=_:) > sum(lengthStay)= > /autoname; > run; > > /* step 2: > get the earliest admission date for each unique patient*/ > > /*first sort the data by patient id ad admDate*/ > proc sort data=sample(drop=lengthStay) out=date; > by patientID admDate; > run; > > /*then for each unique patient, retain his/her earliest admission date, use > "NODUPKEY' option*/ > proc sort data=date nodupkey; > by patientID; > run; > > /*step 3: > comebine data "stay" and "date" by patient id so that the resulting data is > at unique patient level, > with each unique patient has the total hospitalization days and the > earliest > admission date. > > The resulting final data should be: > > patientID lengthStay_sum admDate > > A 9 02/27/2005 > B 5 05/31/2003 > C 66 12/25/2004 > D 9 05/15/2005 > */ > > data final; > merge stay date; > by patientID; > run; >


Back to: Top of message | Previous page | Main SAS-L page