LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (November 2007, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Sun, 25 Nov 2007 21:20:46 -0500
Reply-To:   "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject:   Re: Data manipulation question

On Wed, 21 Nov 2007 06:29:09 +1000, Craig Hansen <c.hansen@UQ.EDU.AU> wrote:

>Hi Everyone > >I have a dataset of 60,000 pathology reports and I need to pull out >patients that had a biopsy for their first visit and then an excision on >the same skin lesion for their subsequent visit. For the remainder >patients that were not pulled out of the dataset I need them to be in >another dataset. In the example data below, take note how patient with >patid=2 has two biopsies on their first visit (03MAR06) but only one of >the biopsies had a subsequent excision (10MAR06). > >I would appreciate any ideas on how I can do this. Thanks in advance. > >Here is an example of the data > >patid reportid date type site >1 1 02FEB06 BIOPSY ARM >1 1 06FEB06 BIOPSY ARM >2 2 03MAR06 BIOPSY LEG >2 2 03MAR06 BIOPSY HEAD >2 3 10MAR06 EXCISION LEG >3 4 11MAR06 EXCISION ARM >4 5 06APR06 BIOPSY HEAD >4 6 10APR06 EXCISION HEAD > >Here is how I would like the two final datasets > >Dataset one (patients that did not have a biopsy and subsequent >excision) > >Patid reportid date type site >1 1 02FEB06 BIOPSY ARM >1 1 06FEB06 BIOPSY ARM >2 2 03MAR06 BIOPSY HEAD >3 4 11MAR06 EXCISION ARM > >Dataset two (patients that did have a biopsy and subsequent excision) > >Patid reportid date type site >2 2 03MAR06 BIOPSY LEG >2 3 10MAR06 EXCISION LEG >4 5 06APR06 BIOPSY HEAD >4 6 10APR06 EXCISION HEAD

data example; input patid reportid @8 date date7. @16 type$ site$; cards; 1 1 02FEB06 BIOPSY ARM 1 1 06FEB06 BIOPSY ARM 2 2 03MAR06 BIOPSY LEG 2 2 03MAR06 BIOPSY HEAD 2 3 10MAR06 EXCISION LEG 3 4 11MAR06 EXCISION ARM 4 5 06APR06 BIOPSY HEAD 4 6 10APR06 EXCISION HEAD ;

First, sort so that each SITE can be considered separately:

proc sort data=example out=sites; by patid site date; run;

Now a self-interleave can first categorize the cases and then split the table accordingly:

data one two; set sites(in=preview) sites; by patid site; retain biopsy_date both_types ; drop biopsy_date both_types ; if first.site then call missing (biopsy_date,both_types); if preview then do; if type='BIOPSY' then biopsy_date = date; if type='EXCISION' and date > biopsy_date > . then both_types = 1; end; else do; if both_types then output two; else output one; end; run;


Back to: Top of message | Previous page | Main SAS-L page