LISTSERV at the University of Georgia
Date:         Thu, 25 Jan 2007 19:01:00 -0500
Reply-To:     Arthur Tabachneck <art297@NETSCAPE.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Arthur Tabachneck <art297@NETSCAPE.NET>
Subject:      Re: Coding HAART in a Sample of HIV Patients: Very Difficult
              Sorting/Sequencing Problem
Comments: To: Paul Miller <pmiller@OHTN.ON.CA>

Paul,

Since rows 1 through 7 of your data have complete information, while rows 8 and 9 don't, I don't understand your criteria for linking row 7 with rows 8 and 9.

Art
----------
On Thu, 25 Jan 2007 11:03:08 -0500, Paul Miller <pmiller@OHTN.ON.CA> wrote:

>Hello Everyone,
>
>I've been struggling for some time now with what appears to be a very
>difficult sorting/sequencing task. In fact, I find that even explaining
>the problem so people can understand it is sometimes difficult. I've
>pasted some sample syntax below. The syntax is designed to code
>multi-drug Highly Active Antiretroviral Therapy (HAART) regimens in a
>sample of HIV patients. The CHANGES dataset sequences some individual
>antiretroviral medications for a single patient. DRUG_CLASS in the
>dataset indicates what type of antiretroviral the patient was taking.
>DATE indicates the date on which the patient started or stopped taking
>the drug and is missing where this is unknown. MIN_DATE indicates the
>earliest possible value of DATE and MAX_DATE indicates the latest
>possible value of DATE. MID_DATE is the midpoint between MIN_DATE and
>MAX_DATE. Finally, CHANGE indicates whether the patient started or
>stopped taking a drug (1 = start, -1 = stop).
>
>The syntax defines HAART as:
>
>(NRTI >= 3 AND NNRTI=0 AND PI=0) OR
>(NRTI >= 2 AND (NNRTI >= 1 OR PI >= 1)) OR
>(NRTI = 1 AND NNRTI >= 1 AND PI >= 1).
>
>On the whole, the syntax works pretty well but it doesn't always
>sequence the drugs the way I want it to. In this case, it correctly
>determines the first 2 regimens using rows 1 - 6 of the CHANGES dataset.
>Unfortunately though, it doesn't sequence the drugs the way I would like
>it to starting with row 7.
>
>By the time the program reaches row 7 in the CHANGES dataset, the
>patient is taking an NRTI and an NNRTI. The last three drugs are all
>starts but their order is indeterminate. Based on the values for DATE,
>MIN_DATE, and MAX_DATE, it is possible that any one of these 3 drugs
>could have come next in the sequence.
>
>My current default is to assume that the next drug of the 3 is the one
>with the earliest MID_DATE value and my data are sorted accordingly. In
>this case though, this default is likely to result in an incorrect
>sequencing. As I said earlier, the patient is taking an NRTI and an
>NNRTI by the time we get to row 7 of the CHANGES dataset. Thus, I would
>be inclined to sequence one of the PI in rows 8 and 9 next and not the
>NNRTI in row 7 because the addition of a PI to an NRTI and an NNRTI will
>create a new HAART regimen whereas the addition of another NNRTI will
>not. I would specifically pick the PI in row 8 because it has an earlier
>MID value than the PI in Row 9.
>
>Is there any way to get SAS to recognize that the last 3 drugs are
>indeterminate and then to sequence the drugs based on the criteria that
>I've just described?
>
>Thanks,
>
>Paul
>
>Paul J. Miller, Ph.D.
>Research Scientist and Statistician
>Ontario HIV Treatment Network
>1300 Yonge St., Suite 308
>Toronto, Ontario M4T 1X3
>Phone: (416) 642-6486 ext 232
>Fax: (416) 640-4245
>
>DATA CHANGES;
>    INPUT SITE_ID DRUG_CLASS $ DATE :MMDDYY. MIN_DATE :MMDDYY.
>          MAX_DATE :MMDDYY. MID_DATE :MMDDYY. CHANGE;
>    FORMAT DATE MIN_DATE MAX_DATE MID_DATE MMDDYY8.;
>    DATALINES;
>1 PI    4/21/1998 4/21/1998 4/21/1998 4/21/1998  1
>1 NNRTI 4/21/1998 4/21/1998 4/21/1998 4/21/1998  1
>1 NRTI  4/21/1998 4/21/1998 4/21/1998 4/21/1998  1
>1 NNRTI 4/21/1998 4/21/1998 4/21/1998 4/21/1998  1
>1 PI    12/7/1998 12/7/1998 12/7/1998 12/7/1998 -1
>1 NNRTI 12/7/1998 12/7/1998 12/7/1998 12/7/1998 -1
>1 NNRTI 1/29/1999 1/29/1999 1/29/1999 1/29/1999  1
>1 PI    .         1/1/1999  6/5/1999  3/19/1999  1
>1 PI    .         1/1/1999  8/5/1999  4/19/1999  1
>;
>RUN;
>
>/*ROLL UP TO 1 OBSERVATION PER ID PER DAY AND COMPUTE HAART*/
>
>DATA CUMULATIVE (DROP=DRUG_CLASS CHANGE STOP_DATE
>                 RENAME=(DATE=START_DATE MIN_DATE=MIN_START
>                         MAX_DATE=MAX_START))
>     STOP_DATES (KEEP=SITE_ID REGIMEN STOP_DATE MIN_DATE MAX_DATE
>                 RENAME=(MIN_DATE=MIN_STOP MAX_DATE=MAX_STOP));
>    RETAIN SITE_ID REGIMEN;
>    SET CHANGES;
>    BY SITE_ID MID_DATE;
>
>    IF FIRST.SITE_ID THEN DO;
>        REGIMEN = 0;
>        NRTI = 0;
>        NNRTI = 0;
>        PI = 0;
>    END;
>
>    IF DRUG_CLASS = 'NRTI' THEN NRTI + CHANGE;
>    ELSE IF DRUG_CLASS = 'NNRTI' THEN NNRTI + CHANGE;
>    ELSE IF DRUG_CLASS = 'PI' THEN PI + CHANGE;
>
>    IF LAST.MID_DATE THEN DO;
>        STOP_DATE = DATE;
>
>        IF REGIMEN THEN OUTPUT STOP_DATES;
>        REGIMEN + 1;
>
>        ALLDRUGS = NNRTI + NRTI + PI;
>        HAART = (NRTI >= 3 AND NNRTI=0 AND PI=0) OR
>                (NRTI >= 2 AND (NNRTI >= 1 OR PI >= 1)) OR
>                (NRTI = 1 AND NNRTI >= 1 AND PI >= 1);
>        OUTPUT CUMULATIVE;
>    END;
>
>    FORMAT STOP_DATE MMDDYY10.;
>RUN;
>
>DATA REGIMENS (DROP=REGIMEN MID_DATE);
>    RETAIN SITE_ID START_DATE STOP_DATE MIN_START MAX_START
>           MIN_STOP MAX_STOP
>           DURATION MIN_DURATION MAX_DURATION;
>    MERGE CUMULATIVE STOP_DATES;
>    BY SITE_ID REGIMEN;
>
>    IF START_DATE NE . AND STOP_DATE NE . THEN DO;
>        DURATION = STOP_DATE - START_DATE;
>        MIN_DURATION = DURATION;
>        MAX_DURATION = DURATION;
>    END;
>
>    ELSE IF START_DATE NE . AND STOP_DATE = . THEN DO;
>        IF MIN_STOP < START_DATE AND MIN_STOP NE . THEN DO;
>            MIN_STOP = START_DATE;
>        END;
>        DURATION = .;
>        MIN_DURATION = MIN_STOP - START_DATE;
>        MAX_DURATION = MAX_STOP - START_DATE;
>    END;
>
>    ELSE IF START_DATE = . AND STOP_DATE NE . THEN DO;
>        IF MAX_START > STOP_DATE AND STOP_DATE NE . THEN DO;
>            MAX_START = STOP_DATE;
>        END;
>        DURATION = .;
>        MIN_DURATION = STOP_DATE - MAX_START;
>        MAX_DURATION = STOP_DATE - MIN_START;
>    END;
>
>    ELSE IF START_DATE = . AND STOP_DATE = . THEN DO;
>        DURATION = .;
>        IF MIN_STOP = . AND MAX_STOP = . THEN DO;
>            MIN_DURATION = .;
>            MAX_DURATION = .;
>        END;
>        IF MAX_START > MAX_STOP AND MAX_STOP NE . THEN DO;
>            MAX_START = MAX_STOP;
>        END;
>        IF MAX_START > MIN_STOP AND MIN_STOP NE . THEN DO;
>            MIN_DURATION = 0;
>            MAX_DURATION = MAX_STOP - MIN_START;
>        END;
>        IF MAX_START <= MIN_STOP THEN DO;
>            MIN_DURATION = MIN_STOP - MAX_START;
>            MAX_DURATION = MAX_STOP - MIN_START;
>        END;
>    END;
>
>    IF ALLDRUGS;
>RUN;
>
>PROC DELETE DATA=CHANGES CUMULATIVE STOP_DATES;
>RUN;
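
One possible reading of the criteria in the quoted message, offered only as a rough sketch and not as Paul's or Art's method: treat a row as "indeterminate" when its MIN_DATE to MAX_DATE window overlaps that of another row for the same patient and at least one of the two rows has a missing DATE. Within the indeterminate rows, put first any row whose addition to the regimen already in place would satisfy the HAART definition, then fall back on MID_DATE. The names FLAGGED, RANKED, RESEQUENCED, ROW, INDETERMINATE, MAKES_HAART, PRIORITY and the N_ and T_ counters are hypothetical. The sketch also assumes the indeterminate rows form a single block at the end of each patient's history, as in the sample data, and it judges every row in that block against the same starting regimen instead of re-evaluating after each pick.

```sas
/* Step 1: flag rows whose position in the sequence is indeterminate.   */
/* Assumption: windows overlap AND at least one DATE is missing.        */
DATA FLAGGED;
    SET CHANGES NOBS=N_ROWS;
    ROW = _N_;                       /* remember the original order     */
    INDETERMINATE = 0;
    DO J = 1 TO N_ROWS;              /* compare with every other row    */
        SET CHANGES (KEEP=SITE_ID DATE MIN_DATE MAX_DATE
                     RENAME=(SITE_ID=ID2 DATE=DATE2
                             MIN_DATE=MIN2 MAX_DATE=MAX2))
            POINT=J;
        IF J NE ROW AND ID2 = SITE_ID
           AND MIN2 <= MAX_DATE AND MIN_DATE <= MAX2
           AND (DATE = . OR DATE2 = .) THEN INDETERMINATE = 1;
    END;
    DROP ID2 DATE2 MIN2 MAX2;
RUN;

/* Step 2: for each row, ask whether applying it to the regimen in      */
/* place before the indeterminate block would satisfy the HAART rule.   */
DATA RANKED;
    SET FLAGGED;
    BY SITE_ID;
    RETAIN N_NRTI N_NNRTI N_PI;
    IF FIRST.SITE_ID THEN DO;
        N_NRTI = 0; N_NNRTI = 0; N_PI = 0;
    END;

    /* Counts that would result if this row were applied next */
    T_NRTI  = N_NRTI  + CHANGE * (DRUG_CLASS = 'NRTI');
    T_NNRTI = N_NNRTI + CHANGE * (DRUG_CLASS = 'NNRTI');
    T_PI    = N_PI    + CHANGE * (DRUG_CLASS = 'PI');

    MAKES_HAART = (T_NRTI >= 3 AND T_NNRTI = 0 AND T_PI = 0) OR
                  (T_NRTI >= 2 AND (T_NNRTI >= 1 OR T_PI >= 1)) OR
                  (T_NRTI = 1 AND T_NNRTI >= 1 AND T_PI >= 1);

    /* 0 = determinate row (keep original order),                       */
    /* 1 = indeterminate and would satisfy the HAART rule, 2 = would not */
    PRIORITY = INDETERMINATE * (2 - MAKES_HAART);

    /* Only determinate rows update the running counts */
    IF NOT INDETERMINATE THEN DO;
        N_NRTI = T_NRTI; N_NNRTI = T_NNRTI; N_PI = T_PI;
    END;
    DROP T_NRTI T_NNRTI T_PI;
RUN;

/* Step 3: resequence. Determinate rows stay in MID_DATE order; the     */
/* indeterminate block is ordered by PRIORITY, then MID_DATE.           */
PROC SORT DATA=RANKED OUT=RESEQUENCED;
    BY SITE_ID INDETERMINATE PRIORITY MID_DATE ROW;
RUN;
```

On the sample data this should flag rows 7 through 9 and move the PI in row 8 ahead of the NNRTI in row 7. It also moves the PI in row 9 ahead of row 7, which the original message does not ask for. Because RESEQUENCED is no longer in MID_DATE order, the BY SITE_ID MID_DATE statement in the CUMULATIVE step would need a different sequencing variable before that step could read it.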

