Date: Mon, 17 Jun 2002 13:40:27 -0400
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Roger Lustig <rlustig@CBDCREDIT.COM>
Organization: Creative Business Decisions, Inc.
Subject: Re: one observation per ID
Meg:
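It's much simpler than you have it! Your step fails because II is set only
when first.ID is true; on every other observation it is missing, so
multiDX{ii} is out of range (note ii=. in your log). Once you've set up your
array, you can use seq itself as the subscript to assign the diagnosis to the
proper array element. You don't even need a loop, except to initialize.
Just say: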
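
   data multiple_dx (keep=ID diagnosis1-diagnosis10);
      set disease_diagnoses;
      by ID;
      array multiDX {10} $ diagnosis1-diagnosis10;
      /* carry the array values across observations of the same ID */
      retain diagnosis1-diagnosis10;
      if first.ID then do;
         /* new ID: restart the counter and blank out the array */
         seq = 0;
         do I = 1 to 10;
            multiDX(I) = '';
         end;
      end;
      seq + 1;
      /* seq is the subscript; diagnoses beyond the tenth are ignored */
      if seq <= 10 then multiDX(seq) = diagnosis;
      if last.ID then output;
   run;
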
Now, to make things even slicker, you can use a DOW-loop:
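
   data multiple_dx (keep=diagnosis1-diagnosis10 ID);
      array multiDX {10} $ diagnosis1-diagnosis10;
      /* one pass of this loop reads every observation for one ID */
      do seq = 1 by 1 until (last.ID);
         set disease_diagnoses;
         by ID;
         /* diagnoses beyond the tenth are ignored */
         if seq <= 10 then multiDX(seq) = diagnosis;
      end;
      output;   /* one observation per ID */
   run;
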
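No retains, no initializing: the array variables are not retained, so SAS
resets them to missing at the top of each DATA step iteration, and here each
iteration handles one whole ID.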
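
If you want to try either step before running it on your real data, a tiny
test data set like the following should do. The values are made up for
illustration, and I'm assuming ID is numeric and diagnosis is a character
variable of no more than 8 characters; adjust to match your data.

```sas
/* Hypothetical test data, for illustration only */
data disease_diagnoses;
   input ID diagnosis $;
   datalines;
1 V65.40
1 V72.3
2 250.00
;
run;

/* BY-group processing needs the input sorted by ID */
proc sort data=disease_diagnoses;
   by ID;
run;

/* Run one of the multiple_dx DATA steps here, then check the result */
proc print data=multiple_dx;
run;
```

With the DOW-loop step, this should print two observations: ID 1 with
diagnosis1=V65.40 and diagnosis2=V72.3, and ID 2 with diagnosis1=250.00.
The remaining diagnosis variables stay blank.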
Roger
meg A wrote:
> Hello,
>
> I have a data set that has multiple diagnoses per ID. I want to
> change the data to one line per ID. My code is not working for
> some reason. Any ideas or easier ways to do it?
> Attached is the log.
>
> Thanks!!!
> Meg
>
>
>
> 975
> 976 data multiple_dx;
> 977 set disease_diagnoses;
> 978 by id;
> 979
> 980 retain diagnosis1-diagnosis10;
> 981
> 982 if first.id then seq=0;
> 983 seq+1;
> 984
> 985 array multiDX {10} $ diagnosis1-diagnosis10;
> 986
> 987 if first.id then do ii=10;
> 988 multiDX {ii}=' ';
> 989 end;
> 990
> 991 if seq=1 then multiDX {ii}=diagnosis;
> 992 if seq=2 then multiDX {ii}=diagnosis;
> 993 if seq=3 then multiDX {ii}=diagnosis;
> 994 if seq=4 then multiDX {ii}=diagnosis;
> 995 if seq=5 then multiDX {ii}=diagnosis;
> 996 if seq=6 then multiDX {ii}=diagnosis;
> 997 if seq=7 then multiDX {ii}=diagnosis;
> 998 if seq=8 then multiDX {ii}=diagnosis;
> 999 if seq=9 then multiDX {ii}=diagnosis;
> 1000 if seq=10 then multiDX {ii}=diagnosis;
> 1001
> 1002 if last.id then output;
> 1003
> 1004 keep id diagnosis1-diagnosis10;
> 1005
> 1006 run;
>
> ERROR: Array subscript out of range at line 992 column 15.
> ID=1 DIAGNOSIS=V72.3 FIRST.ID=0 LAST.ID=0 diagnosis1=
> diagnosis2= diagnosis3= diagnosis4= diagnosis5= diagnosis6=
> diagnosis7= diagnosis8=
> diagnosis9= diagnosis10=V65.40 seq=2 ii=. _ERROR_=1 _N_=2
> NOTE: The SAS System stopped processing this step because of errors.
> NOTE: There were 3 observations read from the dataset
> WORK.DISEASE_DIAGNOSES.
> WARNING: The data set WORK.MULTIPLE_DX may be incomplete. When this
> step was stopped there
> were 0 observations and 11 variables.
> WARNING: Data set WORK.MULTIPLE_DX was not replaced because this step
> was stopped.
> NOTE: DATA statement used:
> real time 0.03 seconds
> cpu time 0.03 seconds
>
>
> 1007
> 1008 proc print data=multiple_dx (obs=200);
> 1009 run;
>
> NOTE: No observations in data set WORK.MULTIPLE_DX.
> NOTE: PROCEDURE PRINT used:
> real time 0.01 seconds
> cpu time 0.01 seconds
>