Date: Mon, 17 Jun 2002 19:35:37 +0200
Reply-To: Jim Groeneveld <J.Groeneveld@ITGROUPS.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jim Groeneveld <J.Groeneveld@ITGROUPS.COM>
Subject: Re: one observation per ID
Content-Type: text/plain
Dear Meg,
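Your step fails because ii is set only on the first observation of
each ID (do ii=10; runs once, with ii=10) and is not retained. On the
next observation ii is missing, so multiDX {ii} is out of range. Use
seq as the subscript instead.
Try the (*untested*) code below: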
data multiple_dx;
  set disease_diagnoses;
  by id; * make sure it is sorted accordingly;
  array multiDX {10} $ diagnosis1-diagnosis10; * $ without a length gives length 8;
  retain diagnosis1-diagnosis10; * keep values across the observations of one id;
  if first.id then
  do;
    seq=0;
    do _I_ = 1 to 10;
      multiDX {_I_}=' '; * reset to missing (use . if numeric);
    end;
  end;
  seq+1; * sum statement, so seq is retained automatically;
  if seq <= 10 then multiDX {seq}=diagnosis; * diagnoses beyond the 10th are dropped;
  if last.id then output;
  keep id diagnosis1-diagnosis10;
run;
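As for an easier way: PROC TRANSPOSE can do this reshaping without
arrays or a counter. Below is a minimal sketch, also untested, that
assumes the data set and variable names from your log
(disease_diagnoses, id, diagnosis):

```sas
/* BY-group processing needs the input sorted by id */
proc sort data=disease_diagnoses;
  by id;
run;

/* One output row per id. The diagnoses become diagnosis1, diagnosis2, ... */
proc transpose data=disease_diagnoses
               out=multiple_dx (drop=_name_)
               prefix=diagnosis;
  by id;
  var diagnosis;
run;
```

Note that this creates only as many diagnosisN variables as the
largest number of diagnoses found for any single ID, so you will not
necessarily get exactly ten columns.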
Regards - Jim.
--
Y. (Jim) Groeneveld, MSc IMRO TRAMARKO tel. +31 412 407 070
senior statist./data man. P.O. Box 1 fax. +31 412 407 080
J.Groeneveld@ITGroups.com 5350 AA BERGHEM, NL www.imrotramarko.com
Computers aren't there to be kept busy, but to keep us busy.
> -----Original Message-----
> From: meg A [SMTP:napu1975@NETSCAPE.NET]
> Sent: Monday, June 17, 2002 7:23 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: one observation per ID
>
> Hello,
>
> I have a data set that has multiple diagnoses per ID. I want to
> change the data to one line per ID. My code is not working for some
> reason. Any ideas or easier ways to do it? Attached is the log.
>
> Thanks!!!
> Meg
>
>
>
> 975
> 976 data multiple_dx;
> 977 set disease_diagnoses;
> 978 by id;
> 979
> 980 retain diagnosis1-diagnosis10;
> 981
> 982 if first.id then seq=0;
> 983 seq+1;
> 984
> 985 array multiDX {10} $ diagnosis1-diagnosis10;
> 986
> 987 if first.id then do ii=10;
> 988 multiDX {ii}=' ';
> 989 end;
> 990
> 991 if seq=1 then multiDX {ii}=diagnosis;
> 992 if seq=2 then multiDX {ii}=diagnosis;
> 993 if seq=3 then multiDX {ii}=diagnosis;
> 994 if seq=4 then multiDX {ii}=diagnosis;
> 995 if seq=5 then multiDX {ii}=diagnosis;
> 996 if seq=6 then multiDX {ii}=diagnosis;
> 997 if seq=7 then multiDX {ii}=diagnosis;
> 998 if seq=8 then multiDX {ii}=diagnosis;
> 999 if seq=9 then multiDX {ii}=diagnosis;
> 1000 if seq=10 then multiDX {ii}=diagnosis;
> 1001
> 1002 if last.id then output;
> 1003
> 1004 keep id diagnosis1-diagnosis10;
> 1005
> 1006 run;
>
> ERROR: Array subscript out of range at line 992 column 15.
> ID=1 DIAGNOSIS=V72.3 FIRST.ID=0 LAST.ID=0 diagnosis1=
> diagnosis2= diagnosis3= diagnosis4= diagnosis5= diagnosis6=
> diagnosis7= diagnosis8=
> diagnosis9= diagnosis10=V65.40 seq=2 ii=. _ERROR_=1 _N_=2
> NOTE: The SAS System stopped processing this step because of errors.
> NOTE: There were 3 observations read from the dataset WORK.DISEASE_DIAGNOSES.
> WARNING: The data set WORK.MULTIPLE_DX may be incomplete. When this step was stopped there were 0 observations and 11 variables.
> WARNING: Data set WORK.MULTIPLE_DX was not replaced because this step was stopped.
> NOTE: DATA statement used:
> real time 0.03 seconds
> cpu time 0.03 seconds
>
>
> 1007
> 1008 proc print data=multiple_dx (obs=200);
> 1009 run;
>
> NOTE: No observations in data set WORK.MULTIPLE_DX.
> NOTE: PROCEDURE PRINT used:
> real time 0.01 seconds
> cpu time 0.01 seconds