Date: Thu, 22 Jun 2006 09:52:19 -0400
Reply-To: "Kevin F. Spratt" <Kevin.F.Spratt@DARTMOUTH.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Kevin F. Spratt" <Kevin.F.Spratt@DARTMOUTH.EDU>
Subject: Data Restructuring Question (Revised).
Content-Type: text/plain; charset="us-ascii"; format=flowed
I want to thank the list for the useful comments.
I tried the suggestion from Mark Terjeson and got what I wanted, or I
thought so until I read the comments from Howard Schreier, Ian Whitlock and
others and realized I didn't provide an example that is sufficiently general
to the situations that I might observe. Thus, I have modified my example
and changed the structure to what I think would allow the structures to remain
consistent if one wished to move between them.
The question remains, however, how to move from my original structure 1 or
to my newly modified structure 3:
I have data structured like this:
1 d * Note a duplicate dx within an id
4 i * Note a duplicate dx within an id (there could be more than one
With the possibility of repeated dx within an ID, structure 1 could be
modified to structure 1a as:
id dx count
1 a 1
1 b 1
1 c 1
1 d 2
2 d 1
2 e 1
3 g 1
4 h 1
4 i 2
4 b 1
In other words, multiple records for a patient with various diagnostic codes
With help from the list in the past, I restructured these data to have one
record per patient
id dx1 dx2 dx3 dx4 dx5
1 a b c d d
2 d e
4 h i b i
From either structure 1, 1a, or 2 above, I would like to have the data
structures as follows:
id dx_a dx_b dx_c dc_d dx_e dx_g dx_h dx_i
1 1 1 1 2 0 0 0 0
2 0 0 0 1 1 0 0 0 note the 2 indicates 2
dx within a person and could be more.
3 0 0 0 0 0 1 0 0 than 2
4 0 1 0 0 0 0 1 2
In case it makes a difference, in the real data, the dx fields are
character but not all
the same length.
Coding efficiency is an issue as I have 1,000,000 records and there are
many more than 8 different
I am running SAS 9.1 on the XP platform.
As always, any suggestions welcome and I hope the above information is
It turns out that in my original data there were no duplicate dx within id
Mark's solution worked for me. However, I have other data sets where there
will be duplicates
within id and so the more generalized approach for restructuring is important.
Kevin F. Spratt, Ph.D.
Department of Orthopaedic Surgery
Dartmouth Medical School
One Medical Center Drive
Lebanon, NH USA 03756
(603) 653-6012 (voice)
(603) 653-6013 (fax)
(603) 252-5922 (cell)