Date: Thu, 14 May 2009 15:48:33 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Restructure dataset
In-Reply-To: <FE10F31634E7F34B87AA143D5960854103890324@EX-CMS01.westat.com>
Content-Type: text/plain; charset="us-ascii"
A noted SAS-L participant, who also happens to be the author of an excellent book on SQL for SAS programmers, has kindly pointed out to me privately a blind spot in my UNION solution in this thread. Assuming that the actual data have no attribute (such as a time stamp) that would distinguish two rows with the same typedose and the same amt, the ALL option (UNION CORR ALL) would be not only more efficient but also necessary as well. The UNION set operator reduces indistinct rows to a single row.
I appreciate the feedback and the opportunity to correct the solution. Good catch
Adding the ALL option would seem a default measure. It leads to a correct result and a more efficient program. Even so, I would not recommend leaving two identical rows in a SQL table or a SAS dataset. I typically omit the ALL option and rely on another attribute to distinguish one row from another. The UNION operator will then excise duplicated rows when needed.
But in no way does that excuse my lapse of logic. I stand corrected.
S
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Sigurd Hermansen
Sent: Thursday, May 14, 2009 2:19 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Restructure dataset
Good example for the SAS SQL for Database Programmers course that I'm developing ...
data test;
input Placebo Aspirin Tylenol ;
cards;
40 40 32
38 41 37
34 47 29
44 37 34
50 62 22
59 34 28
;
run;
proc sql;
create table Facts as
(select 'Placebo' as typeDose,Placebo as amt from test)
union corr
(select 'Aspirin' as typeDose,Aspirin as amt from test)
union corr
(select 'Tylenol' as typeDose,Tylenol as amt from test)
;
quit;
S
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of tablenc
Sent: Thursday, May 14, 2009 1:35 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Restructure dataset
I'm still learning about SAS, so sorry if the terminology I am using is off. I have this SAS dataset.
40 40 32
38 41 37
34 47 29
44 37 34
50 62 22
59 34 28
The first column represents a placebo, the second one aspirin, and third is tylenol. I want to restructure this using array and the DO sequence.
Is it possible for my dataset to look like this?
The first column should represent 'type of dose' and the second column should represent amount.
Placebo 40
placebo 38
placebo 34
and so on
Aspirin 40
Aspirin 41
and so on
Tylenol 32
Tylenol 37
and so on