Date: Mon, 7 Jul 2008 15:17:42 -0500
Reply-To: Mary <mlhoward@avalon.net>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mary <mlhoward@AVALON.NET>
Subject: Re: PROC FREQ AND TRANSPOSE
Content-Type: text/plain; charset="iso-8859-1"
The problem is that you haven't created any table with the zeros in it, so Proc Freq doesn't know which items it should use since they never appear in the data. Howard S. had some good suggestions to me about a month ago on how to handle that; here's that discussion so you might be able to draw from it, from May 29th, note he's got two different approaches to the problem.
-Mary
I think I was too focused on SQL. The two CREATE TABLE statements (to
aggregate, then impute the zeros) can be replaced with just:
proc summary data=long completetypes nway;
class q response;
output out=withzeroes(rename=(_freq_=n) drop=_type_);
run;
And, if the end result is for presentation rather than subsequent
processing, the CREATE VIEW statement and the PROC TRANSPOSE which follows
can be replaced with:
proc tabulate data=withzeroes noseps formchar="|-+-+|+|+-+";
class q response;
var n;
table response
,
q='' * n='' * (sum='N' pctsum<response>='PctN') * f=best6.;
run;
Result:
+----------------------------------------------------------------+
| | Q1 | Q2 | Q3 |
| |-------------+-------------+-------------|
| | N | PctN | N | PctN | N | PctN |
|----------------------+------+------+------+------+------+------|
|Response | | | | | | |
|1 | 2| 20| 0| 0| 3| 30|
|2 | 0| 0| 4| 40| 3| 30|
|3 | 2| 20| 1| 10| 1| 10|
|4 | 4| 40| 3| 30| 2| 20|
|5 | 2| 20| 2| 20| 1| 10|
+----------------------------------------------------------------+
In other words, once the given table is normalized, the task can be
completed with a fairly simple PROC SUMMARY and a fairly simple PROC TABULATE.
On Tue, 3 Jun 2008 22:57:04 -0400, Howard Schreier <hs AT dc-sug DOT org>
<schreier.junk.mail@GMAIL.COM> wrote:
>
>If I understand, you start with a data set which can be modeled by:
>
> data have(drop=qnum);
> do ID = 100 to 109;
> array qq(*) Q1-Q3;
> do qnum = 1 to dim(qq);
> qq(qnum) = ceil( 5 * ranuni(123) );
> end;
> output;
> end;
> run;
>
>Result:
>
> Obs ID Q1 Q2 Q3
>
> 1 100 4 2 1
> 2 101 5 2 2
> 3 102 4 2 1
> 4 103 1 4 3
> 5 104 5 2 4
> 6 105 3 3 5
> 7 106 1 5 4
> 8 107 4 4 2
> 9 108 3 5 1
> 10 109 4 4 2
>
>You need this summary:
>
> Obs Response Q1_N Q1_Pct Q2_N Q2_Pct Q3_N Q3_Pct
>
> 1 1 2 20 0 0 3 30
> 2 2 0 0 4 40 3 30
> 3 3 2 20 1 10 1 10
> 4 4 4 40 3 30 2 20
> 5 5 2 20 2 20 1 10
>
>I would first normalize the given data:
>
> proc transpose data=have out=long(rename=(_name_=Q col1=Response) );
> by id;
> var q : ;
> run;
>
>Result:
>
> Obs ID Q Response
>
> 1 100 Q1 4
> 2 100 Q2 2
> 3 100 Q3 1
> 4 101 Q1 5
> 5 101 Q2 2
> 6 101 Q3 2
> 7 102 Q1 4
> 8 102 Q2 2
> 9 102 Q3 1
> 10 103 Q1 1
> 11 103 Q2 4
> 12 103 Q3 3
> 13 104 Q1 5
> 14 104 Q2 2
> 15 104 Q3 4
> 16 105 Q1 3
> 17 105 Q2 3
> 18 105 Q3 5
> 19 106 Q1 1
> 20 106 Q2 5
> 21 106 Q3 4
> 22 107 Q1 4
> 23 107 Q2 4
> 24 107 Q3 2
> 25 108 Q1 3
> 26 108 Q2 5
> 27 108 Q3 1
> 28 109 Q1 4
> 29 109 Q2 4
> 30 109 Q3 2
>
>Next, aggregate:
>
> proc sql;
>
> create table aggregated as
> select q, response, count(*) as N
> from long
> group by q, response;
>
>Result:
>
> Obs Q Response N
>
> 1 Q1 1 2
> 2 Q1 3 2
> 3 Q1 4 4
> 4 Q1 5 2
> 5 Q2 2 4
> 6 Q2 3 1
> 7 Q2 4 3
> 8 Q2 5 2
> 9 Q3 1 3
> 10 Q3 2 3
> 11 Q3 3 1
> 12 Q3 4 2
> 13 Q3 5 1
>
>I find in every project there is inevitably some little semi-critical detail
>that ends up requiring the most complicated code. In this case it is the two
>non-existent intersections in this table. You probably want zeros for them
>in your end result, not missing values. So, impute those zeroes:
>
> create table withzeroes as
> select q, response, coalesce(n,0) as N
> from aggregated
> natural right join
> ( select * from (select distinct q from aggregated)
> cross join
> (select distinct response from aggregated)
> );
>
>Result:
>
> Obs Q Response N
>
> 1 Q1 1 2
> 2 Q1 2 0
> 3 Q1 3 2
> 4 Q1 4 4
> 5 Q1 5 2
> 6 Q2 1 0
> 7 Q2 2 4
> 8 Q2 3 1
> 9 Q2 4 3
> 10 Q2 5 2
> 11 Q3 1 3
> 12 Q3 2 3
> 13 Q3 3 1
> 14 Q3 4 2
> 15 Q3 5 1
>
>Next, expand the table by computing the percents and package everything for
>PROC TRANSPOSE:
>
> create view fortranspose as
> select response, catt(q,'_N') as Stat, n as Value
> from withzeroes
> union all
> select response, catt(q,'_Pct'), 100 * n / sum(n)
> from withzeroes
> group by q
> order by response, stat;
> quit;
>
>Result:
>
> Obs Response Stat Value
>
> 1 1 Q1_N 2
> 2 1 Q1_Pct 20
> 3 1 Q2_N 0
> 4 1 Q2_Pct 0
> 5 1 Q3_N 3
> 6 1 Q3_Pct 30
> 7 2 Q1_N 0
> 8 2 Q1_Pct 0
> 9 2 Q2_N 4
> 10 2 Q2_Pct 40
> 11 2 Q3_N 3
> 12 2 Q3_Pct 30
> 13 3 Q1_N 2
> 14 3 Q1_Pct 20
> 15 3 Q2_N 1
> 16 3 Q2_Pct 10
> 17 3 Q3_N 1
> 18 3 Q3_Pct 10
> 19 4 Q1_N 4
> 20 4 Q1_Pct 40
> 21 4 Q2_N 3
> 22 4 Q2_Pct 30
> 23 4 Q3_N 2
> 24 4 Q3_Pct 20
> 25 5 Q1_N 2
> 26 5 Q1_Pct 20
> 27 5 Q2_N 2
> 28 5 Q2_Pct 20
> 29 5 Q3_N 1
> 30 5 Q3_Pct 10
>
>Finally, re-shape:
>
> proc transpose data=fortranspose out=want(drop=_name_);
> by response;
> id stat;
> var value;
> run;
>
>Result:
>
> Obs Response Q1_N Q1_Pct Q2_N Q2_Pct Q3_N Q3_Pct
>
> 1 1 2 20 0 0 3 30
> 2 2 0 0 4 40 3 30
> 3 3 2 20 1 10 1 10
> 4 4 4 40 3 30 2 20
> 5 5 2 20 2 20 1 10
>
>The above is a series of relatively simple steps generating intermediate
>results, but developing them depends on knowing the overall requirement.
>
----- Original Message -----
From: Jerry
To: SAS-L@LISTSERV.UGA.EDU
Sent: Monday, July 07, 2008 2:59 PM
Subject: PROC FREQ AND TRANSPOSE
Hello all,
When I tried to count the number of severity of AE by perfered term
using Proc freq, the results do not give those level with missing
count.
perfer term severity count
Dizziness mild 2
Dizziness severe
Dizziness
Headache
Lethargy
Lethargy
Somholence
|