Date: Wed, 31 Dec 2008 14:25:36 -0500
Reply-To: Ian Whitlock <iw1sas@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <iw1sas@GMAIL.COM>
Subject: Re: Output the Results of a Proc Freq To A SAS Dataset
Content-Type: text/plain; charset=ISO-8859-1
From: Jack Hamilton <jfh@STANFORDALUMNI.ORG>
Subject: Re: Output the Results of a Proc Freq To A SAS Dataset
Jack,
Nice answer.
Incidentally, comparison with your results pointed to two mistakes I
overlooked in my code.
1) assignment of format in data creation is incorrect
2) macro needs a LENGTH statment for NAME&I
Thanks.
Ian Whitlock
===============
Jack Hamilton <jfh@STANFORDALUMNI.ORG> wrote:
I haven't been following this discussion closely, so I may be
overlooking some important aspect of the problem, but why not just use
PROC TABULATE to get one output data set containing all the crosstabs?
=====
ods _all_ close;
proc tabulate data=abc missing out=all_out;
class plan cnt h_cmc
clm_dx_1_cd clm_dx_2_cd
h_ldlc_scrn h_ldlc_lt100;
table plan * cnt * n;
table plan * h_cmc * n;
table plan * h_cmc * clm_dx_1_cd * n;
table plan * h_cmc * clm_dx_2_cd * n;
table plan * H_LDLC_SCRN * H_LDLC_LT100 * n;
run;
ods listing;
=====
You can use the _TYPE_ variable to figure out which rows go with which
table statement.
On Tue, 30 Dec 2008 22:22:01 -0500, "Ian Whitlock" <iw1sas@GMAIL.COM>
said:
> Summary: ODS or OUT= for multi freqs? (with code for OUT=)
> #iw-value=1
>
> In response to my ranting (APOLOGIES to Mary) Andrew wrote in part:
>
> I also think we need to focus on a solution to the problem posed
in
> Gina's original post.
>
> ODS may offer something of a solution here. It's not perfect,
but
> it may save some time and steps versus a completely manual
> approach.
>
> with the code snippet
>
> ods listing close;
> ods output crosstabfreqs(persist=proc match_all) =table;
> proc freq data=elec;
> by region;
> tables office*(cesched serial trans);
> run;
> ods listing;
>
> producing data sets TABLE and TABLE1-TABLE30. (I feel kind of dirty
> when I have to work with the oddball TABLE and wonder at the
> programmer too lazy to either have TABLE0 or go to TABLE31.
> Moreover, it is a nightmare when you work interactively because the
> second time you execute you get TABLE32-TABLE62 and have to remember
> which is which.)
>
> To the point of combining the sets Andrew remarks:
>
> We now have 30 separate SAS data sets, each of which contains the
> PROC FREQ-generated output from a cross-tabulation we requested.
> These tables are created with a minimum of coding and if the
number
> of BY or TABLES statement variables changes, we don't have to do
> very much to obtain all the tables we want after adding a
variable
> or two to the PROC FREQ "step," as our project needs change.
>
> The difficulty is "lining up" which data set is associated with
> which requested cross-tab. Unfortunately, the programmer will
have
> to figure that out by, most likely, using PROC EYEBALL to look at
> each one. Additional data step programming may be necessary to
> modify and then append the tables together, if need be. But, at
> the very least, we have all the PROC FREQ-generated tables in SAS
> data sets for subsequent use.
>
> Yes, this is the crux of the problem - easy to make data sets, a
> nightmare to combine them. Andrew's solution leaves the problem to
> the end. Since I have put recent thought into the problem I decided
> to push it to a complete general solution (Mary left out cross tabs
> and Andrew left out one way freqs, and both omitted the hard part.)
>
> The basic problem is that PROC FREQ was never designed to carry all
> the information needed to combine tables. ODS makes the classic
> mistake of burying information in the variable names instead of
data.
> So the question is where and when to get the combining information.
I
> decided to make it a user's problem, like Andrew, but to put it up
> front where a macro can make use of it. Moreover, I decided to make
a
> macro for standardizing one OUT= data set at a time, so the code is
> compromise between what could be done if a managing macro collected
> required information from the user and used system information where
> possible to set up and run the PROC FREQ.
>
> To test the code I returned to the original specification with made
up
> data.
>
> proc format ;
> value yn 0 = "NO" 1 = "YES" ;
> run ;
>
> data abc ;
> do plan = 1 to 3 ;
> do obs = 1 to 20 * ranuni(123) ;
> cnt = round ( ranuni(123) ) ;
> h_cmc = round(ranuni(123)) ;
> clm_dx_1_cd = round ( ranuni(123) ) ;
> clm_dx_2_cd = round ( ranuni(123) ) ;
> H_LDLC_SCRN = round ( ranuni(123) ) ;
> H_LDLC_LT100 = substr("ABCDEF", 3*round(ranuni(123))+1, 3)
;
> output ;
> end ;
> end ;
> format hmc YN. ;
> run ;
>
> proc freq data=abc ;
> by plan;
> table cnt / out=f1 noprint missing ;
> table h_cmc / out=f2 noprint missing ;
> table h_cmc*clm_dx_1_cd / out=f3 noprint missing ;
> table h_cmc*clm_dx_2_cd / out=f4 noprint missing ;
> table H_LDLC_SCRN*H_LDLC_LT100 / out=f5 noprint missing ;
> run ;
>
> Note that both character and numeric data with formats have been
> considered. The 5 tables are standardized with
>
> %standard_freq(data = f1, tablenum = 1
> , vars = plan cnt
> , fmts = best12. best6.
> )
> %standard_freq(data = f2, tablenum = 2
> , vars = plan h_cmc
> , fmts = best12. yn6.
> )
> %standard_freq(data = f3, tablenum = 3
> , vars = plan h_cmc clm_dx_1_cd
> , fmts = best12. yn6. best8.
> )
> %standard_freq(data = f4, tablenum = 4
> , vars = plan h_cmc clm_dx_2_cd
> , fmts = best12. yn6. best8.
> )
> %standard_freq(data = f5, tablenum = 5
> , vars = plan H_LDLC_SCRN H_LDLC_LT100
> , fmts = best12. best6. $char8.
> )
>
> Moreover, with this set up it is easy to insure, by eyeball that
each
> column of values has a consistent width. The code is repetitive,
but
> it could be collected by parsing or generating the PROC FREQ and
using
> PROC CONTENTS information for data set ABC.
>
> The advantage of having this structure is that it doesn't care where
> or when the freq output sets were created, and it doesn't preclude
> writing a managing macro to make the calls to STANDARD_FREQ. It
would
> be interesting to see the corresponding macro for the ODS data sets.
> The one advantage that ODS has is that there is more information.
On
> the other hand, for me the cell counts are everything.
>
> Here is the macro - surprisingly short and simple once one puts the
> burden on the user to write the calls.
>
> %macro standard_freq ( data = f3, outsuffix = _c
> , tablenum = 1
> , vars = plan h_cmc clm_dx_1_cd
> , fmts = best12. $char12. best12.
> ) ;
> %local vcnt out;
> %let vcnt = %sysfunc(countw(&vars)) ;
> %let out = &data&outsuffix ;
>
> data &out ( keep = name: v: count ) ;
> retain tablenum &tablenum ;
> set &data ;
> %do i = 1 %to &vcnt ;
> name&i = "%scan(&vars,&i)" ;
> v&i = put(%scan(&vars,&i),%scan(&fmts,&i,%str( )) -L) ;
> %end ;
> run ;
> %mend standard_freq ;
>
> --
> Ian Whitlock
--
Jack Hamilton
Sacramento, California
jfh@alumni.stanford.org <== Use this, not jfh @ stanfordalumni.org
|