Date: Thu, 2 Oct 1997 13:00:20 -0400
Reply-To: HERMANS1 <hermans1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: HERMANS1 <hermans1@WESTAT.COM>
Subject: Re: Structured sample
If you want, in effect, the Cartesian product of the domains of categorical
column variables, try the SQL solution:
PROC SQL;
   CREATE TABLE <tablename> AS
   SELECT t1.c1, t2.c2, <etc>
   FROM (SELECT DISTINCT c1 FROM <dataset name>) AS t1,
        (SELECT DISTINCT c2 FROM <dataset name>) AS t2,
        ........<etc>
   ;
QUIT;
This SQL statement produces all possible combinations of the values of
c1, c2, <etc> that occur in <dataset name>. It amounts to a SQL join without a
WHERE clause constraint. It should produce a number of rows equal to the number
of distinct values in the domain of c1 times the number of distinct values in
the domain of c2, and so on for each remaining column.
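As a rough illustration of the row count, here is a minimal sketch on made-up data. The dataset name LOANS, the output name ALL_COMBOS, and the values are hypothetical. Only the column names note_type and payment_schedule come from the question. With 3 distinct values of note_type and 2 of payment_schedule, the join should return 3 x 2 = 6 rows, even though only 3 combinations actually occur in the data.

```sas
/* Hypothetical toy data; names and values are for illustration only */
data loans;
   input note_type $ payment_schedule $;
   datalines;
A monthly
A monthly
B annual
C monthly
;
run;

proc sql;
   /* Cross join of the distinct values of each column (no WHERE clause) */
   create table all_combos as
   select t1.note_type, t2.payment_schedule
   from (select distinct note_type from loans) as t1,
        (select distinct payment_schedule from loans) as t2
   ;
quit;

/* ALL_COMBOS should hold 6 rows: A, B, C each paired with annual and monthly */
proc print data=all_combos;
run;
```

SAS normally writes a note to the log that the join involves a Cartesian product and cannot be optimized. That is expected here.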
Alternatively, PROC MEANS (aka SUMMARY) with the NWAY option produces a
frequency count for each OBSERVED combination of the variables named in a CLASS
statement. I believe that you are looking for the Cartesian product instead.
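For comparison, a minimal sketch of the PROC SUMMARY route, again on made-up data. The dataset names LOANS and OBSERVED_COMBOS and the values are hypothetical. With no VAR statement, the output dataset carries one row per observed combination of the CLASS variables, with the count in the automatic variable _FREQ_.

```sas
/* Hypothetical toy data; names and values are for illustration only */
data loans;
   input note_type $ payment_schedule $;
   datalines;
A monthly
A monthly
B annual
C monthly
;
run;

/* NWAY keeps only the rows for the full crossing of the CLASS variables */
proc summary data=loans nway;
   class note_type payment_schedule;
   output out=observed_combos (drop=_type_);
run;

/* OBSERVED_COMBOS should hold 3 rows: A/monthly (_FREQ_=2), B/annual (1), C/monthly (1) */
proc print data=observed_combos;
run;
```

Combinations that never occur in the data, such as A with annual, do not appear in this output. That is the difference from the SQL join.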
Sig
_________________________ Reply Separator _________________________________
<Subject: Structured sample
<Author: Cal Faircloth <cfairclo@SPRYNET.COM> at Internet-E-Mail
<Date: 10/1/97 10:13 AM
<I have read the messages recently about getting statistical samples. My
<question is one in which I would like to make a data set of each
<possible combination of a series of variables. I am working in a bank
<and for debugging and testing I want a test data set that gives me all
<the possible combinations of note_type, accural_method,
<payment_schedule, etc. I don't care if I get 1 or 10,000 observations,
<only that each combination exists in the original data set.
<Thank you in advance.