Date: Mon, 14 Jul 2008 16:06:55 -0500
Reply-To: Warren Schlechte <Warren.Schlechte@TPWD.STATE.TX.US>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Warren Schlechte <Warren.Schlechte@TPWD.STATE.TX.US>
Subject: Proc SurveySelect - Parsing a Dataset into Multiple Pieces
Content-Type: text/plain; charset="us-ascii"
Below is some code I have produced to split a large dataset into "n"
distinct subsets, each of which I am randomly assigning to a treatment
(so I have "n" treatments). Note that each subset holds &ss.
observations from Input_Data, and each observation from Input_Data
can belong to one and only one of the subsets.
I thought I should be able to do this in one pass through SurveySelect,
but cannot figure out how to, if indeed it can be done.
I have coded variations on this using random numbers, sorts, and
data steps, so I'm not looking for an alternative data step method.
What I want to know is whether this can be done simply, with a single
pass through the data, using Proc SurveySelect or another procedure.
David Cassell suggests I shouldn't be loopy, and yet I cannot see how
I can achieve what I want to do without being loopy.
%macro loop;
data hold; set Input_Data; run; /* Copy Input_Data to working dataset "hold" */
/* Loop to extract &n. datasets, each time pulling &ss. observations
(&n. and &ss. are macro variables defined before the macro is called) */
%do i2 = 1 %to &n.;
proc surveyselect data=hold seed=12 sampsize=&ss. method=srs
out=a_&i2.out;
run;
/* Compare base dataset with extracted dataset. Delete extracted
observations from base. */
/* This forms the new base for the next loop iteration */
proc sql;
create table temp as
select * from hold
except select * from a_&i2.out;
quit;
data hold; set temp; run;
%end;
%mend;
%loop;
Thanks,
Warren Schlechte
HOH Fisheries Science Center
5103 Junction Hwy
Mt. Home, TX 78058
Phone 830.866.3356 x.214
Fax 830.866.3549