Date: Thu, 27 Oct 2005 22:37:46 -0400
Reply-To: Chang Chung <chang_y_chung@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Chang Chung <chang_y_chung@HOTMAIL.COM>
Subject: Re: HELP WITH MACRO
On Thu, 27 Oct 2005 16:26:04 -0400, Kevin F. Spratt
<Kevin.F.Spratt@DARTMOUTH.EDU> wrote:
>Please provide some basic insight regarding this pitiful
>attempt to write a macro that will generate a number of random
>samples and then concatenate them into a single data set with an added variable
>that indicated the set number.
...
>%MACRO IMPUTESETS(DATAIN= DATAOUT= START= TOTAL= SIZE=
...
Hi, Kevin,
Once I have a hammer in my hands, everything looks like a nail head --
somehow problems look like chances to use the hash object in one way or
another. Since you are doing simple random sampling without replacement,
the hash is an easy way to implement it in a data step (as I do in the
selectSample: subroutine below): adding a duplicate key fails, so filling
the hash until it holds the sample size gives distinct observation
numbers. Or I feel that way, at least today. :-)
This way, you don't have to use a macro at all. Also, assuming the main
dataset fits in memory, the combination of sasfile and set with point=
should make this *very* quick to run.
I am sure David, Dale or others will find errors or problems in this code,
if any. :-) HTH.
Cheers,
Chang
/* load the main data file into memory for speed */
sasfile sashelp.shoes load;
/* output 1000 repetitions of a random sample
of size 50 */
%let SEED = 1234567;
data one;
  /* "globals" */
  nRep = 1000;
  nSize = 50;
  retain OK 0;
  retain obs selected .;
  dcl hash h;
  dcl hIter hi;
  /* main loop */
  do rID = 1 to nRep;
    link resetHash;
    link selectSample;
    link doOutput;
  end;
  stop;
  /* subroutines */
  resetHash:;
    /* if not missing(h) then h.delete(); */
    h = _new_ hash(ordered:'a');
    h.defineKey('obs');
    h.defineData('obs','selected');
    h.defineDone();
  return;
  selectSample:;
    do until(h.num_items=nSize);
      obs = ceil(nObs*(ranuni(&seed.)));
      selected = 1;
      if h.add()^=OK then _error_ = 0;
    end;
  return;
  doOutput:;
    hi = _new_ hIter('h');
    if hi.first()=OK then do until(hi.next()^=OK);
      set sashelp.shoes nObs=nObs point=obs;
      /* here you do creating and renaming vars */
      oldObs = obs; /* otherwise obs will be */
                    /* dropped automatically */
      keep rID oldObs region -- returns;
      output;
    end;
  return;
run;
/* on log
NOTE: The data set WORK.ONE has 50000 observations and 9 variables.
NOTE: DATA statement used (Total process time):
      real time           1.23 seconds
      user cpu time       0.95 seconds
      system cpu time     0.18 seconds
      Memory              24183k
*/
/* eyeball checks */
/* check1: each oldObs should appear
   about the same number of times */
proc freq data=one;
  tables oldObs/list missing;
  where oldObs <= 20;
run;
/* check2: across repetitions, the per-sample mean of a variable
   should be (roughly) normally distributed */
proc summary data=one noprint;
  class rID;
  var returns;
  ways 1;
  output out=summOne mean=avgReturns;
run;
proc univariate data=summOne;
  var avgReturns;
  qqplot / normal;
run;
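/* check3 (an added sketch, not part of the code I ran above):
   since the sampling is without replacement, each rID should
   hold exactly 50 rows and 50 distinct oldObs. The hard-coded
   50 assumes nSize is unchanged; the column names nRows and
   nDistinct are just illustrative. If all is well, the query
   should return no rows. */
proc sql;
  select rID,
         count(*) as nRows,
         count(distinct oldObs) as nDistinct
    from one
    group by rID
    having count(*) ne 50 or count(distinct oldObs) ne 50;
quit;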
/* unload the dataset and free the sasfile buffers */
sasfile sashelp.shoes close;
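/* An alternative sketch, not the hash approach above and not
   timed or tested here: if SAS/STAT is licensed, proc
   surveyselect can draw the repeated samples directly.
   method=srs is simple random sampling without replacement,
   and the output identifies each sample with a variable named
   Replicate. The output dataset name two is a placeholder. */
proc surveyselect data=sashelp.shoes out=two
    method=srs n=50 reps=1000 seed=1234567;
run;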