Date: Wed, 15 Mar 2000 10:17:42 -0500
Reply-To: Mark.K.Moran@CCMAIL.CENSUS.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mark Moran <Mark.K.Moran@CCMAIL.CENSUS.GOV>
Subject: Re: Sized Random Selection of Variables
Content-type: text/plain; charset=us-ascii
Thank you Ian Whitlock, Shiling Zhang, John Whittington, and Paul Dorfman
(I hope I haven't left anyone out). Now I have so many options I can
luxuriate in all my choices before I decide what to do with them! It is
refreshing to have this kind of collaboration available at one's fingertips.
Mark Moran
WHITLOI1 <WHITLOI1@WESTAT.COM> on 03/15/2000 08:24:48 AM
Please respond to WHITLOI1 <WHITLOI1@WESTAT.COM>
To: SAS-L@LISTSERV.VT.EDU
cc: (bcc: Mark K Moran/CSD/HQ/BOC)
Subject: Re: Sized Random Selection of Variables
Subject: Sized Random Selection of Variables
Summary: General code is little harder than specific code.
Respondent: Ian Whitlock <whitloi1@westat.com>
Mark Moran <Mark.K.Moran@CCMAIL.CENSUS.GOV> wants to make a random selection
of (presumably numeric) variables from one data set, taking as many variables
as there are in another data set.
For example, given
/* test data */
data std ;
   retain z1-z700 1 ;
run ;

data q2 ;
   retain a1 - a150 2 ;
run ;
Randomly select 150 variables from STD because there are 150
variables in Q2. This must be done for a number of different Q2-type
data sets with varying numbers of variables.
%macro selvars ( main  = main ,    /* std data set */
                 data  = q ,       /* current set to match */
                 match = match     /* desired output set */
               ) ;
   /* select #vars in &data from &main
      and subset &main to &match
   */
   %local nmlist ;

   data vars ( keep = name ) ;
      length name $ 32 ;

      /* set up array from main */
      set &main ( obs = 1 ) ;
      array v (*) _numeric_ ;

      /* get number of vars in &data */
      dsid = open ( "&data" ) ;
      if dsid > 0 then
      do ;
         nv = attrn ( dsid , "nvar" ) ;
         put nv = ;
         dsid = close ( dsid ) ;
      end ;

      /* make random selection of variables from v */
      dimv = dim ( v ) ;
      do i = 1 to dim ( v ) ;
         if ranuni ( 0 ) < nv / dimv then
         do ;
            call vname ( v(i) , name ) ;
            output ;
            nv + -1 ;
         end ;
         dimv + -1 ;
      end ;
   run ;

   /* convert to macro variable */
   proc sql noprint ;
      select trim(name) into :nmlist separated by " "
         from vars ;
   quit ;

   /* make corresponding subset */
   data &match ;
      set &main ( keep = &nmlist ) ;
   run ;
%mend selvars ;

%selvars ( main = std , data = q2 , match = q_std )
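
The selection loop in the VARS step may be easier to follow in isolation.
Each item is kept with probability (number still needed) / (number not yet
examined), so the loop ends with exactly the requested count, and every
subset of that size should be equally likely. The sketch below is only an
illustration: the data set PICK, the variables NEED, REMAIN and TOTAL, and
the values 3 and 10 are made up for the example and are not part of the
macro.

```sas
/* sketch: keep exactly NEED of TOTAL items in one sequential pass */
data pick ( keep = item ) ;
   total  = 10 ;      /* hypothetical: number of candidate items   */
   need   = 3 ;       /* hypothetical: items still to be selected  */
   remain = total ;   /* candidates not yet examined               */
   do item = 1 to total ;
      /* keep this item with probability need / remain */
      if ranuni ( 0 ) < need / remain then
      do ;
         output ;
         need + -1 ;   /* one fewer still needed */
      end ;
      remain + -1 ;    /* one fewer left to examine */
   end ;
run ;

proc print data = pick ;
run ;
```

When NEED equals REMAIN the ratio is 1 and every remaining item is taken;
when NEED reaches 0 the ratio is 0 and nothing further is taken. That is why
no second pass or count check is needed.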
I chose to make a corresponding subset, but in fact it is probably
enough to just have a macro variable with this list of variables. If
that is the case, then make NMLIST global and drop the last step.
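
A minimal sketch of that variant follows, assuming the test data sets STD
and Q2 from above already exist. The macro name SELNAMES is hypothetical,
and the %LET that clears NMLIST is an added precaution, since PROC SQL
leaves an INTO variable unchanged when no rows are selected.

```sas
%macro selnames ( main = main ,   /* std data set */
                  data = q        /* current set to match */
                ) ;
   /* hypothetical variant of SELVARS:
      leave the chosen names in global NMLIST, make no subset */
   %global nmlist ;
   %let nmlist = ;

   data vars ( keep = name ) ;
      length name $ 32 ;
      set &main ( obs = 1 ) ;
      array v (*) _numeric_ ;

      /* get number of vars in &data */
      dsid = open ( "&data" ) ;
      if dsid > 0 then
      do ;
         nv = attrn ( dsid , "nvar" ) ;
         dsid = close ( dsid ) ;
      end ;

      /* same sequential selection as in SELVARS */
      dimv = dim ( v ) ;
      do i = 1 to dim ( v ) ;
         if ranuni ( 0 ) < nv / dimv then
         do ;
            call vname ( v(i) , name ) ;
            output ;
            nv + -1 ;
         end ;
         dimv + -1 ;
      end ;
   run ;

   proc sql noprint ;
      select trim(name) into :nmlist separated by " "
         from vars ;
   quit ;
%mend selnames ;

%selnames ( main = std , data = q2 )

/* the list is now available to any later step */
data q_std ;
   set std ( keep = &nmlist ) ;
run ;
```

Because NMLIST is global here, a second call overwrites the list from the
first, so copy it to another macro variable if more than one selection must
be kept at a time.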
Ian Whitlock <whitloi1@westat.com>