Date: Thu, 9 Nov 2006 11:48:31 -0800
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
Subject: Re: find and drop empty variables
Content-Type: text/plain; charset="us-ascii"
The probility for a USUAL variable with all missing values is very
small. I think it is a good way to select a small sample from your
LARGE data set (gigbytes). You can quickly find out a small list of
variables with all missing value in your sample data. Then proc freq,
proc mean against whole data set for these variables ONLY. This will
save you a lot of time when you work on LARGE data set. Attached below
is a sample pgm to illustrate the idea and you may modify for your
array nv(*) n1-n30;
array cv(*) $ c1-c30;
do i = 1 to 1000000;
do j=1 to dim(nv);
do j=1 to dim(cv);
if cv(j)='0' then cv(j)=' ';
keep n: c:;
proc format library=work;
proc sql ;
into : cvarlist separated by ';'
where libname='WORK' and memname='T1' and type='char'
do i=1 to nobs by 1000;
set t1 point=i nobs=nobs;
proc means data=t1v noprint;
output out=varlist(drop=_freq_ _type_) n=;
array allv(*) _numeric_;
do i = 1 to dim(allv);
if substr(vname,1,1)='_' then vname=substr(vname,2);
if allv(i)=0 then put 'var with all missing value in 1/1000
sample --->' vname;
Alexander Norderwind wrote:
> I have the problem that a big database I import to SAS with proc sql and
> ODBC has about thousand variables (fields) and lots of them are empty all over.
> Thus I would like dropping them to get rid of this ballast.
> Therefore I first have to find out, which of them really are empty.
> Does anybody have a suggestion how to solve my problem?
> Would be nice!