Date: Mon, 30 Jul 2001 22:50:31 +0100
Reply-To: Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Organization: Crawford Software Consultancy Limited
Subject: Re: Eliminate useless vars using proc sql ??
Peetie Wheatstraw <peetie_wheatstraw@HOTMAIL.COM> Writes
>Re: Eliminate useless vars using proc sql ??
>
>Hi,
>
>Ever been handed a large flat-file or SAS dataset
>for which you had strong priors that, say,
>30% of the variables stored only missing data?
>
>I'm wondering if it might be practical to use
>proc sql to eliminate variables with no
>non-missing data from a SAS dataset.
>
>I've seen it done with data steps and a
>generated drop stmt. Wasn't pretty.
Without being based on proc sql here is a high performance solutions
(and it doesn't need a data step either !)
..
The code is very concise because I broke the style rule
of only one statement per line
just because it is easier to read
I offer open code, but it should be easy to macro-ise >>>>
%let have = csa.cpxfct ;
%let want = csa.cpxfctnn ;
option _last_ = &have ;
proc means noprint ; output; run;
*discover min and max for all numeric vars. ;
proc transpose name = _stat_; id _stat_; run;
*organise as one obs for each numeric var. ;
proc sql noprint;
* OK only a little sql
>> to build missing variables list;
select _stat_ into :dropList separated by ' '
from _last_
where min = max and max = . ;
drop table &want;
quit;
proc append base= &want data = &have( drop=&dropList ); run;
/* this last step will keep any indexes which remain valid
after entirely missing columns are dropped!*/
>>>>>>>>>the log is less concise...
60 %let have = csa.cpxfct ;
61 %let want = csa.cpxfctnn ;
62 option _last_ = &have ;
63 proc means noprint ; output; run;
NOTE: There were 867 observations read from the data set CSA.CPXFCT.
NOTE: The data set WORK.DATA10 has 5 observations and 21 variables.
NOTE: PROCEDURE MEANS used:
real time 0.05 seconds
64 proc transpose name = _stat_; id _stat_; run;
NOTE: There were 5 observations read from the data set WORK.DATA10.
NOTE: The data set WORK.DATA11 has 20 observations and 6 variables.
NOTE: PROCEDURE TRANSPOSE used:
real time 0.00 seconds
65 proc sql noprint;
65 ! * OK only a little sql ;
66 select _stat_ into :dropList separated by ' '
67 from _last_
68 where min = max and max = . ;
69 drop table &want;
NOTE: Table CSA.CPXFCTNN has been dropped.
70 quit;
NOTE: PROCEDURE SQL used:
real time 0.05 seconds
71 proc append base= &want data = &have( drop=&dropList ); run;
SYMBOLGEN: Macro variable WANT resolves to csa.cpxfctnn
SYMBOLGEN: Macro variable HAVE resolves to csa.cpxfct
SYMBOLGEN: Macro variable DROPLIST resolves to OURSHARE FEESSTRT
FIXEND FIXTERM FRFDATE MATURITY FRFSCORE FEES LIMITCA
OUTEFF EXCESS
NOTE: Appending CSA.CPXFCT to CSA.CPXFCTNN.
NOTE: BASE data set does not exist.
DATA file is being copied to BASE file.
NOTE: Composite index T5FACKEY has been defined.
NOTE: There were 867 observations read from the data set CSA.CPXFCT.
NOTE: The data set CSA.CPXFCTNN has 867 observations and 25 variables.
NOTE: PROCEDURE APPEND used:
real time 0.04 seconds
Peetie Wheatstraw <peetie_wheatstraw@HOTMAIL.COM> Writes
>Re: Eliminate useless vars using proc sql ??
>
>Hi,
>
>Ever been handed a large flat-file or SAS dataset
>for which you had strong priors that, say,
>30% of the variables stored only missing data?
>
>I'm wondering if it might be practical to use
>proc sql to eliminate variables with no
>non-missing data from a SAS dataset.
>
>I've seen it done with data steps and a
>generated drop stmt. Wasn't pretty.
>
>As a micro-example, say your data looked like:
>
> id x1 x2 x3 x4
>
> 1 1.80482 . 0.39982 .
> 2 1.44749 . 0.96928 .
> 3 -1.08332 . 0.04979 .
> 4 0.98214 . 0.52387 .
> 5 0.51366 . 0.95702 .
> 6 -0.22058 . 0.68993 .
> 7 0.03189 . 0.68824 .
> 8 -1.24130 . 0.28723 .
> 9 0.68500 . 0.63452 .
> 10 -0.89153 . 0.37701 .
>
>And you _wanted_ a fully programmatic method to
>make it look like:
>
>
> id x1 x3
>
> 1 1.80482 0.39982
> 2 1.44749 0.96928
> 3 -1.08332 0.04979
> 4 0.98214 0.52387
> 5 0.51366 0.95702
> 6 -0.22058 0.68993
> 7 0.03189 0.68824
> 8 -1.24130 0.28723
> 9 0.68500 0.63452
> 10 -0.89153 0.37701
>
>Any/all suggestions welcomed.
>
> TIA,
> Peetie
>
>_________________________________________________________________
>Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp
--
Peter Crawford