LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (July 2001, week 5)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 30 Jul 2001 22:50:31 +0100
Reply-To:     Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Organization: Crawford Software Consultancy Limited
Subject:      Re: Eliminate useless vars using proc sql ??

Peetie Wheatstraw <peetie_wheatstraw@HOTMAIL.COM> Writes >Re: Eliminate useless vars using proc sql ?? > >Hi, > >Ever been handed a large flat-file or SAS dataset >for which you had strong priors that, say, >30% of the variables stored only missing data? > >I'm wondering if it might be practical to use >proc sql to eliminate variables with no >non-missing data from a SAS dataset. > >I've seen it done with data steps and a >generated drop stmt. Wasn't pretty.

Without being based on proc sql here is a high performance solutions (and it doesn't need a data step either !)

.. The code is very concise because I broke the style rule of only one statement per line just because it is easier to read I offer open code, but it should be easy to macro-ise >>>>

%let have = csa.cpxfct ; %let want = csa.cpxfctnn ; option _last_ = &have ; proc means noprint ; output; run; *discover min and max for all numeric vars. ; proc transpose name = _stat_; id _stat_; run; *organise as one obs for each numeric var. ; proc sql noprint; * OK only a little sql >> to build missing variables list; select _stat_ into :dropList separated by ' ' from _last_ where min = max and max = . ; drop table &want; quit; proc append base= &want data = &have( drop=&dropList ); run; /* this last step will keep any indexes which remain valid after entirely missing columns are dropped!*/

>>>>>>>>>the log is less concise... 60 %let have = csa.cpxfct ; 61 %let want = csa.cpxfctnn ; 62 option _last_ = &have ; 63 proc means noprint ; output; run;

NOTE: There were 867 observations read from the data set CSA.CPXFCT. NOTE: The data set WORK.DATA10 has 5 observations and 21 variables. NOTE: PROCEDURE MEANS used: real time 0.05 seconds

64 proc transpose name = _stat_; id _stat_; run;

NOTE: There were 5 observations read from the data set WORK.DATA10. NOTE: The data set WORK.DATA11 has 20 observations and 6 variables. NOTE: PROCEDURE TRANSPOSE used: real time 0.00 seconds

65 proc sql noprint; 65 ! * OK only a little sql ; 66 select _stat_ into :dropList separated by ' ' 67 from _last_ 68 where min = max and max = . ; 69 drop table &want; NOTE: Table CSA.CPXFCTNN has been dropped. 70 quit; NOTE: PROCEDURE SQL used: real time 0.05 seconds

71 proc append base= &want data = &have( drop=&dropList ); run; SYMBOLGEN: Macro variable WANT resolves to csa.cpxfctnn SYMBOLGEN: Macro variable HAVE resolves to csa.cpxfct SYMBOLGEN: Macro variable DROPLIST resolves to OURSHARE FEESSTRT FIXEND FIXTERM FRFDATE MATURITY FRFSCORE FEES LIMITCA OUTEFF EXCESS

NOTE: Appending CSA.CPXFCT to CSA.CPXFCTNN. NOTE: BASE data set does not exist. DATA file is being copied to BASE file. NOTE: Composite index T5FACKEY has been defined. NOTE: There were 867 observations read from the data set CSA.CPXFCT. NOTE: The data set CSA.CPXFCTNN has 867 observations and 25 variables. NOTE: PROCEDURE APPEND used: real time 0.04 seconds

Peetie Wheatstraw <peetie_wheatstraw@HOTMAIL.COM> Writes >Re: Eliminate useless vars using proc sql ?? > >Hi, > >Ever been handed a large flat-file or SAS dataset >for which you had strong priors that, say, >30% of the variables stored only missing data? > >I'm wondering if it might be practical to use >proc sql to eliminate variables with no >non-missing data from a SAS dataset. > >I've seen it done with data steps and a >generated drop stmt. Wasn't pretty. > >As a micro-example, say your data looked like: > > id x1 x2 x3 x4 > > 1 1.80482 . 0.39982 . > 2 1.44749 . 0.96928 . > 3 -1.08332 . 0.04979 . > 4 0.98214 . 0.52387 . > 5 0.51366 . 0.95702 . > 6 -0.22058 . 0.68993 . > 7 0.03189 . 0.68824 . > 8 -1.24130 . 0.28723 . > 9 0.68500 . 0.63452 . > 10 -0.89153 . 0.37701 . > >And you _wanted_ a fully programmatic method to >make it look like: > > > id x1 x3 > > 1 1.80482 0.39982 > 2 1.44749 0.96928 > 3 -1.08332 0.04979 > 4 0.98214 0.52387 > 5 0.51366 0.95702 > 6 -0.22058 0.68993 > 7 0.03189 0.68824 > 8 -1.24130 0.28723 > 9 0.68500 0.63452 > 10 -0.89153 0.37701 > >Any/all suggestions welcomed. > > TIA, > Peetie > >_________________________________________________________________ >Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp

-- Peter Crawford


Back to: Top of message | Previous page | Main SAS-L page