| Date: | Sun, 7 Jan 2007 14:07:18 -0500 |
| Reply-To: | Peter Crawford <peter.crawford@BLUEYONDER.CO.UK> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Peter Crawford <peter.crawford@BLUEYONDER.CO.UK> |
| Subject: | Re: Drop with a variable name pattern? |
|
On Fri, 5 Jan 2007 23:35:58 -0800, David L Cassell <davidlcassell@MSN.COM>
wrote:
>swovcc@HOTMAIL.COM sagely replied:
>>
>>On Fri, 5 Jan 2007 15:18:03 -0500, souga soga <souga1234@GMAIL.COM> wrote:
>>
>> >Hi ,
>> >
>> >The following syntax would drop all variables that begin with X from a
>> >dataset
>> >DROP X:;
>> >How do you drop a variable from a dataset that END with X ?
>> >
>> >
>> >Thanks,
>> >Sa
>>
>>This is a good opportunity to query the meta data.
>>
>>data test ;
>> array foo (6) aaa sbx sby fkax wkscx kuyd (6*1);
>> array bar (8) $ kshnbx kkst mcvy jkx ksoix nbdtk msusax lllwsa (8*"b");
>>run ;
>>
>>proc sql noprint ;
>> select name into : droplist separated by " "
>> from dictionary.columns
>> where libname = "WORK" and
>> memname = "TEST" and
>> substr(upcase(name),(length(trim(name)))) = "X" ;
>>quit ;
>>
>>%put &droplist ;
>>
>>data afterdrop ;
>> set test ;
>> drop &droplist ;
>>run ;
>>
>>options nocenter ;
>>proc print data = afterdrop ;
>>run ;
>>
>>Obs aaa sby kuyd kkst mcvy nbdtk lllwsa
>>
>> 1 1 1 1 b b b b
>>
>>Venky Chakravarthy
>
>Nice.
>
>Or, instead of:
>
>> substr(upcase(name),(length(trim(name)))) = "X" ;
>
>one could use
>
> prxmatch('/X\s*$/i',name);
>
>
>But this only works in SAS 9.1 . Prior to 9.1, you cannot wedge
>the regex into the PRXMATCH function, so this won't fly in
>PROC SQL.
>
>HTH,
>David
>--
>David L. Cassell
>mathematical statistician
>Design Pathways
>3115 NW Norwood Pl.
>Corvallis OR 97330
>

I'm sure you all know this, but, just in case...
All this stuff about DROP-ing variables that have a common suffix
works from metadata.
If we're talking about a data step, then the problem extends further.
What if there is more than one contributing data set?
What about variables that are created in the step?
Wouldn't it be nice to have a run-time list of variables, as seen
by the compiler?
From this we would be able to filter out the variables with a
common suffix for this kind of DROP statement.
Well, we have one: it is the output data set itself (unreduced)!
Just run the data step without the DROP. Then build the required
list of variables to be on the DROP statement, and feed it back
into the code for the next run.
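
A minimal sketch of that two-pass idea, assuming two made-up input
data sets ONE and TWO and a made-up derived variable EX (none of
these names come from the thread). The dictionary query is the one
Venky posted, pointed at the unreduced output rather than at an input:

/* hypothetical inputs, for illustration only */
data one ; id = 1 ; ax = 1 ; b = 2 ; run ;
data two ; id = 1 ; cx = 3 ; d = 4 ; run ;

/* pass 1: run the step with no DROP, so FULL holds every variable */
data full ;
 merge one two ;
 by id ;
 ex = b + d ;   /* a variable created in the step */
run ;

/* collect the names in FULL that end with X */
proc sql noprint ;
 select name into : droplist separated by " "
 from dictionary.columns
 where libname = "WORK" and
 memname = "FULL" and
 upcase(substr(name, length(name))) = "X" ;
quit ;

%put &droplist ;

/* pass 2: the same step again, with the list fed back in */
data want ;
 merge one two ;
 by id ;
 ex = b + d ;
 drop &droplist ;
run ;

With these inputs WANT should keep only id, b and d. If no name
matches, the macro variable is never created, so real code would
need to guard against that case.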

If it is considered bad form to modify our code based on its
results, then, to be consistent, I argue that it is bad form to
DROP an undefinable list of variables.

Much better would be to place something like a keep= list (or,
if it is more relevant to the problem and short, a drop= list)
of variables on each contributing data set, at the point where
it contributes. That is not the point where the DROP statement
takes effect. These keep= and drop= lists are data set options.
If a DROP statement is used, "damage" may already be done, with
unwanted variables contributing run-time or compile-time
errors when, for example, the unwanted variables have
inconsistent data types.
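
To illustrate the difference, here is a small sketch. The data sets
ONE and TWO and their variables are invented for the example; AX is
deliberately numeric in one and character in the other:

/* hypothetical inputs with a type clash on AX */
data one ; id = 1 ; b = 2 ; ax = 1 ; run ;
data two ; id = 1 ; d = 4 ; ax = "a" ; run ;

/* output side: the DROP statement comes too late. Both copies of */
/* AX are still read in, so the step fails at compile time with   */
/* an error that AX is defined as both character and numeric.     */
data bad ;
 merge one two ;
 by id ;
 drop ax ;
run ;

/* input side: keep= stops AX from being read at all */
data good ;
 merge one (keep = id b)
 two (keep = id d) ;
 by id ;
run ;

The keep= lists also document exactly what each input is expected
to contribute, so a column added to ONE or TWO later cannot leak
into GOOD.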

So I emphasise the caution: the programmer should automate
only for the "input side" of the data step, and not the "output
side", i.e. do not automate a filter for the DROP statement.
This does not mean that I want to stop using DROP statements like
 DROP __X: ;
It means that I won't support generating variable lists from
input data sets to use on a DROP statement, because that is at
the output stage.
 DROP __X: ;
serves very well, because we know what will be affected. On
input data sets, we cannot be certain of _future_ contents.

Just my (labored) two cents...
Peter
|