Date: Tue, 17 Dec 2002 20:17:33 -0500
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: new "clashvars" macro
Content-Type: text/plain; charset="iso-8859-1"
Let me admit at the onset that I have not studied Roland's solution enough
to understand the nature of the problem it purports to solve. That, of
course, won't deter me from proposing a SQL solution ;> :
%let libname=WORK; /* For example. */
proc sql noprint;
create table _clash as
select * from dictionary.columns
where libname="%upcase(&libname)" and memtype='DATA'
group by name,type,length,format,label
order by name, memname, type, length, format, label;
quit;
This program searches a SAS library (generally an OS path) for variables
that have the same name and other attributes (depending on the list of
column names in the GROUP BY clause). Note that the GROUP BY column list
does not include MEMNAME, variables duplicated across datasets appear in the
yield of this query, ordered by name and dataset name. By the rules of SAS
datasets, each class of variable,dataset appears only once.
You can probably use a variant of this query to do any number of comparisons
across variables. In fact, you missed getting that assignment for real by
jumping to Brown!
Looks like another SQLhead in the making: Ian, Paul, Puddin', and, now,
Quentin. Miracles abound! That makes my Christmas.
Sig
-----Original Message-----
From: Quentin and Amanda McMullen
[mailto:qandamcmullen@POP.MAIL.RCN.NET]
Sent: Monday, December 16, 2002 7:49 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: new "clashvars" macro
Hi Roland et al.,
I'm very much enjoying browsing your macros.
As a mostly data-step kind of guy, I can see the step-by-step logic in:
> proc sql noprint;
> create table _clash as
> select * from dictionary.columns
> where libname="%upcase(&libname)" and memtype='DATA'
> order by name, memname, type, length, format, label;
> quit;
>
> proc sort nodupkey data=_clash(keep=name type length format label)
> out=_clashbad;
> by name type length format label;
> run;
>
> data _clashbad;
> set _clashbad;
> by name;
> if last.name and not first.name then output;
> keep name;
> run;
>
> data _clash;
> merge _clashbad(in=_bad) _clash;
> by name;
> if _bad;
> run;
But since every once in a while I get tempted to try out some SQL just for
the fun of it, I'm wondering if folks think the following step might be a
reasonable replacement for the 4 steps above:
proc sql;
create table _clash as
select name, memname, type, length, format, label
from dictionary.columns
where libname="WORK" and memtype="DATA"
group by name
having min(type) ne max(type)
or min(length) ne max(length)
or min(format) ne max(format)
or min(label) ne max(label)
order by name, memname, type, length, format, label
;
quit;
Surely this is the sort of problem which must have a nice SQL
solution. Have I stumbled across it, or could someone illustrate one?
Kind Regards,
--Quentin
|