Date: Tue, 12 Nov 2002 13:41:11 -0500
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: SQL, join or union or...
(case                                     |
  when one.keyvar is null then two.keyvar } = { coalesce(one.keyvar,two.keyvar)
  else one.keyvar                         |
end) as keyvar                            |
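As a rough sketch of how that substitution would slot into the full query quoted below (dataset, alias, and output names are taken from the quoted post; nothing else is assumed): coalesce returns its first non-missing argument, so it yields one.keyvar when the row matched in ds1 and two.keyvar otherwise. PROC SQL expects both arguments to be of the same type.

```sas
proc sql;
  create table comlist1 as
  /* coalesce replaces the first CASE expression from the quoted query */
  select coalesce(one.keyvar, two.keyvar) as keyvar,
         (case
            when one.keyvar is null then 2   /* key found in ds2 only */
            when two.keyvar is null then 1   /* key found in ds1 only */
            else 3                           /* key found in both     */
          end) as membership
  from (select distinct keyvar from ds1) as one
       full join
       (select distinct keyvar from ds2) as two
  on one.keyvar = two.keyvar;
quit;
```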
-----Original Message-----
From: David Rubanowice [mailto:rubanowice.d@GHC.ORG]
Sent: Tuesday, November 12, 2002 11:04 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: SQL, join or union or...
proc sql;
  create table comlist1 as
  select (case
            when one.keyvar is null then two.keyvar
            else one.keyvar
          end) as keyvar,
         (case
            when one.keyvar is null then 2
            when two.keyvar is null then 1
            else 3
          end) as membership
  from (select distinct keyvar from ds1) as one
       full join
       (select distinct keyvar from ds2) as two
  on one.keyvar = two.keyvar;
quit;
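To try the query above on something concrete, here is a small pair of made-up input datasets. The key values A, B, and C are invented for illustration and do not come from the thread. With these inputs the full join should produce three rows: A with membership 1, B with membership 3, and C with membership 2, in no guaranteed order.

```sas
/* Hypothetical test data: the keyvar values are made up for illustration */
data ds1;
  input keyvar $;
  datalines;
A
B
B
;
run;

data ds2;
  input keyvar $;
  datalines;
B
C
;
run;
```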
>>> Talbot Katz <TopKatz@MSN.COM> 11/08/02 01:48PM >>>
Hey, gang.
Must be time for another one of my stupid sql stumpers (stumping me -- not
you!). I want a combined list of unique keys from two different files.
That's easy enough to do with a union --
proc sql;
  create table comlist1 as
  select * from
    (select distinct keyvar from ds1)
    union corr
    (select distinct keyvar from ds2);
quit;
but, of course, I'm never satisfied with something quite so simple. I want
to add a membership flag -- 1 if keyvar is in ds1 only, 2 if keyvar is in
ds2 only, 3 if keyvar is in both datasets. I do this frequently with data
step merges as follows:
proc sort data = ds1 (keep = keyvar) out = ds1k nodupkey;
  by keyvar;
run;
proc sort data = ds2 (keep = keyvar) out = ds2k nodupkey;
  by keyvar;
run;
data comlist2;
  merge ds1k (in = in1) ds2k (in = in2);
  by keyvar;
  keep keyvar membership;
  if in1 then do;
    if in2 then do;
      membership = 3;
    end;
    else do;
      membership = 1;
    end;
  end;
  else if in2 then do;
    membership = 2;
  end;
run;
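The nested if/else logic in the merge above can be written more compactly, because the in= variables are numeric 0/1 flags. This is only a sketch of the same step, assuming ds1k and ds2k are the sorted, de-duplicated datasets produced by the two proc sort steps:

```sas
data comlist2;
  merge ds1k (in = in1) ds2k (in = in2);
  by keyvar;
  /* in1 and in2 are 0 or 1, so the sum is 1 (ds1 only), 2 (ds2 only), or 3 (both) */
  membership = in1 + 2 * in2;
  keep keyvar membership;
run;
```

Every row coming out of the merge has at least one of the two flags set, so membership is never 0.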
I have a way of doing this with proc sql, but it's extremely clunky. To
begin with, it requires that keyvar be a fixed-length character variable.
Then, you'll see that it concatenates the keyvar values from the two files
and lops off one of them. (If the keyvar is numeric or not fixed length,
sometimes I can force it to behave.)
* &lk holds the fixed length of keyvar ;
proc sql;
  reset noprint;
  create table comlist3 as
  select distinct substr(compress(ds1.keyvar || ds2.keyvar),1,&lk.) as keyvar,
         sum(ds1.membership,ds2.membership) as membership
  from
    (select distinct keyvar, 1 as membership from ds1k) ds1
    full join
    (select distinct keyvar, 2 as membership from ds2k) ds2
  on ds1.keyvar = ds2.keyvar;
quit;
It seems to me there should be a way of doing this where the keyvar value
is the value from whichever data set it comes from on records which don't
match, and the matched value on records that do match.
Thanks!
-- TMK --
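
For comparison, one more way to get the key from whichever dataset supplies it is to skip the join altogether: stack the two distinct key lists with a source code and sum the codes per key. This technique is not proposed anywhere in the thread. It is a sketch only, and the names comlist4, src, and stacked are hypothetical.

```sas
proc sql;
  create table comlist4 as
  /* src is 1 for ds1 and 2 for ds2; each key appears at most once per source, */
  /* so the per-key sum is 1 (ds1 only), 2 (ds2 only), or 3 (both)             */
  select keyvar, sum(src) as membership
  from (select distinct keyvar, 1 as src from ds1
        union all
        select distinct keyvar, 2 as src from ds2) as stacked
  group by keyvar;
quit;
```

Because the key is never concatenated or truncated, this form does not depend on keyvar being a fixed-length character variable.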