LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (May 2008, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 23 May 2008 11:58:18 -0400
Reply-To:     Mike Rhoads <RHOADSM1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mike Rhoads <RHOADSM1@WESTAT.COM>
Subject:      Re: Merging on either the ID or AltID
In-Reply-To:  <941871A13165C2418EC144ACB212BDB04E15D8@dshsmxoly1504g.dshs.wa.lcl>
Content-Type: text/plain; charset="us-ascii"

The only potential problem I can see with Dan's solution is if the 2nd data set has the same value for ID in one record as for ALTID in a second record. In that case, the ID will occur twice in the output. As an example, add the record below to the example data set:

002-00 002-01 XX

A slight modification merges two instances of the data set separately, using the COALESCE function to "prefer" the Allele4 value from the record where the value is in ID rather than ALTID:

create table want2 as select a.*, coalesce(b1.a4,b2.a4) from have1 as a left join have2 as b1 on a.ID = b1.id left join have2 as b2 on a.id = b2.alt_id ;

Mike Rhoads Westat RhoadsM1@Westat.com

-----Original Message----- From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Nordlund, Dan (DSHS/RDA) Sent: Thursday, May 22, 2008 5:08 PM To: SAS-L@LISTSERV.UGA.EDU Subject: RE: Merging on either the ID or AltID

> -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On > Behalf Of Mary > Sent: Thursday, May 22, 2008 1:53 PM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Merging on either the ID or AltID > > OK, new question! > > I have one data set that has one ID in it. Example, ID and 3 > variables (Allele results Allele1 Allele2 Allele3) > > 001-01 AA CC AC > 002-01 AC AA AC > 003-01 AA AC AC > 004-01 AA AA AA > > I have another data set that has both an ID and sometimes a > second Alternate ID in it (ID, ALTID, Allele4) I'm putting > XXX-XX to show the place but in my data they are just missing. > > 001-01 XXX-XX GT > 002-01 XXX-XX TT > 003-00 003-01 GT > > I would like to create a data set that has all the records > from the first data set, and merges on the records from the > second data set, but if it can't find a match on the primary > ID of the second data set, then I want to try to find a match > on the second ID. > > Thus I'd get additonal data for 001-01, 002-01, and also for > 003-01 (because it finds no primary ID matching, but finds > the alternate ID matching), but I wouldn't have any > additional data for 004-01. I do want all the data in the > first table even if it can't find any matching data in the > second table (i.e., a left outer join). > > So I'd like to get: > > 001-01 AA CC AC GT > 002-01 AC AA AC TT > 003-01 AA AC AC GT > 004-01 AA AA AA . >

Mary,

Does this work for you

data have1; input id $ a1 $ a2 $ a3 $ ; cards; 001-01 AA CC AC 002-01 AC AA AC 003-01 AA AC AC 004-01 AA AA AA ; run; data have2; input id $ alt_id $ a4 $ ; cards; 001-01 XXX-XX GT 002-01 XXX-XX TT 003-00 003-01 GT ; run; proc sql; create table want as select a.*, b.a4 from have1 as a left join have2 as b on a.ID = b.id or a.id = b.alt_id ; quit;

Hope this is helpful,

Dan

Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204


Back to: Top of message | Previous page | Main SAS-L page