LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (August 2011, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 18 Aug 2011 13:44:18 -0400
Reply-To:     "Bian, Haikuo" <HBian@FLQIO.SDPS.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Bian, Haikuo" <HBian@FLQIO.SDPS.ORG>
Subject:      Re: outer join mishap
Comments: To: Robert Feyerharm <rfeyerha@TRAVELERS.COM>
In-Reply-To:  <201108181640.p7IESlTA014312@waikiki.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"

Robert,

It happened once to me when your condition variables contain lots of missing values: in your case, sai and eff_dt. Please see below:

data a; infile cards missover; input a b ; cards; 1 2 2 3 ;

data b; infile cards missover; input a b ; cards; 1 2 2 3 ;

proc sql; select * from b right join a on a.b=b.b ;quit;

SAS does not know which one to match, as they are all "equal" missing values, so SAS ends up matching them all.

Regards, Haikuo

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Robert Feyerharm Sent: Thursday, August 18, 2011 12:40 PM To: SAS-L@LISTSERV.UGA.EDU Subject: outer join mishap

I'm having some problems performing a simple outer join in proc sql.

I'm trying to link two insurance claims datasets by policy number and policy date. I'm running the following code:

proc sql; create table merged_sai_effdt as select *, (a.sai=b.sai and a.eff_dt=b.eff_dt) as ind, (a.sai=b.sai) as sai_ind, (a.eff_dt=b.eff_dt) as effdt_ind from all_markets_valid_nomiss as a right join natl_accts_valid_nomiss as b on a.sai=b.sai and a.eff_dt=b.eff_dt; quit;

The all_markets_valid_nomiss dataset contains 112,575 records, and the natl_accts_valid_nomiss dataste contains 112,683 records. Yet when I perform the right join, I end up with a merged dataset containing 20,751,797 records, when the end result should be a right joined dataset containing 112,683 records. There are no missing values for the linking variables in either dataset, and the linking variables share the same type in both datasets.

Any ideas what's going wrong here?

Thanks for your help!

Robert ----------------------------------------- Email messages cannot be guaranteed to be secure or error-free as transmitted information can be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The Centers for Medicare & Medicaid Services therefore does not accept liability for any error or omissions in the contents of this message, which arise as a result of email transmission.

CONFIDENTIALITY NOTICE: This communication, including any attachments, may contain confidential information and is intended only for the individual or entity to which it is addressed. Any review, dissemination, or copying of this communication by anyone other than the intended recipient is strictly prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message.


Back to: Top of message | Previous page | Main SAS-L page