Date: Thu, 13 Jul 2006 12:23:38 -0400
Reply-To: Kevin Roland Viel <kviel@EMORY.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Kevin Roland Viel <kviel@EMORY.EDU>
Subject: Re: Compare two data sets
In-Reply-To: <A4F0DBF1F84D4F46A0E6D56405D3511989E317@HHAEXMB03.rf01.itservices.ca.gov>
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Thu, 13 Jul 2006, Choate, Paul@DDS wrote:
> Proc sort data=a nodupkey force;
> By income state zip score;
> Proc sort data=b nodupkey force;
> By income state zip score;
> Data differences;
> Merge a(in=ina) b(in=inb);
> By income state zip score;
> If not (ina and inb);
> If ina then FileA='Y';
> If inb then FileB='Y';
> Run;
>
> If either sort reports a record reduction then your data are not unique.
> Any records appearing in only one of the two will be in Differences with
> the appropriate flag.
As these are temporary datasets, my warning may not be relevant. Without
OUT=, NODUPKEY overwrites the input dataset, so any records it drops are
gone. You can avoid that potential loss, in the event that your data are
not unique, by using the OUT= option of the PROC SORT statement:
Proc sort data = a
          out  = a_nodup   /* hypothetical name; a itself is left untouched */
          nodupkey
          ;
  by income state zip score ;
run ;
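If it also helps to see which records NODUPKEY drops, a minimal sketch
follows. It assumes SAS 9 or later, where PROC SORT accepts the DUPOUT=
option; a_nodup and a_dups are hypothetical dataset names, not ones from
the original post.

```sas
/* Sketch only: a_nodup and a_dups are hypothetical names. */
proc sort data   = a
          out    = a_nodup   /* first record of each BY group */
          dupout = a_dups    /* records that NODUPKEY removed */
          nodupkey
          ;
   by income state zip score ;
run ;
```

If a_dups ends up with zero observations, the keys in a were already
unique.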
Of course, if these datasets are large and RAM is limited, then you might
have to consider alternatives, such as using views or writing the output
to a permanent library and deleting it once you are done...
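One way the view alternative might look is sketched below, assuming only
the four key variables are needed for the comparison. The names a_v and
a_keys are hypothetical. A view cannot be sorted in place, so OUT= is
required here.

```sas
/* Sketch only: a_v and a_keys are hypothetical names. */
/* The view stores no data; it reads a when it is used. */
data a_v / view = a_v ;
   set a ( keep = income state zip score ) ;
run ;

/* Sorting the view writes only the key variables to a_keys. */
proc sort data = a_v
          out  = a_keys
          nodupkey
          ;
   by income state zip score ;
run ;
```

This avoids writing an intermediate copy of the subset, although the sort
itself still needs its usual working space.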
HTH,
Kevin
Kevin Viel
Department of Epidemiology
Rollins School of Public Health
Emory University
Atlanta, GA 30322