Date: Fri, 15 May 1998 10:15:55 -0600
Reply-To: Jack Hamilton <jack_hamilton@HCCOMPARE.COM>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Jack Hamilton <jack_hamilton@HCCOMPARE.COM>
Subject: Re: removing duplicates with NODUPKEY????
Content-Type: text/plain; charset=US-ASCII
Alan Rimm-Kaufman <rimmkaufman@CRUTCHFIELD.COM> wrote:
>There's been prior discussion this year on SAS-L about how
>NODUPKEY sometimes doesn't remove duplicates...
>I recall folks doing arduous multiple sorts to ensure no dups.
>
>I wasn't tuned into those discussions, but now the issue has become
>relevant to me.
>
>I've heard that if you sort twice, that'll remove dups, e.g.:
>
> proc sort; by a b c d; run;
> proc sort nodupkey; by a b c d; run;
>
>Is that correct? Can someone fill me in on this issue?

See the following SAS Usage Notes:

V6-SORT-B272
   Using the SAS sort with NODUPLICATES/NODUPKEY and EQUALS
   <http://www.sas.com/service/techsup/unotes/V6/B272.html>

V6-SORT-1729
   Observations with duplicate BY values not deleted when NODUPKEY used
   <http://www.sas.com/service/techsup/unotes/V6/1729.html>

V6-SQL-D929
   PROC SQL DISTINCT or PROC SORT NODUPKEY output may still have duplicates
   <http://www.sas.com/service/techsup/unotes/V6/D929.html>

The major problems seem to be with NODUPLICATES rather than NODUPKEY.
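
A minimal sketch of why NODUPLICATES tends to be the more surprising of the two options. It assumes the commonly documented behavior that NODUPLICATES compares each observation only with the one written just before it, whereas NODUPKEY compares BY values. The data set TEST, its variables A and B, and the output names are made up for illustration, and the sketch has not been checked against any particular release or against the usage notes above.

```sas
/* Hypothetical data: the first and third rows are identical in    */
/* every variable, but a different row sits between them.          */
data test;
   input a b;
   datalines;
1 1
1 2
1 1
;
run;

/* NODUPKEY: keeps one observation per distinct value of the BY    */
/* variables, so BY_KEY should end up with a single row for A=1.   */
proc sort data=test out=by_key nodupkey;
   by a;
run;

/* NODUPLICATES with a partial BY list: the two (1,1) rows are not */
/* necessarily adjacent after sorting by A alone, so both copies   */
/* may survive in BY_A.                                            */
proc sort data=test out=by_a noduplicates;
   by a;
run;

/* Sorting by every variable places identical rows next to each    */
/* other, so the adjacent-row comparison can remove the extra one. */
proc sort data=test out=by_all noduplicates;
   by _all_;
run;
```

If that assumption holds, BY _ALL_ is a simpler way to get adjacency than sorting twice, at the cost of a longer sort key.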