LISTSERV at the University of Georgia
Date:         Tue, 5 Sep 2006 05:43:35 -0400
Reply-To:     Arild S <sko@KLP.NO>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Arild S <sko@KLP.NO>
Subject:      Re: Comparing two datasets

On Tue, 5 Sep 2006 02:05:35 -0700, alves <alves.paulo@GMAIL.COM> wrote:

>Hi,
>
>I was given a dataset 20 variables plus a key variable that should be
>unique. When looking in detail, I notice a few duplicate records (same
>key variable) and I separate these into another dataset. Now I need to
>compare the duplicates with the ones in the original dataset to check
>if they are really duplicated or if any of the values in the other 20
>variables is different...
>
>Any easy way of doing this? It happen to me once in a dataset with 5
>variables and I just rename the variables, merge the two files and with
>arrays compared all the variables, but I am looking for a more
>efficient way.
>
>Thanks in advance

The easy way is different from what you do :-) Don't split your data. Use proc sort; it has a nice option called "noduprecs", which drops records that are identical in every variable:

data test;
  input (key a b c d)($);
  cards;
a b c d e f
a x x x x x
s e f g h j
a d f e f g
s d f g r h
a b c d e f
;
run;

proc sort data=test noduprecs dupout=test2;
  by _all_;
run;

The duplicate records will now be found in the dupout= data set (test2 in this example), and test keeps one copy of each distinct record. Read the documentation, though, including the part on the SORTDUP= system option.
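
To answer the other half of the question (which keys have records that really differ), one possible follow-up is sketched below. It assumes the proc sort step from this message has already been run, so that test holds only distinct records sorted by key first. The data set name conflicts is made up for illustration. Any key that still occurs more than once must differ in at least one of the other variables.

```sas
/* Sketch only: assumes test was already sorted with noduprecs and  */
/* "by _all_", so it is ordered by key and holds no exact duplicates. */
/* "conflicts" is a hypothetical output name.                         */
data conflicts;
  set test;
  by key;
  /* A key that is both first and last in its BY group is unique.    */
  /* Keep only the keys that still have two or more records.         */
  if not (first.key and last.key);
run;

proc print data=conflicts;
run;
```

The records in conflicts can then be inspected side by side per key, with no renaming, merging or arrays needed.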

