Date: Thu, 8 May 2003 17:11:14 +0000
Reply-To: sashole@bellsouth.net
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject: Re: ****HOW TO Count Duplicate Records
Content-Type: text/plain; format=flowed
Cybie,
Assuming that the file is grouped by BRANCH DATE, the task fits in the
DoW-loop like a glove:
data report ;
do group_dupes = 0 by 1 until ( last.date ) ;
set a ;
by branch date notsorted ;
end ;
total_dupes ++ group_dupes ;
run ;
In the case with you sample data, it should result in the output like this:
group_ total_
acct_num branch date dupes dupes
100 PH 20030303 2 2
100 PH 20030304 2 4
200 MN 20030303 3 7
HTH.
Kind regards,
-----------------------
Paul M. Dorfman
Jacksonville, FL
-----------------------
>From: Cybie Frontier <cybie@HOTMAIL.COM>
>
>Hi Folks:
>
>I have a fairly large SAS dataset and when I checked for duplicate records
>(all fields have the same value) I did proc sort nodupkey. My key was all
>the columns. I came up with 300K duplicate records. There are 24 variables
>in the dataset.One of the key variables is account_number. I want to
>achieve two things:
>
>Assign a dupe counter to each record. In other words, if an
>account_number occurs 7 times with the same record, I want to have
>a dupecounter variable that will be 1 for the first record for that
>account_number, 2 for the second record for that account_number, where all
>variables are identical.
>
>Create a separate data sets with the duplicate records (not just
>duplicate account numbers). These are true duplicates, i.e. values in all
>fields are same.
>
>Example:
>
>acct_num branch date dupecounter
>100 PH 20030303 1
>100 PH 20030303 2
>100 PH 20030303 3
>100 PH 20030304 1
>100 PH 20030304 2
>100 PH 20030304 3
>100 MN 20030303 1
>200 MN 20030303 2
>200 MN 20030303 3
>200 MN 20030303 4
>
>etc.......
>
>Now, I want to produce a report that will tell me the number of total
>duplicates. In this case it will be 7. The dupecounter where the value of
>the field is 1 is not to be counted as a duplicate.
_________________________________________________________________
The new MSN 8: smart spam protection and 2 months FREE*
http://join.msn.com/?page=features/junkmail