Date: Wed, 11 Mar 1998 10:19:42 PST
Reply-To: TWB2%Rates%FAR@bangate.pge.com
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Tim Berryhill 3rd time <TWB2%Rates%FAR@BANGATE.PGE.COM>
Subject: Re: similar cases
Content-Type: text/plain; charset=us-ascii
Roughly, sort the data and print it. The details depend on what sort of error
you expect.
Suppose that you think 5 variables have errors and n-5 variables are perfect.
Call them BADVAR1-BADVAR5 and GOODVAR1-GOODVAR9 (so n = 14 in this example).
You could try:
PROC SORT DATA=ALL.MYDATA OUT=MAYFIX;
   BY GOODVAR1-GOODVAR9;  * A numbered range list is valid in BY;
RUN;

DATA _NULL_;
   SET MAYFIX;
   BY GOODVAR1-GOODVAR9;
   * FIRST.GOODVAR9 is set whenever any of the BY variables changes;
   IF FIRST.GOODVAR9
      THEN PUT // 'New group: ' GOODVAR1-GOODVAR9;
   PUT BADVAR1-BADVAR5;
RUN;
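
If most cases appear only once, the listing above will be mostly one-line
groups. A possible variation, offered only as a sketch, is to skip any group
that holds a single observation so that only candidate duplicates get printed.
It assumes MAYFIX has already been sorted by the PROC SORT step above and uses
the same illustrative variable names:

```sas
DATA _NULL_;
   SET MAYFIX;
   BY GOODVAR1-GOODVAR9;
   * A group of one is both first and last, so there is nothing to compare;
   IF FIRST.GOODVAR9 AND LAST.GOODVAR9 THEN DELETE;
   IF FIRST.GOODVAR9
      THEN PUT // 'New group: ' GOODVAR1-GOODVAR9;
   PUT BADVAR1-BADVAR5;
RUN;
```

Every group that still prints has at least two observations that agree on all
the GOOD variables, so any differences in the BAD variables are the ones to
inspect by eye.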
Tim Berryhill - Contract Programmer and General Wizard
TWB2@PGE.COM or http://www.aartwolf.com/twb.html
Frequently at Pacific Gas & Electric Co., San Francisco
The correlation coefficient between their views and
my postings is slightly less than 0
----------------------[Reply - Original Message]----------------------
Sent by:"Fabrizio De Amicis" <fabrizio.de-amicis@JRC.IT>
Hi SAS-L,
I have a SAS dataset with n variables and m observations. Each observation
represents a case.
Two or more observations can differ because some variable values are
different, even though they are actually the same case.
Do you have any suggestions for detecting similar or duplicate cases?
Does anyone have experience with this problem?
Thank you,
Fabrizio.
=====================================================================