Date: Fri, 14 May 2004 21:07:29 +0100
Reply-To: Subs <NOradgar-subsSPAM@NTLWORLD.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Subs <NOradgar-subsSPAM@NTLWORLD.COM>
Organization: ntl Cablemodem News Service
Subject: Re: proc sort
I had a similar problem just the other day.
I had a table with, say, 30 variables. Some keys were duplicated, but the
results attached to those keys differed between the duplicate records. I was
lucky enough to have a differentiator: each record had a "commit" timestamp.
Where there were "duplicates", I wanted only the most recent record.
My first approach was:
proc sort data=mydata;
  by keys timestamp;
run;
and then:
data mydatanodup;
  set mydata;
  by keys timestamp;
  /* keyinthelist = the last key variable named in the BY list */
  if last.keyinthelist;
run;
This worked fine: within each key group the rows are in timestamp order, so
the last one is the most recent.
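To make the "last key in the list" idea concrete, here is a minimal sketch with two made-up key variables, id and visit (both hypothetical, as is the mydata_sorted name; substitute your own). It is only an illustration of the sort-then-LAST. pattern, not the exact code I ran:

```sas
/* Assumption: id and visit are the key variables, timestamp is the commit time */
proc sort data=mydata out=mydata_sorted;
  by id visit timestamp;
run;

data mydatanodup;
  set mydata_sorted;
  by id visit timestamp;
  /* last.visit is true on the final row of each id/visit combination, */
  /* which is the row with the latest timestamp after the sort above   */
  if last.visit;
run;
```

Note that the subsetting IF uses the last key variable (visit), not timestamp; last.timestamp would be true for every distinct timestamp and so would not remove anything.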
There was also a PROC SQL approach:
proc sql;
Unfortunately, SQL isn't my forte and I've not got the code to hand (doh!),
but it used an inline query a bit like this (the real one was syntactically
correct; this is only from memory):
select * from
(select max(timestamp) as lastdate)
keys, timestamp
where timestamp=lastdate
group by keys;
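A version of that idea which should actually run might look like the sketch below. This is a reconstruction, not the original code: it assumes a single key variable literally called keys (add one join condition per key if there are several), and the aliases a and b are just for illustration.

```sas
proc sql;
  create table mydatanodup as
  select a.*
  from mydata as a
       inner join
       /* inline view: latest timestamp for each key */
       (select keys, max(timestamp) as lastdate
        from mydata
        group by keys) as b
    on a.keys = b.keys
   and a.timestamp = b.lastdate;
quit;
```

One difference from the data step approach: if two records for the same key share the same latest timestamp, this join keeps both of them, whereas the LAST. test keeps exactly one.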
"helen" <chenghelen2000@yahoo.com> wrote in message
news:fadc20d0.0405140700.11930b3f@posting.google.com...
> Hello All,
>
> I have a dataset contained some duplicate data. I'd like to delete
> those observations. Normally I use 'proc sort ; by listing vars'
> statement to do it. In my case, there are around 60 variables for one
> observation, I'd like to compare 59 variables to see if it is
> duplicate, instead of list all variables, is there any easy way to do
> it?
>
> Thanks in advance.