Date: Tue, 15 Nov 2005 07:53:28 -0600
Reply-To: Keith Kaiser <keith.kaiser@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Keith Kaiser <keith.kaiser@GMAIL.COM>
Subject: Base vs. SQL
Here is a wide-open question for the list.
Ten SAS data sets, 2 million rows each, maybe 10 columns.
I need to extract any records that occur more than once. Each record has a
column called IDNUM, which is the unique identifier for the data, and all 10
data sets are indexed and sorted by it. The only other key variable is a date
field called PDATE; think patient and treatment date.
How do I extract just the duplicate rows by IDNUM? I need duplicates within a
year and across years, so if an IDNUM is in the 1995 data and again in 2002 I
need it, but I also need it if it is duplicated in 1995 only.
Do I set the 10 data sets together first? Do I use some complex SQL, or is
there a method I don't know that makes it easy? Once I have the new data set
with just the dupes I can do the rest; I'm just looking for a simple, quick
method of finding all dupes and keeping them.
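
To make the question concrete, here is a rough, untested sketch of the "set
them together" idea. The data set names ds01 through ds10 are placeholders, and
it assumes a dupe means any IDNUM that appears on more than one row across the
10 data sets, whether within one year or in different years:

```sas
/* Interleave the 10 data sets by IDNUM.                      */
/* ds01-ds10 are placeholder names; each must already be      */
/* sorted by IDNUM for the BY statement to work.              */
data dupes;
  set ds01 ds02 ds03 ds04 ds05 ds06 ds07 ds08 ds09 ds10;
  by idnum;
  /* A BY group with exactly one row is both first and last,  */
  /* so this keeps only IDNUMs that occur two or more times.  */
  if not (first.idnum and last.idnum);
run;
```

Because the inputs are already sorted, this should need only one pass and no
extra sort. If a dupe should instead mean the same IDNUM and the same PDATE,
the BY statement and the FIRST./LAST. test would have to use PDATE as well,
and the inputs would need to be sorted by both variables.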