LISTSERV at the University of Georgia
Date:         Wed, 21 Feb 2001 21:13:00 -0800
Reply-To:     Chung-Jung Chung <cjc0121@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Chung-Jung Chung <cjc0121@YAHOO.COM>
Subject:      delete a duplicate record
Content-Type: text/plain; charset=us-ascii

I created a HUGE dataset with 604 million records and 22 fields. The size is 55 GBytes. This dataset is sorted by ID and DATE and indexed by ID. Unfortunately, there are two identical records in this dataset.

Obs        ID          DATE
601008847  A486358342  02/01/2001
601008848  A486358342  02/01/2001

I tried to delete one of the two records with proc sql, but I got the error messages shown below.

proc sql noprint;
   delete from lib.ts
   where (id='A486358342') and (mod(_n_,2)=0);
quit;

ERROR: Function MOD requires a numeric expression as argument 1.
ERROR: The following columns were not found in the contributing tables: _n_.

Alternatively, I could save a copy of this record, delete both records with proc sql, and then use proc append to add the saved copy back to the dataset:

proc sql noprint;
   delete from lib.ts where (id='A486358342');
quit;

In that case, however, the physical order changes, and it will take at least 10 hours to get the dataset sorted again.
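For reference, the save/delete/append workaround could be sketched roughly as follows. This is only a sketch under assumptions: work.keep is a hypothetical scratch table name, and select distinct is assumed to be enough to collapse the two identical records into one, since it compares all columns.

```sas
/* Sketch only: work.keep is a hypothetical scratch table name. */

/* 1. Save one copy of each distinct record for this ID.
      DISTINCT compares all columns, so the two identical rows collapse to one. */
proc sql noprint;
   create table work.keep as
   select distinct *
   from lib.ts
   where id = 'A486358342';
quit;

/* 2. Delete every record for this ID from the big table. */
proc sql noprint;
   delete from lib.ts
   where id = 'A486358342';
quit;

/* 3. Append the saved copy. It lands at the end of lib.ts,
      so the table is no longer physically sorted by ID and DATE. */
proc append base=lib.ts data=work.keep;
run;
```

The last step is what makes this approach unattractive here: the appended record ends up out of order, which is what would force the re-sort.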

Is there an easy and quick way to do this?

Thanks in advance.

Chung
