LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (April 2001, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 19 Apr 2001 11:26:44 -0700
Reply-To:     "Terjeson, Mark" <TerjeMW@DSHS.WA.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Terjeson, Mark" <TerjeMW@DSHS.WA.GOV>
Subject:      Re: duplicate records
Comments: To: "Karen Olson, Ph.D." <karen.olson@TCH.HARVARD.EDU>
Content-Type: text/plain; charset=us-ascii

Hi Karen,

Here's are with and without sorted methods:

data sample; time_sec=25; infcode='Inon';output; time_sec=25; infcode='Ineu';output; time_sec=30; infcode='Inon';output; time_sec=41; infcode='Ipos';output; time_sec=44; infcode='Inon';output; time_sec=44; infcode='Ineu';output; time_sec=45; infcode='Inon';output; run;

data sample2; time_sec=41; infcode='Ipos';output; time_sec=25; infcode='Inon';output; time_sec=25; infcode='Ineu';output; time_sec=30; infcode='Inon';output; time_sec=45; infcode='Inon';output; time_sec=44; infcode='Inon';output; time_sec=44; infcode='Ineu';output; run;

* if the file is already sorted ; * then the first. last. works ; data myfile1; set sample; by time_sec; if last.time_sec; run;

* if not sorted, but the time_sec ; * values are grouped together, ; * and no missing time values. ; data TrickToKeepLastValue; run; data myfile2(keep=holdtime holdcode rename=(holdtime=time_sec holdcode=infcode)); retain holdtime . holdcode ' '; set sample2 TrickToKeepLastValue; if time_sec ne holdtime and holdtime ne . then output; holdtime = time_sec; holdcode = infcode; run;

Hope this is helpful, Mark Terjeson Washington State Department of Social and Health Services Division of Research and Data Analysis (RDA) mailto:terjemw@dshs.wa.gov

-----Original Message----- From: Karen Olson, Ph.D. [mailto:karen.olson@TCH.HARVARD.EDU] Sent: Thursday, April 19, 2001 10:46 AM To: SAS-L@LISTSERV.UGA.EDU Subject: duplicate records

I have some data files from a video coding system that creates records with two fields of interest: a code and a time. The time value is hours:minutes:seconds:frames and frames are 30 per second. In the end, I want a file with the time in seconds. I have most of the conversion done but am stuck with one problem. Rounding the original time value to the nearest second sometimes gives me >1 record with the same time. I want to keep only the LAST of these duplicates. All the tricks I've tried (e.g. proc sort nodupkeys; by subjid time_sec;) have given me the FIRST of the duplicates and eliminated the rest. I tried 'descending time_sec' and various things in data steps (e.g. the lag function) but do not end up with what I want.

Here's some sample data: time_sec infcode 25 Inon 25 Ineu 30 Inon 41 Ipos 44 Inon 44 Ineu 45 Inon

I get: 25 Inon 30 Inon 41 Ipos 44 Inon 45 Inon

But I want: 25 Ineu 30 Inon 41 Ipos 44 Ineu 45 Inon

Any suggestions? Thanks so much in advance.

Karen Olson Children's Hospital, Boston karen.olson@tch.harvard.edu


Back to: Top of message | Previous page | Main SAS-L page