LISTSERV at the University of Georgia
Date:         Fri, 10 May 1996 15:41:34 +0200
Reply-To:     Gordon Meyer <meyer_g@MTN.CO.ZA>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         Gordon Meyer <meyer_g@MTN.CO.ZA>
Organization: MTN
Subject:      Re: Data step programming problem - Help!

Andrew Cosmatos wrote:
>
> Tadd Clayton wrote:
> >
> > Hi everyone
> >
> > I have a data set in which each observation should be uniquely
> > identified by a combination of two variables - a school number
> > and a serial number within each school. However, due to problems
> > with the data entry process, there are a number of observations
> > with duplicate values for the combination of school and serial
> > numbers.
> >
> > I would like to be able to compare the school and serial numbers
> > for each observation with those from the previous observation and
> > then output *both* observations to a data set if they are
> > duplicated. I can use a retain statement to define variables
> > that will carry the school and serial numbers over iterations of
> > the data step to allow the comparison but, as SAS appears to work
> > on an observation by observation basis only, I can't figure out
> > how to output both observations. Can anyone offer a simple
> > solution?
> >
> > Thanks for any help.
> >
> > Tadd
> >
> > --
> > Tadd Clayton                  Ph: 64 9 373 7599 ext. 6451
> > Research Officer              Fax: 64 9 373 7486
> > Department of Paediatrics     Email: t.clayton@auckland.ac.nz
> > School of Medicine
> > University of Auckland
> > Private Bag 92019
> > Auckland
> > NEW ZEALAND
>
> Tadd,
>
> I had a similar problem I solved it in the following manner:
>
> data tmp;
>   set school;
>   x=1;
> run;
>
> proc sort data=tmp;
>   by schoolno serialno;
> run;
>
> proc means data=tmp noprint;
>   by schoolno serialno;
>   var x;
>   output out=tmp1 sum=;
>   id x y z;
> run;
>
> data dups;
>   set tmp1;
>   if x>1;
> run;
>
> In the data set dups the duplicates will be listed and how many times
> they were duplicated would be stored in var. x.
>
> Andrew.

The above code is correct, except that the ID statement should include only variables that are NOT used in the VAR or BY statements. In this case that means x must be left out of the ID statement, because it is already named in the VAR statement.
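For reference, here is a minimal sketch of Andrew's program with that correction applied. It assumes the input data set is named school and the key variables are schoolno and serialno, as in the quoted code. The ID statement is left out altogether, since y and z look like placeholders in the original; if other variables should be carried into the output, they could be listed in an ID statement, provided none of them appears in VAR or BY.

```sas
/* Add a counter variable so that PROC MEANS can sum it per key. */
data tmp;
  set school;
  x = 1;
run;

/* BY-group processing requires the data to be sorted by the key. */
proc sort data=tmp;
  by schoolno serialno;
run;

/* One output observation per schoolno/serialno combination;     */
/* x holds the number of input observations sharing that key.    */
/* No ID statement here: x is an analysis variable (VAR), so it  */
/* must not also be listed as an ID variable.                    */
proc means data=tmp noprint;
  by schoolno serialno;
  var x;
  output out=tmp1 sum=;
run;

/* Keep only the keys that occur more than once. */
data dups;
  set tmp1;
  if x > 1;
run;
```

As in Andrew's version, dups then holds one observation per duplicated school/serial combination, with the number of occurrences in x.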

Gordon
South Africa

