Date: Mon, 27 Jul 1998 09:37:17 EDT
Reply-To: Robert Greene <MHPERTG@IRIS.RFMH.ORG>
Sender: "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From: Robert Greene <MHPERTG@IRIS.RFMH.ORG>
Subject: Re: Deleting Duplicate Records
In-Reply-To: Message of Sat, 25 Jul 1998 00:01:20 -0400 from

Deleting duplicate records is child's play when you use the LAG function.
Picture the records as files you want to organize.
First, sort them in order of your key variables. Hopefully, you will have a
key variable like SSN and another like EVENTDAY so that you can differentiate
records you want from those you do not want.
After that, use SELECT IF as in the following syntax, where var1 stands in
for your secondary key variable:
SORT CASES BY SSN var1.
SELECT IF ((LAG(SSN,1) NE SSN)
   OR (LAG(var1,1) NE var1)
   OR MISSING(LAG(var1,1))).
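In effect, the result keeps the first record for each combination of SSN and
the secondary key variable: a record is kept when its SSN differs from the
previous record's, when its secondary key differs from the previous record's,
or when the previous record's secondary key is missing (which also covers the
very first record, where LAG has nothing to look back at and returns
system-missing for a numeric variable). You need to work with your SELECT IF
syntax to be sure that what you keep is what you want. In any event, DO NOT
OVERWRITE your original data file. This syntax works well when your data
system keeps prior records and just adds changes or updates to the data files.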
This syntax can also confuse the hell out of you, so you need to keep track of
what you expect your output to look like.
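
One way to keep track of what to expect is to trace the logic by hand on a
few made-up records. Below is a rough sketch of that in Python rather than
SPSS, only so the logic can be followed step by step. It is not how SPSS does
it internally. The names dedupe, ne and or3 are made up for this illustration,
None stands in for a missing value, and the sketch assumes that missing
values sort first and that LAG looks at the previous record in the sorted
file whether or not that record was kept.

```python
def ne(a, b):
    """Three-valued 'not equal': missing (None) if either side is missing."""
    if a is None or b is None:
        return None
    return a != b


def or3(*terms):
    """Three-valued OR: True if any term is True, else missing if any is missing."""
    if any(t is True for t in terms):
        return True
    if any(t is None for t in terms):
        return None
    return False


def dedupe(records):
    """Mimic SORT CASES BY SSN var1 followed by the SELECT IF with LAG."""
    # Sort by both keys; missing values first (an assumption of this sketch).
    ordered = sorted(
        records,
        key=lambda r: tuple((v is not None, v if v is not None else 0) for v in r),
    )
    kept = []
    prev = (None, None)  # first record: LAG has nothing to look back at
    for ssn, var1 in ordered:
        keep = or3(
            ne(prev[0], ssn),    # LAG(SSN,1) NE SSN
            ne(prev[1], var1),   # LAG(var1,1) NE var1
            prev[1] is None,     # MISSING(LAG(var1,1))
        )
        if keep is True:         # records are dropped on false or missing
            kept.append((ssn, var1))
        prev = (ssn, var1)       # assumed: lag follows the file, kept or not
    return kept


# Made-up records as (SSN, var1) pairs, with several exact duplicates.
records = [(222, 5), (111, 3), (111, 3), (111, 4), (222, 5), (111, 3)]
print(dedupe(records))  # prints [(111, 3), (111, 4), (222, 5)]
```

Each SSN/var1 combination survives once, and the extra copies drop out
because they match the record just before them on both keys.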
Writing this made me realize that I have used this LAG function for years and
may never have the ability to explain exactly what it is doing. What the heck.
It works for me.
MHPERTG 914-374-3171 x3172. RT Greene, Program Evaluation, MHPC
New Hampton, New York 10940 USA
Information is 'meaning-full' when it is organized and shared and
'meaning-less' when disorganized and kept to ourselves.
..E pluribus, pluribum...Suum cuique.........Pax sapiens.........
.'Many out of many.'....'To each his own.'..'Intelligent peace.'.
History is indifferent to mediocrity and impatient with failure.
Those that refuse to learn about math add up to nothing, naught,
nil, nada, nothing, zilch, not much.