LISTSERV at the University of Georgia
Date:   Mon, 27 Jul 1998 09:37:17 EDT
Reply-To:   Robert Greene <MHPERTG@IRIS.RFMH.ORG>
Sender:   "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From:   Robert Greene <MHPERTG@IRIS.RFMH.ORG>
Subject:   Re: Deleting Duplicate Records
In-Reply-To:   Message of Sat, 25 Jul 1998 00:01:20 -0400 from <LISTSERV@VM.MARIST.EDU>

Deleting duplicate records is child's play when you use the LAG function.

Picture the records as files you want to organize.

First, sort them in order of your key variables. Hopefully, you will have a key variable like SSN and another like EVENTDAY so that you can differentiate records you want from those you do not want.

After that, use SELECT IF. Together with the sort, the syntax looks like this:

SORT CASES BY SSN var1.
SELECT IF ((LAG(SSN,1) NE SSN) OR (LAG(var1,1) NE var1) OR MISSING(LAG(var1,1))).

The result keeps the first record for each combination of social security number and secondary key variable: a record is kept when its SSN or its secondary key differs from the previous record's, or when the lagged secondary key is missing (for example, on the very first record, which has no previous record). You need to work with your SELECT IF syntax to be sure that what you keep is what you want. In any event, DO NOT OVERWRITE your original data file. This syntax works well when your data system keeps prior records or changes and just adds the changes or updates to the data files.
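To see which records survive, it can help to trace the same sort-then-compare-with-previous logic on a tiny made-up data set. The sketch below is in Python rather than SPSS, purely as an illustration. The Record type, the drop_duplicates_lag function, and the sample values are all hypothetical. It assumes no key value is missing, so it does not reproduce how SPSS treats missing values in comparisons.

```python
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    ssn: str       # primary key (hypothetical short IDs, not real SSNs)
    eventday: str  # secondary key, standing in for var1
    note: str      # payload, used only to tell the records apart


def drop_duplicates_lag(records: List[Record]) -> List[Record]:
    """Sort by (ssn, eventday), then keep a record only when its keys
    differ from those of the record immediately before it."""
    ordered = sorted(records, key=lambda r: (r.ssn, r.eventday))
    kept: List[Record] = []
    prev: Optional[Record] = None  # plays the role of LAG(..., 1)
    for rec in ordered:
        if prev is None or prev.ssn != rec.ssn or prev.eventday != rec.eventday:
            kept.append(rec)
        prev = rec  # remember this case for the next comparison
    return kept


# Made-up sample: "c" repeats the keys of "a", and "e" repeats the keys of "b".
sample = [
    Record("001", "1998-07-01", "a"),
    Record("002", "1998-07-03", "b"),
    Record("001", "1998-07-01", "c"),
    Record("001", "1998-07-02", "d"),
    Record("002", "1998-07-03", "e"),
]

kept = drop_duplicates_lag(sample)
print([r.note for r in kept])  # prints ['a', 'd', 'b']
```

Because Python's sort is stable, the record kept for each key combination is the one that came first in the original file. The function returns a new list and leaves the input untouched, in the same spirit as not overwriting the original data file.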

This syntax can also confuse the hell out of you, so you need to keep track of what you expect your output to look like.

Writing this made me realize that I have used this LAG command for years and may never have the ability to explain exactly what it is doing. What the heck. It works for me.

=================================================================
MHPERTG 914-374-3171 x3172.
RT Greene, Program Evaluation, MHPC
New Hampton, New York 10940 USA
=================================================================
Information is 'meaning-full' when it is organized and shared
and 'meaning-less' when disorganized and kept to ourselves.
=================================================================
..E pluribus, pluribum...Suum cuique.........Pax sapiens.........
.'Many out of many.'....'To each his own.'..'Intelligent peace.'.
=================================================================
History is indifferent to mediocrity and impatient with failure.
=================================================================
Those that refuse to learn about math add up to nothing,
naught, nil, nada, nothing, zilch, not much.
=================================================================

