LISTSERV at the University of Georgia
Date:         Mon, 5 Mar 2001 11:39:52 -0500
Reply-To:     William Dudley <william.dudley@NURS.UTAH.EDU>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         "Goodwin, Jay" <JGoodwin@AIR.ORG>
Subject:      Re: Duplicate records
Content-Type: text/plain; charset="iso-8859-1"

Another alternative is to use the aggregate function as below...

aggregate outfile="c:\temp\aggregated_data_file.sav"
  /break=name
  /addr=first(address)
  /email=first(email)
  /info=first(info).
execute.

I don't recall off the top of my head if aggregate will handle string variables, but I believe it does. If it doesn't, then the lag alternative described by Dr. Dudley below is quite effective and probably your best bet.
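
For what it's worth, FIRST is, as far as I know, one of the AGGREGATE functions that does work with string variables. Here is a rough sketch of the same command with a count added, so you can see how many records each name had. The variable name n_copies is made up for illustration, and the path is just the example path from above.

```spss
* One output case per name, keeping the first value of each field.
AGGREGATE OUTFILE='c:\temp\aggregated_data_file.sav'
  /BREAK=name
  /addr = FIRST(address)
  /email = FIRST(email)
  /info = FIRST(info)
  /n_copies = N.
```

Because the result goes to a separate file, the working data file is left as it was. Open the new file with GET FILE to check it; any name with n_copies greater than 1 had duplicates.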

Sincerely,

Jay Goodwin

-----Original Message-----
From: William Dudley
To: SPSSX-L@LISTSERV.UGA.EDU
Sent: 03/05/2001 10:51 AM
Subject: Re: Duplicate records

Geraldine,

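If the fields are identical, then it might work to use the lag function to create a new variable called "extra" and then use the select command, like this. Lag only looks at the case immediately before the current one, so the file needs to be sorted by name first.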

Compute extra = 0.
execute.

if (name = lag(name)) extra = 1.
execute.

select if (extra = 0).
execute.

You may also have to condition the comparison on more than one field:

If name = lag(name) and phone = lag(phone) extra = 1.

NOTE: be very careful with the select command! It permanently drops the unselected cases from the working file, so DO NOT save your data file over the top of your old file.
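
Putting the steps above together, a minimal sketch might look like the following, with a sort added up front and the result saved under a new name. The output path c:\temp\deduplicated.sav is a hypothetical name, not anything from the original file.

```spss
* Sort so that duplicate names sit on adjacent cases.
SORT CASES BY name.

* Flag every case whose name matches the case directly before it.
COMPUTE extra = 0.
IF (name = LAG(name)) extra = 1.

* Keep only the first case of each name.
SELECT IF (extra = 0).

* Hypothetical output path: save under a new name, not over the original.
SAVE OUTFILE='c:\temp\deduplicated.sav' /DROP=extra.
```

SAVE reads the data, so the pending transformations run at that point and no separate EXECUTE is needed. To match on more than one field, sort by all of them and extend the IF condition in the same way.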

Bill

***********************************************
* William N. Dudley, PhD
* University of Utah
* College of Nursing
* 10 S 2000 E Front
* Salt Lake City, UT 84122-5880
***********************************************

>>> Geraldine Anderson <Geraldine.Anderson@IBEC.IE> 03/05/01 08:11AM >>>
Hi all,

I was wondering if anyone was aware of a way to delete duplicate records in SPSS (version 8). What I am dealing with is a large datafile of names, addresses, email addresses etc. What I would like to do is to have only one record for each person. Short of going through the file and deleting the extra ones, is there a more efficient way to do this?

Thanks in advance to anyone who can help.

Geraldine
_____________________________
Geraldine Anderson
Executive
Telephone - +353 1 6051512
Fax - +353 1 6381512
Email - geraldine.anderson@ibec.ie
Website(s) - http://www.ibec.ie


