LISTSERV at the University of Georgia
Date:         Tue, 29 Oct 2002 15:34:52 -0600
Reply-To:     Carol Albright <calbright@VISI.COM>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Carol Albright <calbright@VISI.COM>
Subject:      Re: Questions about restructuring data in SPSS
Comments: To:
In-Reply-To:  <>
Content-Type: text/plain; charset="us-ascii"

Hi, Valerie & List:

A few pointers to consider:

1) Unique Identifier:

Dig through the data and see what else, or what _in addition_ to name, you can use as the identifier. Doesn't the clinic generate a medical record number, or store financial information such as the insurance or HMO identifier, SSN or the like? What else is available of a demographic nature, such as gender and birthdate? Name is very unreliable for larger pools of people; I'd hate to think how many different Jim Johnsons there are in a big midwestern clinic. You may need to use a complex identifier that's last name + first name + gender + birthdate.
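
If it helps to carry the complex identifier around as a single variable, here is a rough sketch. It assumes LNAME, FNAME and GENDER are string variables and DOB is an SPSS date variable; FNAME and PID are names I made up for illustration, so substitute whatever your export actually calls them.

```spss
* Sketch only: FNAME and PID are hypothetical names.
* Assumes LNAME, FNAME, GENDER are strings and DOB is a date variable.
* Widen A60 if your name fields are long, or the value will be truncated.
STRING PID (A60).
COMPUTE PID = CONCAT(RTRIM(LNAME), "|", RTRIM(FNAME), "|",
                     RTRIM(GENDER), "|", STRING(DOB, ADATE10)).
EXECUTE.
```

The "|" separator just keeps the pieces from running together, so that two different people cannot collapse into the same string by accident.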

2) Count # of visits or records per case:

Say you're going to use Lastname + Birthdate + Gender as your first attempt at an identifier. You can count how many records are in the dataset for each person, and how many unique people are in the dataset, as follows:

a) Sort your data:
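
The sort command itself does not appear in the archived copy of this message. Going by the variables compared in step b, it would presumably be something like this (a sketch, not the original line):

```spss
* Assumed from the variables used in step b.
* Records for the same person must be adjacent for LAG to work.
SORT CASES BY LNAME DOB GENDER.
```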


b) Compute a running record number within each person; the highest value a person reaches is that person's number of visit records. To find how many different diagnoses an individual has, adapt this by adding the DIAGnosis code to the sort and to the IF condition.

COMPUTE RNUM = 1.
IF (LNAME = LAG(LNAME) AND DOB = LAG(DOB) AND GENDER = LAG(GENDER)) RNUM = LAG(RNUM) + 1.
VARIABLE LABELS RNUM "Number of Records per Unique Combination of Lastname, DOB & Gender".
EXECUTE.

c) Do a frequency count of RNUM.

freq rnum.

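Look at the count for RNUM = 1. Every unique combination has exactly one record numbered 1, so that count is your estimate of how many "people" you have. The highest RNUM values show the most records any one person has.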
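
Another way to get visits per person, sketched here on the assumption that your SPSS version supports MODE=ADDVARIABLES on AGGREGATE (older releases do not), is to attach the total to every record. NVISITS is a name I made up.

```spss
* Sketch only: NVISITS is a hypothetical variable name.
* Adds each person's total record count to all of that person's records.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=LNAME DOB GENDER
  /NVISITS=N.

* One row per person, so the table shows how many people had 1, 2, 3... visits.
TEMPORARY.
SELECT IF (RNUM = 1).
FREQUENCIES VARIABLES=NVISITS.
```

This relies on RNUM from step b to pick one record per person for the frequency table.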

3) Segregate Unique from Repeating Groups of data:

Break your dataset into two pieces AFTER you figure out a fool-proof method for uniquely identifying people. Save the demographic & single-instance data into a new dataset: select the cases where RNUM = 1 and write to a new data file the identifier variables plus the one-of-a-kind variables (like DOB, gender, ethnicity, etc). (Use either FILTER or TEMPORARY/SELECT IF to avoid deleting records from the working file.)
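
A minimal sketch of that step, assuming RNUM has already been computed as in 2b. The file name 'people.sav' is a placeholder; add your other unchanging variables to the KEEP list.

```spss
* Sketch only: 'people.sav' is a placeholder file name.
* TEMPORARY limits the selection to the next procedure (SAVE),
* so the working file keeps all of its visit records.
TEMPORARY.
SELECT IF (RNUM = 1).
SAVE OUTFILE='people.sav'
  /KEEP=LNAME DOB GENDER.
```

The visit-level variables stay in the working file, which you can then restructure on its own and match back to the person-level file by the same identifier.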

I need to run, but this is a start.


At 11:17 AM 10/29/02 -0500, Valerie Roberts wrote:
>I am trying to restructure a data set that was exported from Clinical
>Fusion, which contains data from our medical clinic. In the current data
>set, a single individual may have multiple cases depending on the number of
>times they visited the clinic during the specified period of time.
>Therefore, the number of cases in the present data set do not represent the
>true number of individuals that utilized the clinic services during the
>time period nor am I able to determine how many times an individual visited
>the clinc over the time period and received the same diagnosis. I would like
>to restructure the data set so that each case represents an individual who
>may have had one or multiple clinic visits over the time period. When I try
>to do this, I have been using the last name and first name as identifier
>variables and no index variables. This causes the new data set to replicate
>all of the variables (1 for each clinic visit) including those that do not
>change over time (like sex). Also, it appears that some cases are being
>dropped if they do not have data in the 'identifier variable'. Can someone
>please explain to me how to do this accurately?
>
>Valerie E. Roberts
>Director of Evaluation
>Whitefoord Community Programs
>1353 Dupont Avenue SE
>Atlanta, GA 30317-1743
>phone: (404) 523-2500
>fax: (404) 522-5100

-------------------------------------------------------------------------
Carol L. Albright, MS    |  E-Mail :
Albright Consulting      |  Phone  : 651/699-7218
St. Paul, MN 55105 USA   |  Research data services
-------------------------------------------------------------------------
