LISTSERV at the University of Georgia
Date:         Mon, 22 Dec 2008 11:38:16 -0500
Reply-To:     Gene Maguin <emaguin@buffalo.edu>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Gene Maguin <emaguin@buffalo.edu>
Subject:      Re: Syntax for keeping or dropping records
In-Reply-To:  <3e56e7d5fd31c8ac369872b25a1a76c4.squirrel@webmail.umaryland.edu>
Content-Type: text/plain; charset="iso-8859-1"

Amy,

You should know that color coding does not come through to the list.

OK. I've rearranged what you posted into a far more usable structure (see below). To summarize how I now understand things: you have two files, Data1 and Data2, each made as you describe. Data1 is a file of patients meeting your selection criteria, with one record per patient. That record is for the bone marrow transplant (BMT) treatment. Data2 has multiple records per patient, each record being an incident of chemotherapy. You want to separate the chemotherapy incidents in Data2 into two groups based on the BMT incident date in Data1.

I'm now going to assume that you are very skilled with SPSS. I think you can do a MATCH FILES using the TABLE subcommand to match Data1, as the table file, to Data2 using ID. I think you need only a subset of the variables in Data1, probably just ID and the in_date and out_date variables. You'll probably need to rename those two in Data1 first, since Data2 uses the same names. This little operation explicitly assumes that you have exactly one record per patient in Data1 and exactly one record in Data2 for each combination of ID, in_date, and out_date. If you don't, then you have more trouble. Not insurmountable trouble, but definitely more.
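To make the table-lookup idea concrete, here is a minimal sketch of the logic in plain Python rather than SPSS syntax. It is an illustration under assumptions, not tested SPSS code: the rows are abbreviated from the posted DATA1 and DATA2 examples, and `attach_bmt_dates`, `bmt_in`, and `bmt_out` are names I made up for the illustration.

```python
# Sketch only: plain Python standing in for a table-lookup match on id.
# bmt_in / bmt_out are hypothetical names for the renamed Data1 dates.

data1 = [  # table file: one BMT record per patient
    {"id": "1122ab33c5..", "in_date": 20011215, "out_date": 20020208},
    {"id": "2456b578ef..", "in_date": 20030113, "out_date": 20030204},
]
data2 = [  # case file: many records per patient
    {"id": "1122ab33c5..", "in_date": 20030805, "out_date": 20031001},
    {"id": "1122ab33c5..", "in_date": 20011215, "out_date": 20020208},
    {"id": "2456b578ef..", "in_date": 20031025, "out_date": 20031204},
]


def attach_bmt_dates(table_rows, case_rows):
    """Copy each patient's BMT dates onto every one of that patient's records."""
    lookup = {}
    for row in table_rows:
        # A table file must have exactly one record per key value.
        if row["id"] in lookup:
            raise ValueError("more than one table record for id " + row["id"])
        lookup[row["id"]] = (row["in_date"], row["out_date"])

    merged = []
    for row in case_rows:
        # Patients missing from the table file get empty (None) BMT dates.
        bmt_in, bmt_out = lookup.get(row["id"], (None, None))
        merged.append({**row, "bmt_in": bmt_in, "bmt_out": bmt_out})
    return merged


merged = attach_bmt_dates(data1, data2)
```

Every Data2 record keeps its own in_date and out_date and gains the patient's BMT dates under the new names, which is why the Data1 dates have to be renamed before the match.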

Once the MATCH FILES is complete, you can compare the in_date and out_date values from the Data2 records against those from the Data1 records to identify pre- and post-BMT incidents.
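The comparison itself could look something like the following, again as a plain Python sketch of the logic rather than SPSS syntax. It assumes each record already carries the patient's BMT dates under the hypothetical names `bmt_in` and `bmt_out`, and that dates are stored as yyyymmdd numbers so that ordinary comparison follows calendar order. The "pre" row is invented to show that case; the other two rows come from the posted DATA2 example.

```python
# Sketch only: classify each record relative to the patient's BMT admission.
# bmt_in / bmt_out are hypothetical names; dates are yyyymmdd numbers.

def classify(rec):
    """Return 'pre', 'post', or 'during' relative to the BMT admission."""
    if rec["out_date"] < rec["bmt_in"]:
        return "pre"      # discharged before the BMT admission began
    if rec["in_date"] > rec["bmt_out"]:
        return "post"     # admitted after the BMT discharge
    return "during"       # same dates as, or overlapping, the BMT admission


records = [
    # Invented row, included only to illustrate a pre-BMT record.
    {"id": "1122ab33c5..", "in_date": 20010901, "out_date": 20010915,
     "bmt_in": 20011215, "bmt_out": 20020208},
    # The BMT admission itself, from the posted example.
    {"id": "1122ab33c5..", "in_date": 20011215, "out_date": 20020208,
     "bmt_in": 20011215, "bmt_out": 20020208},
    # A later admission, from the posted example.
    {"id": "1122ab33c5..", "in_date": 20030805, "out_date": 20031001,
     "bmt_in": 20011215, "bmt_out": 20020208},
]

# Two separate lists play the role of the two output files.
pre_bmt = [r for r in records if classify(r) == "pre"]
post_bmt = [r for r in records if classify(r) == "post"]
```

Records that land in "during" are the BMT admission itself, including the repeated monthly claim lines that share its dates, so they end up in neither file.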

Does this help you?

Gene Maguin

****************************************

The examples of data were messy, so I am reposting them. The original large claims data sets (hospitalizations) are "dd2001, dd2002, dd2003, dd2004, dd2005, and dd2006." They are monthly claims data and have the same variables. If a patient was hospitalized longer than the monthly reporting period, the claims data have more than one record for that patient with the same admission and discharge dates. For example, for one patient (identified by id, birthday, in_date, and out_date) who was hospitalized for more than 1 year, the claims data had around 12 records (lines, or rows) with the same admission date (e.g., 20010101) and discharge date (e.g., 20020202). In_date is the admission date and out_date is the discharge date.

My target population is adults (> 18 years) with hematological cancers who received a bone marrow transplant (BMT) from 2001 to 2005. First, I selected hematological cancers from dd2001 to dd2006 using ICD-9-CM diagnostic codes (from icd9cd to icd9cd4) and added the annual data sets together as DATA1. Second, I limited the target population to patients undergoing BMT using 10 ICD-9-CM procedure codes (from icdopcd to icdopcd4). The 10 ICD-9-CM procedure codes for BMT run from 4100 to 4109. Third, I converted the birthday and admission dates and calculated ages. Fourth, I recoded age into 2 groups and selected the adult group. Fifth, I created an index dd2001_2006 using the aggregating function (selecting the first record and last record and summing the different fees) and the merging function (adding cases again). Thus, DATA1 is an index dd2001_2006 with only 1 record per patient. If patients received a 2nd, 3rd, 4th, or subsequent BMT, those variables will be added to DATA1 under different variable names. It is occasionally hard to judge the admission date for BMT alone because of coding problems, so I need the pre-BMT chemotherapy records for checking and for deciding whether or not to exclude patients.

The 2 outcomes are overall survival (from Jan 1, 2001 to Dec 31, 2005) and readmission within 30 days of discharge. The variables for death and date of death already exist in DATA1 for several patients because those patients died during BMT. Thus, overall survival for the remaining patients, who survived BMT, will be obtained from dd2001 to dd2006. The readmission variable (with or without readmission) will also be obtained from dd2001 to dd2006. Hence, I have created syntax that selects those adult patients undergoing BMT by their unique ID (32 characters long) and saved the result as "DATA2." However, DATA2 includes all records (rows): pre-BMT, during-BMT, and post-BMT. I am trying to work out how to create syntax that keeps the pre-BMT chemotherapy records as one dataset and the post-BMT records as another, or that drops the BMT records from DATA2. The key variables for identifying pre-, during-, or post-BMT records are the admission and discharge dates of each record from dd2001 to dd2006, since a patient's records all share the same id and birthday. The in_date and out_date of pre-BMT records fall before the in_date and out_date of the BMT procedure, whereas those of post-BMT records fall after them. Please see the examples below:

DATA1 (Index dd2001_2006 only BMT records):

id            Id_sex  birthday  In_date   Out_date  E_bedd  Tran_cd  Icd9cd  Icd9cd1  icdopcd  Icdopcd1  Dx_am  Room_am  Drug_am  Med_am
1122ab33c5..  F       19580210  20011215  20020208  48      1        20500   6822     4103     9925      11664  44160    315227   473461
1134ac34c6..  M       19751122  20050719  20051130  134     4        20153   99685    4105     8607      69120  904218   722973   2579172
2456b578ef..  F       19690516  20030113  20030204  22      3        20021   2880     8607     4101      11897  137262   138717   378661
ab2457cdg3..  M       19501030  20050413  20050720  98      3        20500   2880     9925     4105      40099  358053   831632   1482244

DATA2 (including pre-BMT, during BMT, and post-BMT records):

id            Id_sex  birthday  In_date   Out_date  E_bedd  Tran_cd  Icd9cd  Icd9cd1  icdopcd  Icdopcd1  Dx_am  Room_am  Drug_am  Med_am
1122ab33c5..  F       19580210  20030805  20031001  57      2        20500   03482    9925     8607      15942  61320    237694   462431
1122ab33c5..  F       19580210  20030805  20031001  1       3        20500   1975     9925     8607      546    1095     0        2005
1122ab33c5..  F       19580210  20011215  20020208  48      1        20500   6822     4103     9925      69120  904218   722973   2579172
1134ac34c6..  M       19751122  20050719  20051130  134     4        20153   99685    4105     8607      69120  904218   722973   2579172
2456b578ef..  F       19690516  20030113  20030204  22      3        20021   2880     8607     4101      11897  137262   138717   378661
2456b578ef..  F       19690516  20031025  20031204  40      2        20400   2880     Blank    Blank     9963   34155    59627    177133
2456b578ef..  F       19690516  20031025  20031204  40      4        20400   486      0392     9925      2184   7245     55737    88608
ab2457cdg3..  M       19501030  20050413  20050720  49      2        20500   2880     9925     4105      16107  55125    364826   633212
ab2457cdg3..  M       19501030  20050413  20050720  30      2        20500   2880     9925     3893      15075  196530   119210   471444
ab2457cdg3..  M       19501030  20050413  20050720  19      3        20500   03842    3324     9925      10469  147747   80434    295218
ab2457cdg3..  M       19501030  20050817  20051011  45      2        20500   Blank    Blank    Blank     13885  50625    190254   418414
ab2457cdg3..  M       19501030  20050817  20051011  10      5        20500   2880     9925     Blank     3573   11250    95807    173013

Please show me how to create syntax for keeping pre-BMT and post-BMT records as two separate files. Thank you so much.
