Date: Sat, 8 Sep 2007 14:44:43 -0400
Reply-To: Florio Arguillas <foa2@cornell.edu>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Florio Arguillas <foa2@cornell.edu>
Subject: Re: how to do this efficiently
Hi Rodrigo,
Here's another approach. This involves a double-transpose of your data.
1. Create a subset of your data that contains ALL
records but only the following 3 variables: HH,
Item, and a variable that has two or more valid
values and, ideally, no system-missing values
(let's call this variable XX). I say "ideally"
because if XX has no system-missing values, the
records added later on are easy to spot: they are
the only ones where XX is missing.
2. Restructure this file to wide form
(CASESTOVARS) to create 11 item variables. Each
new variable is named with the XX variable name
as a prefix, followed by the value of your Item
variable (for example, xx.1.00).
3. Restructure this wide form back to long form
(VARSTOCASES). The result is a rectangular data
structure -- each HH will have 11 records, one
for each item. You can tell which records in the
rectangular file were NOT in your subset data
because XX is missing for them.
4. You then match this rectangular file to your original data set.
Best regards,
Florio
Here's some sample code.
*Step 1 - here I am creating sample data, but
in your case you may just have to submit a SAVE
OUTFILE command with a DROP subcommand to create your
subset, because you already have an existing SPSS data set.
DATA LIST FREE / HH Item xx.
BEGIN DATA.
1001008011 1 0
1001008011 2 0
1001008011 3 0
1001008011 4 0
1001008011 5 1
1001008011 6 1
1001008011 8 1
1001008011 10 1
1001008011 11 1
1001008021 1 0
1001008021 2 0
1001008021 3 0
1001008021 4 0
1001008021 7 1
1001008021 8 1
1001008021 9 0
1001008021 10 0
1001008021 11 0
END DATA.
XSAVE OUTFILE 'test'.
*Step 2.
SORT CASES BY HH Item .
CASESTOVARS
/ID = HH
/INDEX = Item
/GROUPBY = VARIABLE .
*Step 3 - in this step I commented out the DELETE
VARIABLES xx command so you can see what happens when
you restructure back to long form.
*Variable XX is not necessary after this step, so
you can delete it if you want.
VARSTOCASES /MAKE xx FROM xx.1.00 xx.2.00 xx.3.00 xx.4.00 xx.5.00 xx.6.00
xx.7.00 xx.8.00 xx.9.00 xx.10.00 xx.11.00
/INDEX = Item(11)
/KEEP = HH
/NULL = KEEP.
*DELETE VARIABLES xx.
SAVE OUTFILE 'testa'.
*Step 4 - both files must be sorted by HH and Item before matching.
MATCH FILES /FILE = 'testa'
/FILE = '<your original data set>'
/BY HH Item.
SAVE OUTFILE 'testab'.
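
If you start from an existing data set rather than the sample data, Steps 1 and 4 might look roughly like the sketch below. The file names 'original.sav' and 'original_sorted.sav' and the flag variable in_orig are placeholders, not names from your data. The /IN subcommand of MATCH FILES is used here as one possible way to flag the added records, so that you do not have to rely on XX being free of system-missing values.

```spss
* Assumption: 'original.sav' stands in for your existing data set.
GET FILE = 'original.sav'.
SORT CASES BY HH Item.
SAVE OUTFILE = 'original_sorted.sav'.

* Step 1 on existing data: keep only HH, Item, and xx, then load that subset.
SAVE OUTFILE = 'test' /KEEP = HH Item xx.
GET FILE = 'test'.

* Run Steps 2 and 3 from the sample code here, ending with SAVE OUTFILE 'testa'.

* Step 4 with a flag: in_orig is 1 for records found in the original file and 0 for added records.
MATCH FILES /FILE = 'testa'
 /FILE = 'original_sorted.sav'
 /IN = in_orig
 /BY HH Item.
SAVE OUTFILE = 'testab'.
```

A FREQUENCIES run on in_orig should then show how many records were added.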
At 05:22 PM 9/6/2007, Rodrigo Briceno wrote:
>Dear SPSS users. I have a problem with one of the databases that I’m
>processing.
>
>My DB has information about households (HH) and their expenditures on a set of
>items (11 items that include food, transport, health, etc.). Each of these
>items has a unique code, e.g. food is 1, health is 5, etc. The thing is
>that the people who prepared the database didn’t enter a value when the HH
>didn’t report expenditures on one or more of the items.
>
>What I want to do is to locate the HHs that do not have code 1 (for
>food, since this is the first in the list for each HH) in order to have the
>full list of HHs even if they didn’t report food expenditures.
>
>Ex.
>
>HH Item
>1 1
>1 2
>1 5
>1 4
>2 1
>2 2
>2 5
>2 4
>3 2
>3 3
>3 4
>3 5
>
>I would like to add a new case for HH 3 with item 1,
>even if that case has only missing values. (I can share an extract of
>the DB for people interested in helping me). Thanks.
>
>__________________________________________________________________
>
>Rodrigo Briceño
>Project Manager
>Sanigest Internacional
>
>+506 291 1200 ext. 113 Oficina Costa Rica
>+506 232 0830 Fax
>+506 886 1177 Celular
>rbriceno@sanigest.com
>www.sanigest.com
>
>MSN: jbric98@hotmail.com
>SKYPE: rbriceno1087
>