Date: Fri, 11 Oct 2002 21:13:34 -0400
Reply-To: Raynald Levesque <rlevesque@videotron.ca>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Raynald Levesque <rlevesque@videotron.ca>
Subject: Re: Looking for help with syntax-data manipulation
In-Reply-To: <1F1A2BE0E6BCD511AF960002A589D4FD215EAE@plum>
Content-type: text/plain; charset=iso-8859-1
Hi
This is one approach:
DATA LIST LIST /diseases(A16) meats dairy eggs medicine drug perfume oil (7F1).
BEGIN DATA
'bone cancer' 1 1 . 1 . 1 .
'Brain cancer' 1 . 1 . 1 . 1
'Lung cancer' . 1 . . . . 1
END DATA.
LIST.
* Add a constant key so the aggregated name list can later be matched to every case.
COMPUTE nobreak=1.
SAVE OUTFILE='c:\temp\original data.sav'.
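* Transpose so each food becomes a case; FLIP stores the old variable names in case_lbl.
FLIP VARIABLES=meats dairy eggs medicine drug perfume oil.
COMPUTE nobreak=1.
COMPUTE idx=$CASENUM.
* Copy the name of the k-th food into foodk (one name per transposed case).
STRING food1 TO food7(A8).
VECTOR food=food1 TO food7.
COMPUTE food(idx)=case_lbl.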
* Collapse to a single case holding all seven names (MAX keeps the non-blank string).
AGGREGATE OUTFILE=*
 /PRESORTED
 /BREAK=nobreak
 /food1 TO food7=MAX(food1 TO food7).
* Attach the name list to every disease in the saved file, using the constant key.
MATCH FILES TABLE=*
 /FILE='c:\temp\original data.sav'
 /BY=nobreak.
* Build the list by appending the name of each food whose flag equals 1.
STRING foodlist(A255).
VECTOR foodc=meats TO oil /food=food1 TO food7.
LOOP cnt=1 TO 7.
IF foodc(cnt)=1 foodlist=CONCAT(RTRIM(foodlist),' ',food(cnt)).
END LOOP.
SUMMARIZE
/TABLES=diseases foodlist
/FORMAT=VALIDLIST NOCASENUM TOTAL
/TITLE='List of food associated with diseases'
/MISSING=VARIABLE
/CELLS=NONE .
******************.
This is the output:
List of food associated with diseases
   DISEASES      FOODLIST
1  bone cancer   MEATS DAIRY MEDICINE PERFUME
2  Brain cancer  MEATS EGGS DRUG OIL
3  Lung cancer   DAIRY OIL
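The original question also mentions counting how many foods go with each disease and how many diseases go with each food. A minimal sketch, assuming the working file still holds the 1/missing flags in the adjacent variables meats TO oil (nfoods is a name made up for this example; with the real 53-column file, oil would be replaced by the last column name):
* Number of foods flagged for each disease.
COUNT nfoods=meats TO oil (1).
LIST VARIABLES=diseases nfoods.
* Number of diseases flagged for each food (the sum of each column).
DESCRIPTIVES VARIABLES=meats TO oil
 /STATISTICS=SUM.
With the sample data above, nfoods comes out as 4, 4 and 2, and the column sums are 2 for meats, dairy and oil and 1 for each of the other foods.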
HTH
Raynald Levesque rlevesque@videotron.ca
Visit my SPSS Pages http://pages.infinit.net/rlevesqu/index.htm
-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Gao, Peter
Sent: October 11, 2002 4:50 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Looking for help with syntax-data manipulation
Dear list members,
I hope someone can help me with this:
I have a matrix of variables: each record (case) is a disease name, and the
variables (columns) are names of foods or medicines. Not all entries have a
value (there are missing values here and there). I am trying to produce
output that shows which diseases each food is associated with, or which
foods each disease is associated with.
Example of my data (actual data file is large with 53 columns):
diseases      meats  dairy  eggs  medicine  drug  perfume  oil
bone cancer   1      1      .     1         .     1        .
Brain cancer  1      .      1     .         1     .        1
Lung cancer   .      1      .     .         .     .        1
The output or data would be something like:
bone cancer: meats, dairy, medicine, perfume
Brain cancer: meats, eggs, drug, oil
Lung cancer: dairy, oil
The resulting list has a varying number of items for each disease.
I hope someone with the expertise can write the syntax to do it. Many thanks
in advance.
Have a nice weekend.
Peter Gao