Date: Tue, 3 Dec 2002 10:31:15 -0000
Reply-To: Jean Russell <j.russell@sheffield.ac.uk>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Jean Russell <j.russell@sheffield.ac.uk>
Subject: Re: String variables: breaking a list into it's parts
In-Reply-To: <5.2.0.9.2.20021203092254.00ad6da0@clyde.its.unimelb.edu.au>
Content-Type: text/plain; charset="us-ascii"
These solutions that write the data out and read it back in do not solve the
problem; they just transfer it.
The question is what is the maximum possible number of repeats. If she does
not know the maximum number of items in a list then she does not know how
many variables to read in!
I sent the respondent a dirty way to find the maximum possible last night.
Here it is more fully.
She can obtain the maximum possible number of characters in a string by
either asking for file info or looking at the string length of the variable
in the variable view. Let's call that l.
She also knows the length of the string she is searching for (let's call
that s).
Allowing 2 characters for the ", " separator between items, n repetitions
need n*s + 2*(n-1) characters, which must not exceed l.
The maximum possible number of repetitions is therefore trunc((l+2)/(s+2)).
Example
-------
That is equivalent to the list being 26 characters long and the search
string being "cheese". The fullest possible list string would then be
"cheese, cheese, cheese, ch", and as l is 26 and s is 6 we get
trunc((26+2)/(6+2))=trunc(28/8)=3
I have a feeling this could be mechanised by combining script and syntax but
that is beyond me.
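One way this might be done in plain syntax, without a script, is to count
the commas in each list rather than work from the declared string width.
The following is only a minimal sketch, not a tested solution. It assumes
the working file still holds the string variable "list" from the data list
quoted below, that items are separated by commas, and that no item contains
a comma itself. The variable name nitems is made up for illustration.

```spss
* Start at 1 item per case, then add 1 for every comma found.
compute nitems = 1.
loop #i = 1 to length(rtrim(list)).
if (substr(list, #i, 1) = ',') nitems = nitems + 1.
end loop.
* Report the largest item count over all cases.
descriptives variables=nitems /statistics=max.
```

The maximum reported is the number of item variables that would need to be
read in. Note that a case with an empty list is still counted as having 1
item, which does no harm when only the maximum is wanted.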
Jean M. Russell
-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]On Behalf Of
Simon Freidin
Sent: 02 December 2002 22:26
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: String variables: breaking a list into it's parts
At 01:06 PM 12/2/2002 -0500, Johnson, Wendy SBCCOM(N) wrote:
>Hello,
>
>I often have open-ended responses in my data sets, including lists of
>items separated by commas, like so:
>What I want is to find out how many times each item was mentioned. To do
>this, I've come up with some syntax which will take a list and break it
>down into individual items (see below). Basically it works, but I have to
>know (or guess) the greatest number of items in any given list. This will
>never be the case. Any suggestions (or general comments) would be greatly
>appreciated.
set printback=listing mxwarns=0.
data list list (tab)/id (f4) list (a80).
begin data
2341 skittles
3333 cheese, peanut butter
4213 beef curry, green eggs and ham, skittles
1125 cigarettes, beverage base, m&ms, gatorade, twix
1238 gum, peanuts, grape juice, coffee, potato chips
end data.
execute.
/* write data out so can read back comma delimited (up to 1000 30-char items per case) */
write outfile='c:\temp\list.txt'/id ',' list.
execute.
data list file='c:\temp\list.txt' list (",")/id (f4) item1 to item1000
(1000a30).
execute.
/* write out individual items (if they exist) comma delimited */
do repeat x=item1 to item1000.
do if x ne ''.
write outfile='c:\temp\list2.txt'/id ',' x.
end if.
end repeat.
execute.
/* read individual items */
data list file='c:\temp\list2.txt' list (",")/id (f4) item (a30).
execute.
/* strip space prefixes and set to lower case (or use upcase if you prefer) */
if substr(item,1,1)=' ' item=substr(item,2).
compute item=lower(item).
freq vars=item.
Research Database Manager and Analyst
Melbourne Institute of Applied Economic and Social Research
The University of Melbourne
MELBOURNE VIC 3010
Tel: (03) 8344 3601 Fax: (03) 8344 5630