Date: Mon, 24 Jan 2011 09:34:45 -0500
Reply-To: Gene Maguin <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Gene Maguin <firstname.lastname@example.org>
Subject: Re: Summation
Content-Type: text/plain; charset="us-ascii"
I picture your data as being either persons within communities within
regions or just communities within regions. Not sure which but I think the
first case. So then the value for community population is repeated across
cases in each community. Given this, it seems you need to pick a single case
in each community so that summing over communities gives the region population.
If you agree, then I don't think there is a way to do this within the
SUMMARIZE command, although I'm sure others are better with this command than
I am. I'd take one of these two initial steps:
1) sort cases by region and community and number cases within
community-region groups, select (temporary, select if would be fine) the
first case, and then run the summarize command.
2) Aggregate the file breaking on region and community and keep either the
first, last or mean of the population variable.
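
A minimal sketch of both options, untested against your data. It assumes the
variables are named region, community and population; "community" is a guess
at the name, since the original post doesn't give it, and "seq" is just a
helper variable invented here. The two options are alternatives, not a
sequence: run one or the other.

```spss
* Option 1: number cases within region-community groups, then summarize.
* only the first case of each community. TEMPORARY limits the selection.
* to the next procedure, so the full file stays in place afterwards.
SORT CASES BY region community.
COMPUTE seq = 1.
IF (region = LAG(region) AND community = LAG(community)) seq = LAG(seq) + 1.
TEMPORARY.
SELECT IF (seq = 1).
SUMMARIZE
  /TABLES=population BY region
  /FORMAT=NOLIST TOTAL
  /CELLS=COUNT SUM.

* Option 2: collapse to one case per community, then summarize.
* Note that OUTFILE=* replaces the active dataset with the aggregated file.
AGGREGATE
  /OUTFILE=*
  /BREAK=region community
  /population = FIRST(population).
SUMMARIZE
  /TABLES=population BY region
  /FORMAT=NOLIST TOTAL
  /CELLS=COUNT SUM.
```

In both cases COUNT should come out as the number of communities per region
and SUM as the region population, provided population really is constant
within each community.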
On balance, I'd prefer 1) over 2) because the working dataset is not
>>I have 1190 individual cases, along with information about the population
(number) in the community and which region it belongs to (string with
4 categories). There are different numbers of cases in each community, and
I'd now like to sum the population in each of the 4 regions, but of
course counting each community only once.
My following syntax doesn't work, because it takes the sum of population for
all the cases by region, not only each community once:
SUMMARIZE
/TABLES=population BY region
/CELLS=COUNT SUM .
Do I have to aggregate by community first?
Thanks for any help.