**Date:** Mon, 31 Jul 2006 12:55:44 +0200
**Reply-To:** Spousta Jan <JSpousta@CSAS.CZ>
**Sender:** "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
**From:** Spousta Jan <JSpousta@CSAS.CZ>
**Subject:** Re: Distance from cluster centre query.
**Content-Type:** text/plain; charset="US-ASCII"
Hi Mark,

K-Means operates in a Euclidean metric space or something similar, so
it can easily define the centroids (and it uses them during the
computation). The Hierarchical algorithm, by contrast, can be used in
more general topological spaces where there are no well-defined
centroids. Imagine clustering species: take the cluster {baboon, human,
chimpanzee}. What is the centroid here? Michael Jackson? Really hard to
say. That is perhaps the reason why SPSS does not offer to save
centroid-derived statistics for hierarchical clustering.

Otherwise, if you think that centroids really do make sense for your
data, you can compute the centroid coordinates easily using AGGREGATE
and add them to the file. Then you can compute the case-to-centroid
distance using the familiar formula for the Euclidean distance.
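
For reference, the "familiar formula" can be written out as follows. The notation is an assumption introduced here for illustration and does not appear in the original message: the centroid of a cluster is the per-variable mean of its members' standardized values, and each case's distance is measured to the centroid of its own cluster.

```latex
% Assumed notation (not in the original message):
%   z_{ij}  standardized value of variable j for case i (j = 1, ..., p)
%   C_k     set of cases assigned to cluster k, with n_k members
%   c_{kj}  coordinate j of the centroid of cluster k
c_{kj} = \frac{1}{n_k} \sum_{i \in C_k} z_{ij},
\qquad
d_i = \sqrt{\sum_{j=1}^{p} \left( z_{ij} - c_{kj} \right)^2}
\quad \text{for } i \in C_k .
```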

Unfortunately, my SPSS 14 is broken now, so I will draft the example
syntax in SPSS 12, which is more cumbersome because its AGGREGATE lacks
the ADDVARIABLES mode.

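*Read the data and keep a random sample (about 20%) of the complete cases.
GET FILE='C:\Program Files\SPSS\Cars.sav'.
SELE IF nmiss(mpg to cylinder)=0 and uniform(1) < 0.2.
*Standardize the variables (saved as Zmpg to Zaccel) and save 5 clusters.
DESCRIPTIVES mpg to accel /SAVE.
CLUSTER Zmpg to Zaccel /SAVE CLUSTER(5).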

*Save the coordinates of the centroids.
AGGREGATE /OUTF='C:\Program Files\SPSS\aggr.sav' /BREAK=CLU5_1
/Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight
Zaccel).

*Add them to the file.
SORT CASES BY CLU5_1 (A) .
MATCH FILES /FILE=* /TABLE='C:\Program Files\SPSS\aggr.sav' /BY CLU5_1.
exe.

*Compute the Euclidean distance case-centroid.
comp distance = 0.
do repe centr = Cmpg to Caccel /case = Zmpg to Zaccel.
- comp distance = distance + (centr-case)**2.
end repe.
comp distance = sqrt(distance).
var lab distance "Distance case-centroid".
exe.

*End of the example.
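
In versions whose AGGREGATE supports the ADDVARIABLES mode mentioned above (SPSS 14, according to this message), the "save the centroids" and "add them to the file" steps can probably be collapsed into a single command. This is a minimal, untested sketch that assumes the same variable names as in the example and that CLU5_1 already exists in the active file:

```spss
*Sketch only: append the cluster centroids to every case in one step.
*Replaces the AGGREGATE, SORT CASES and MATCH FILES commands above.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=CLU5_1
  /Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight
   Zaccel).
```

The DO REPEAT block that computes the distance stays the same, since it only relies on the centroid variables Cmpg to Caccel being present in the file next to Zmpg to Zaccel.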

Greetings

Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Mark Webb
Sent: Monday, July 31, 2006 7:43 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Distance from cluster centre query.

In K-Means it's possible to save this information as a variable.
Is this possible in any of the hierarchical methods offered in SPSS?
They offer a proximity matrix, which I see as different, since it shows
distances between individual respondents, NOT distances from the
cluster mean.
Am I missing something?