LISTSERV at the University of Georgia
=========================================================================
Date:         Mon, 31 Jul 2006 12:55:44 +0200
Reply-To:     Spousta Jan <JSpousta@CSAS.CZ>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Spousta Jan <JSpousta@CSAS.CZ>
Subject:      Re: Distance from cluster centre query.
Comments: To: Mark Webb <targetlk@iafrica.com>
Content-Type: text/plain; charset="US-ASCII"

Hi Mark,

K-Means operates in a metric Euclidean space (or something similar), so it can easily define centroids and in fact uses them during the computation. The hierarchical algorithms, by contrast, can be used in more general topological spaces where there are no well-defined centroids. Imagine clustering species; take the cluster {baboon, human, chimpanzee}: what is the centroid here? Michael Jackson? Really hard to say. That is perhaps the reason why SPSS does not offer to save centroid-derived statistics for hierarchical clustering.

That said, if you think centroids really do make sense for your data, you can easily compute their coordinates with Aggregate and add them to the file. You can then compute the distance from each case to its cluster centroid using the familiar formula for the Euclidean distance.
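For reference, the two quantities involved can be written out as follows. The notation is only an illustrative assumption, not something SPSS prints: z_ij is the standardized value of variable j for case i, C_k is the set of n_k cases assigned to cluster k, k(i) is the cluster of case i, and p is the number of clustering variables.

```latex
\[
c_{kj} = \frac{1}{n_k} \sum_{i \in C_k} z_{ij},
\qquad
d_i = \sqrt{\sum_{j=1}^{p} \left( z_{ij} - c_{k(i),j} \right)^{2}}
\]
```

The first expression is what Aggregate computes with MEAN for each cluster; the second is the case-to-centroid distance.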

Unfortunately, my SPSS 14 is broken right now, so I will draft the example syntax in SPSS 12, which is more cumbersome because Aggregate there lacks the ADDVARIABLES mode.

GET FILE='C:\Program Files\SPSS\Cars.sav'.
SELE IF nmiss(mpg to cylinder)=0 and uniform(1) < 0.2.
DESCRIPTIVES mpg to accel /SAVE.
CLUSTER Zmpg to Zaccel /SAVE CLUSTER(5).

*Save the coordinates of the centroids.
AGGREGATE /OUTF='C:\Program Files\SPSS\aggr.sav'
  /BREAK=CLU5_1
  /Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight Zaccel).

*Add them to the file.
SORT CASES BY CLU5_1 (A) .
MATCH FILES /FILE=*
  /TABLE='C:\Program Files\SPSS\aggr.sav'
  /BY CLU5_1.
exe.

*Compute the Euclidean distance case-centroid.
comp distance = 0.
do repe centr = Cmpg to Caccel /case = Zmpg to Zaccel.
- comp distance = distance + (centr-case)**2.
end repe.
comp distance = sqrt(distance).
var lab distance "Distance case-centroid".
exe.

*End of the example.
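For comparison, here is an untested sketch of how the two middle steps (saving the centroids and matching them back) might collapse into a single command in a version whose Aggregate supports ADDVARIABLES mode, such as the SPSS 14 mentioned above. It assumes the same variable names as in the example (CLU5_1 and the Z-variables) and is meant as an illustration rather than verified syntax.

```
* Sketch only, assuming an SPSS version with ADDVARIABLES mode in Aggregate.
* Appends the cluster means directly to the active file, no sort or match needed.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=CLU5_1
  /Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight Zaccel).
```

The distance computation with DO REPEAT would then follow unchanged, since the new variables Cmpg to Caccel are again added to the end of the file as a contiguous block.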

Greetings

Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Mark Webb
Sent: Monday, July 31, 2006 7:43 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Distance from cluster centre query.

In K-Means it's possible to save this information as a variable. Is this possible in any of the hierarchical methods offered in SPSS? They offer a proximity matrix, which I see as different, as this shows distances between individual respondents, NOT the distance from the classification mean. Am I missing something?

