LISTSERV at the University of Georgia
=========================================================================
Date:         Mon, 31 Jul 2006 13:27:04 +0200
Reply-To:     Mark Webb <targetlk@iafrica.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Mark Webb <targetlk@iafrica.com>
Subject:      Re: Distance from cluster centre query.
Comments: To: Spousta Jan <JSpousta@CSAS.CZ>
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
              reply-type=original

Thanks for this, Jan. I may well use your suggestion and compute the centroids, BUT I would like to discuss the idea of a cluster centroid in the context of what I'm trying to do. I'm finding that discriminant analysis [DA], with the clusters as the dependent variable and the statements used to form the clusters as the independent variables, is not working well in practice. I would like to remove "weakly" associated respondents from each cluster and put them into an additional cluster representing "unclassifiable". I was hoping to define these weak respondents using the distance-from-centroid idea, but I most often use hierarchical methods [Ward's], hence my initial query. Do you think what I'm suggesting is feasible? I would then run DA on the original clusters plus one.
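A rough sketch of the reassignment described above, in SPSS syntax. It assumes a 5-cluster solution saved as CLU5_1 and a case-to-centroid variable named distance, as produced by the syntax quoted further down. The cutoff (cluster mean distance plus 2 standard deviations), the new variable names dmean, dsd and clus_new, and the use of Zmpg to Zaccel as predictors are illustrative assumptions only. It also assumes a release in which AGGREGATE supports MODE=ADDVARIABLES.

```spss
* Per-cluster mean and SD of the distance, added to every case.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=CLU5_1
  /dmean = MEAN(distance)
  /dsd = SD(distance).

* Copy the cluster number, then move far-out cases to a sixth group.
COMPUTE clus_new = CLU5_1.
IF (distance > dmean + 2*dsd) clus_new = 6.
VALUE LABELS clus_new 6 'Unclassifiable'.
EXECUTE.

* DA on the original clusters plus one.
DISCRIMINANT
  /GROUPS=clus_new(1,6)
  /VARIABLES=Zmpg TO Zaccel.
```

In a single-case cluster the SD is missing, so the IF condition is never true and that case keeps its original cluster number. Whether 2 standard deviations is a sensible cutoff depends on the data; a percentile-based rule would work the same way.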

Regards

Mark

----- Original Message -----
From: "Spousta Jan" <JSpousta@CSAS.CZ>
To: "Mark Webb" <targetlk@iafrica.com>; <SPSSX-L@LISTSERV.UGA.EDU>
Sent: Monday, July 31, 2006 12:55 PM
Subject: RE: Distance from cluster centre query.

Hi Mark,

While K-Means operates in a metric Euclidean space or something similar, and can therefore easily define the centroids (and uses them during the computation), the hierarchical algorithm can be used in more general topological spaces where there are no well-defined centroids. Imagine clustering species; take a cluster {baboon, human, chimpanzee} - what is the centroid here? Michael Jackson? Really hard to say. And that is perhaps the reason why SPSS does not prompt you to save the centroid-derived statistics.

Otherwise, if you think that they really do make sense, you can easily compute the centroid coordinates using Aggregate and add them to the file. Then you can compute the case-to-centroid distance using the familiar formula for the Euclidean distance.

Unfortunately, my SPSS 14 is broken now, so I will draft the example syntax in SPSS 12, which is more cumbersome because Aggregate there lacks the ADDVARIABLES mode.

GET FILE='C:\Program Files\SPSS\Cars.sav'.
SELE IF nmiss(mpg to cylinder)=0 and uniform(1) < 0.2.
DESCRIPTIVES mpg to accel /SAVE.
CLUSTER Zmpg to Zaccel /SAVE CLUSTER(5).

*Save the coordinates of the centroids.
AGGREGATE /OUTF='C:\Program Files\SPSS\aggr.sav'
 /BREAK=CLU5_1
 /Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight Zaccel).

*Add them to the file.
SORT CASES BY CLU5_1 (A).
MATCH FILES /FILE=* /TABLE='C:\Program Files\SPSS\aggr.sav' /BY CLU5_1.
exe.

*Compute the Euclidean distance case-centroid.
comp distance = 0.
do repe centr = Cmpg to Caccel /case = Zmpg to Zaccel.
- comp distance = distance + (centr-case)**2.
end repe.
comp distance = sqrt(distance).
var lab distance "Distance case-centroid".
exe.

*End of the example.
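
For comparison, a sketch of how the "save" and "add" steps might collapse into one command where the ADDVARIABLES mode mentioned above is available. This is untested here and assumes the Z-scores and CLU5_1 already exist in the working file, as created by the DESCRIPTIVES and CLUSTER commands in the example.

```spss
* Centroid coordinates appended to each case, with no SORT or MATCH FILES step.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=CLU5_1
  /Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight Zaccel).
```

The distance computation then proceeds exactly as in the DO REPEAT block of the example, since the new centroid variables are appended to the end of the file in the same order.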

Greetings

Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Mark Webb
Sent: Monday, July 31, 2006 7:43 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Distance from cluster centre query.

In K-Means it's possible to save this information as a variable. Is this possible in any of the hierarchical methods offered in SPSS? They offer a proximity matrix, which I see as different, as this shows distances between individual respondents, NOT from the classification mean. Am I missing something?

Regards
