Date: Fri, 13 Dec 2002 12:37:23 -0500
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Arthur J. Kendall" <Art@DrKendall.org>
Organization: Social Research Consultants
Subject: Re: dissimilarities in Clustering
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
What dissimilarity coefficient are you using? SPSS has more than a
couple dozen. The formulae are in the syntax manual and should be in the
algorithms section of www.spss.com. Many identical coefficients have
been reintroduced under different names in different disciplines.
You might also post a question on class-l which you can get to on
You can post the name you have for your coefficient and the formula if
you have it and ask whether the same coefficient has other names.
Hope this helps.
Social Research Consultants
University Park, MD USA
Don Eduardo Miranda wrote:
> I am trying to cluster some variables using hierarchical clustering.
> However, I only have the matrix of dissimilarities, not the actual
> data matrix (variables * cases).
> I was wondering if I could pass this dissimilarity matrix as the
> argument of the CLUSTER command.
> I have seen this done in some examples that use the PROXIMITIES command,
> but I wonder about the correctness of this approach.
> My confusion comes from the fact that the CLUSTER command also expects
> a dissimilarity measure type and assumes SEUCLID as the default, and
> in most of the cases I have seen, the command is
> applied to the original data matrix, so I assume it
> internally generates a dissimilarity matrix from the data it is
> given. So if I pass my dissimilarity matrix as the argument, the
> actual algorithm would be working on the dissimilarities of the
> dissimilarity matrix (something like a meta-dissimilarity), and
> once at that level I have no idea what to expect from my clustering.
> In addition, the dissimilarity measure I am using does not
> belong to the set of dissimilarity measures supported by SPSS, so I don't
> know whether assuming the default dissimilarity measure
> (if I don't indicate any measure) will alter my results, since
> some steps of the clustering algorithm need to recalculate the
> distances as the clusters are formed.
> Could you please help me clarify this?
> thank you very much
> Eduardo Miranda
> Departamento de Informatica da FCT/ UNL
> Quinta da Torre, 2829-516 Caparica, Portugal
> Tel: +351-21 294 85 36 - Ext. 10731
> Fax: +351-21 294 85 41
> E-mail: email@example.com