LISTSERV at the University of Georgia
Date:   Tue, 28 Feb 2006 19:21:49 +0100
Reply-To:   "adel F." <adel_tangi@YAHOO.FR>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "adel F." <adel_tangi@YAHOO.FR>
Subject:   Re: QR:cluster analysis with binary variables
Comments:   To: Rogerio Porto <rdporto1@TERRA.COM.BR>
In-Reply-To:   <000801c63c81$4ae44740$d3dca7c8@THINQ>
Content-Type:   text/plain; charset=iso-8859-1

Thanks a lot for all this information. It is really helpful, together with the information that Dennis gave. I will try these suggestions and come back to the list.

Adel

Rogerio Porto <rdporto1@TERRA.COM.BR> wrote:

"adel F." wrote:

> I have used 12 binary variables.
> First, I have done the correspondence analysis to extract axes (first
> step), the two first axes explain 34% of the variance.
> I have used 9 axes in the cluster step (second step), the 9 axes
> explain 92% of the variance.

Technically, you are throwing away only a little information (8%). You could use all 12 axes in your cluster analysis.

> I have used the cluster analysis, with the results from correspondence
> analysis, because my understanding is that Cluster analysis are
> appropriate for continuous variables and not for binary variables, as in
> my case.

Actually, you can do cluster analysis with any kind of variable: nominal, ordinal, interval or ratio scales. For each one you have to choose an appropriate distance measure. There are tons of them, and you may have to spend some time choosing the most suitable one. You can also create your own distance measure, but that would take a little programming.
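To make the idea of a distance measure for binary variables concrete, here is a minimal sketch in Python (not SAS) of two common choices, the simple matching distance and the Jaccard distance. The function names and the two example response vectors are made up for illustration; they are not taken from the %DISTANCE macro or PROC DISTANCE.

```python
def simple_matching_distance(a, b):
    """Fraction of positions where two equal-length 0/1 vectors disagree."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def jaccard_distance(a, b):
    """1 - (both are 1) / (at least one is 1); joint absences (0, 0) are ignored."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        # Two all-zero vectors: treated as identical here (a convention, assumed).
        return 0.0
    return 1 - both / either


# Hypothetical respondents measured on 12 binary variables.
resp_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
resp_b = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1]

sm = simple_matching_distance(resp_a, resp_b)  # 3 mismatches out of 12
jc = jaccard_distance(resp_a, resp_b)          # 2 joint presences out of 5
print(sm, jc)
```

The two measures differ in how they treat joint absences: simple matching counts a shared 0 as agreement, while Jaccard ignores it. Which one suits the data depends on whether "both absent" is meaningful.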

These distance measures can be computed using a macro supplied by SAS (the %DISTANCE macro):

http://support.sas.com/ctx/samples/index.jsp?sid=475

If you are using SAS 9.1, you can compute the distances using the new PROC DISTANCE.

The %DISTANCE macro computes various measures of distance, dissimilarity, or similarity between the observations (rows) of a SAS data set. These proximity measures are stored as a lower triangular or a square matrix in an output data set, depending on the SHAPE= option, and that data set can then be used as input to the CLUSTER, MDS, or MODECLUS procedures.
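As a rough illustration of the lower triangular versus square layouts mentioned above, the following Python sketch builds both from the same rows. It only mimics the idea; the function names, the `shape` argument values, and the sample data are assumptions for this example and do not reproduce the actual output format of the %DISTANCE macro.

```python
def mismatch_fraction(a, b):
    """Simple matching distance between two equal-length 0/1 rows."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def proximity_matrix(rows, distance, shape="triangle"):
    """Pairwise distances between rows, as a lower triangle or a full square."""
    n = len(rows)
    if shape == "triangle":
        # Row i holds distances to rows 0..i (diagonal included).
        return [[distance(rows[i], rows[j]) for j in range(i + 1)] for i in range(n)]
    if shape == "square":
        return [[distance(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    raise ValueError("shape must be 'triangle' or 'square'")


# Hypothetical observations on four binary variables.
rows = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]

tri = proximity_matrix(rows, mismatch_fraction, shape="triangle")
sq = proximity_matrix(rows, mismatch_fraction, shape="square")
for line in tri:
    print(line)
```

Because a distance is symmetric, the lower triangle carries the same information as the square matrix in roughly half the storage.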

I think this agrees with what Dennis Fisher said about doing cluster analysis directly on the binary variables.

HTH,

Rogerio Porto.
