LISTSERV at the University of Georgia
Date:         Fri, 8 Feb 2008 14:48:28 -0800
Reply-To:     "Dennis G. Fisher, Ph.D." <dfisher@CSULB.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Dennis G. Fisher, Ph.D." <dfisher@CSULB.EDU>
Subject:      Re: Hierarchical agglomerative clustering - a couple of questions
Comments: To: "cat.." <cat.b41@GMAIL.COM>
In-Reply-To:  <181cc41e-b4ac-476a-8102-6dd7c65ba50e@d21g2000prf.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

The algorithm is not really the issue here. The issue is the proximity measure. The measure designed specifically for mixed categorical and continuous variables is Gower's coefficient. Other measures have been "forced" into service for these applications, but their use may sometimes be questioned by knowledgeable reviewers.

The number of variables to use for cluster analysis can vary quite a bit. Using too few will make the analysis trivial, and using too many may make the results hard to interpret. A "sweet spot" may be from 6 to 18 variables for a "nice" cluster analysis, although people have done good analyses with more or fewer than this.

A good reference is Aldenderfer, M. S., & Blashfield, R. K. (1984). Cluster analysis. Beverly Hills: Sage.

HTH
Dennis Fisher
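To make the proximity measure concrete, here is a minimal sketch of Gower's coefficient expressed as a distance (1 minus similarity), written in plain Python. It assumes equal weights and no missing values. A continuous variable contributes its absolute difference divided by its observed range, and a categorical variable contributes 0 on a match and 1 on a mismatch. The patient records and the function names are hypothetical and serve only as an illustration.

```python
from typing import List, Optional, Sequence, Union

Value = Union[float, str]


def numeric_ranges(rows: Sequence[Sequence[Value]],
                   is_categorical: Sequence[bool]) -> List[Optional[float]]:
    """Return max - min for each continuous column, None for categorical ones."""
    ranges: List[Optional[float]] = []
    for k, categorical in enumerate(is_categorical):
        if categorical:
            ranges.append(None)
        else:
            column = [row[k] for row in rows]
            ranges.append(max(column) - min(column))
    return ranges


def gower_distance(a: Sequence[Value], b: Sequence[Value],
                   ranges: Sequence[Optional[float]]) -> float:
    """Gower distance with equal weights and no missing values (assumptions)."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical variable: simple mismatch indicator.
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Continuous variable: range-normalised absolute difference.
            total += abs(x - y) / r
        # A continuous variable with zero range contributes nothing.
    return total / len(ranges)


# Hypothetical patients: (age in years, sex, smoking status).
patients = [
    (30.0, "F", "smoker"),
    (40.0, "F", "non-smoker"),
    (50.0, "M", "non-smoker"),
]
is_categorical = [False, True, True]

ranges = numeric_ranges(patients, is_categorical)
d01 = gower_distance(patients[0], patients[1], ranges)  # (0.5 + 0 + 1) / 3
d02 = gower_distance(patients[0], patients[2], ranges)  # (1 + 1 + 1) / 3
print(d01, d02)
```

The resulting pairwise distances could then be passed to any hierarchical agglomerative routine that accepts a precomputed distance matrix.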

cat.. wrote:
> Dear SAS L-ers,
>
> I was wondering whether a cluster analysis method exists that is
> capable of detecting groups of patients when they are characterised by
> both categorical and continuous variables. What is the distance used
> between patients? What is the algorithm for merging clusters
> together? Any reference for that?
>
> Another question I was wondering about: I've heard that it is
> recommended not to use too many variables to define clusters. Does anyone
> know a rule of thumb for the maximum number of descriptors to use
> according to the sample size? Any reference for that? (FYI: This
> topic is not considered in the book "Statistical Rules of Thumb")
>
> Thank you very much.
>
> Catherine.

--
Dennis G. Fisher, Ph.D.
Professor and Director
Center for Behavioral Research and Services
1090 Atlantic Avenue
Long Beach, CA 90813
Ph: 562-495-2330 x121
Fax: 562-983-1421

