LISTSERV at the University of Georgia
Date:         Tue, 1 Jun 2004 16:56:22 -0400
Reply-To:     Art@DrKendall.org
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Art Kendall <Arthur.Kendall@verizon.net>
Organization: Social Research Consultants
Subject:      Re: Cluster analysis and normality
Comments: To: William Dudley <william.dudley@nurs.utah.edu>
In-Reply-To:  <s0bc82fc.028@gwdom2-med.med.utah.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Cluster analysis is an exploratory procedure. There is no assumption of normality. To draw parallel coordinate plots (aka profile graphs), it is necessary to have all of the variables on a common scale. You already have that.
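
As a rough illustration of what a profile graph draws, the sketch below computes the mean severity per symptom within each cluster. The function name `cluster_profiles`, the tiny data set, and the labels are all made up for the example; they are not from the original data.

```python
import numpy as np

def cluster_profiles(data, labels):
    """Mean score per variable within each cluster (one profile per cluster)."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    return {int(c): data[labels == c].mean(axis=0) for c in np.unique(labels)}

if __name__ == "__main__":
    # Hypothetical scores for 4 patients on 3 symptoms, all on the same 0-10 scale.
    scores = [[0, 2, 10], [2, 4, 8], [9, 1, 0], [7, 3, 2]]
    membership = [1, 1, 2, 2]  # assumed cluster assignments
    for cluster, profile in cluster_profiles(scores, membership).items():
        print(cluster, profile)
```

Plotting each profile as one line across the symptoms gives the profile graph; the lines are only comparable because every variable shares the same scale.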

Rarely is it appropriate to use just one procedure and one similarity measure. You should try different approaches to ensure your results are not particular to one method-coefficient combination. Treating the data as categorical and as continuous in different runs will give you some insight into the "reality" of your clusters.
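
One way to check this outside SPSS is sketched below in Python with SciPy and scikit-learn. It runs hierarchical clustering under several method/distance combinations and compares the resulting partitions with the adjusted Rand index. The data are synthetic, and the combinations in `COMBOS`, the helper names, and the choice of two clusters are assumptions made only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist
from sklearn.metrics import adjusted_rand_score

# Linkage method / distance pairs to compare (illustrative choices only).
# "hamming" counts mismatching answers, so it treats the scores as categories.
COMBOS = [
    ("average", "euclidean"),
    ("complete", "cityblock"),
    ("average", "hamming"),
]

def make_synthetic_symptoms(n_per_group=100, n_symptoms=10, seed=0):
    """Made-up stand-in data: 0 = symptom absent, 1-10 = severity."""
    rng = np.random.default_rng(seed)
    mild = rng.integers(0, 3, size=(n_per_group, n_symptoms))     # scores 0-2
    severe = rng.integers(8, 11, size=(n_per_group, n_symptoms))  # scores 8-10
    return np.vstack([mild, severe]).astype(float)

def cluster_labels(data, method, metric, k):
    """Cut a hierarchical clustering tree into at most k clusters."""
    distances = pdist(data, metric=metric)
    tree = linkage(distances, method=method)
    return fcluster(tree, t=k, criterion="maxclust")

def compare_solutions(data, combos=COMBOS, k=2):
    """Adjusted Rand index for every pair of method/distance runs."""
    labels = {c: cluster_labels(data, c[0], c[1], k) for c in combos}
    agreement = {}
    for i, a in enumerate(combos):
        for b in combos[i + 1:]:
            agreement[(a, b)] = adjusted_rand_score(labels[a], labels[b])
    return agreement

if __name__ == "__main__":
    for (a, b), ari in compare_solutions(make_synthetic_symptoms()).items():
        print(a, "vs", b, "-> adjusted Rand index:", round(ari, 3))
```

An index near 1 means two runs recover essentially the same groups. The synthetic groups here are deliberately far apart, so real symptom data should be expected to agree much less cleanly.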

Discriminant function analyses, ignoring conventional interpretation of the tests, are very useful in interpreting solutions. You would use the cluster membership as the group variable and the variables the clustering was based on as the predictors.
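
A minimal Python sketch of this descriptive use follows, assuming scikit-learn's `LinearDiscriminantAnalysis` as a stand-in for the SPSS DISCRIMINANT procedure. The synthetic data, the helper names, and the use of average-linkage clustering are assumptions for illustration. Cluster membership is the grouping variable and the clustering variables are the predictors; the correlations between each variable and the discriminant scores (structure coefficients) suggest which symptoms separate the clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def make_synthetic_symptoms(n_per_group=100, n_symptoms=10, seed=0):
    """Made-up data: everyone scores 0-2, except group one on symptoms 1-5."""
    rng = np.random.default_rng(seed)
    data = rng.integers(0, 3, size=(2 * n_per_group, n_symptoms)).astype(float)
    # Hypothetical pattern: the first group is severe (8-10) on symptoms 1-5 only.
    data[:n_per_group, :5] = rng.integers(8, 11, size=(n_per_group, 5))
    return data

def describe_clusters(data, k=2):
    """Cluster, then use discriminant analysis to describe the clusters."""
    # Average linkage on Euclidean distances (SciPy's default metric).
    labels = fcluster(linkage(data, method="average"), t=k, criterion="maxclust")
    lda = LinearDiscriminantAnalysis().fit(data, labels)
    # Resubstitution accuracy: descriptive only, optimistic by construction.
    accuracy = lda.score(data, labels)
    # Scores on the first discriminant function.
    scores = lda.transform(data)[:, 0]
    # Structure coefficients: correlation of each variable with the scores.
    structure = np.array(
        [np.corrcoef(data[:, j], scores)[0, 1] for j in range(data.shape[1])]
    )
    return labels, accuracy, structure

if __name__ == "__main__":
    labels, accuracy, structure = describe_clusters(make_synthetic_symptoms())
    print("reclassification accuracy:", accuracy)
    for j, r in enumerate(structure, start=1):
        print(f"symptom {j}: structure coefficient {r:+.2f}")
```

Because the groups were formed from the same variables used as predictors, the accuracy and any significance tests are not meaningful as tests; the output is useful only as a description of how the clusters differ.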

This could be a very interesting application. I would like to hear what you come up with.

Art
Art@DrKendall.org
Social Research Consultants
University Park, MD USA
(301) 864-5570

William Dudley wrote:

>I have symptom severity data across 10 symptoms (e.g. pain, anxiety,
>etc), with responses ranging from 1 (mild) to 10 (severe)
>that I would like to cluster analyze (my N is about 500).
>My problem is that not everyone has all symptoms.
>As many as 80% might report NOT having the symptom.
>In no case are the symptoms exclusionary as we might have if we had
>both males and females and asked about prostate enlargement for
>instance.
>That is, there is always some non zero probability that a given patient
>may exhibit a symptom.
>
>If I recode NOT having the symptoms as a zero and create a new score
>ranging from 0 to 10, then I get
>very non normal distributions. Even if I recode the 1 - 10 scores into
>mild, moderate, or severe, I end up with non normal distributions.
>If I only use those cases reporting all symptoms, I end up losing over
>90% of my sample.
>Of course, this scale is only approximately interval level, even before
>the recoding.
>
>I have thought of using the two step cluster analysis which allows for
>categorical variables, however, the symptom severity numbers,
>although non normal are certainly NOT categorical.
>
>My question, "How robust are the cluster analysis routines to
>deviations from normality?"
>
>or
>
>"Any suggestions on how to proceed?"
>
>Thanks in advance,
>Bill
>
>**********************************************************************
>
> William N. Dudley, PhD
> Emma Eccels Jones Nursing Research Center
> University of Utah
> College of Nursing
> 10 South 2000 East
> Salt Lake City, UT 84122-5880
> http://www.nurs.utah.edu/faculty/william_dudley.htm
>
>**********************************************************************

