LISTSERV at the University of Georgia
Date:         Fri, 4 Apr 1997 00:55:27 -0800
Reply-To:     hmaletta@overnet.com.ar
Sender:       "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From:         "Hector E. Maletta" <hmaletta@OVERNET.COM.AR>
Subject:      Re: cluster analysis question
Comments: To: "Jerry Vogt, Z4, ex. 4945 VOGTJ1 - AAL" <usaal66c@IBMMAIL.COM>
Content-Type: text/plain; charset=us-ascii

Jerry Vogt, Z4, ex. 4945 VOGTJ1 - AAL wrote:
> As a part of a market segmentation study, we are
> doing cluster analysis on about 3000 consumers
> who responded to a mail survey. We are analyzing
> about 50 of the variables in the survey. Most
> of these variables involved a 1-10 preference
> rating about an attribute and thus could be considered
> interval data. However, a few of the variables are
> nominal. In my background reading on cluster analysis,
> this problem of different levels of measurement
> is not extensively discussed (Churchill's
> "Marketing Research" 6th ed. does devote a few
> pages to the topic).

The discussion should be very brief: only interval measures are allowed, since cluster analysis is based on arithmetic means. Ordinal measures might nonetheless be acceptable, as long as you are willing to bet that the distances between consecutive ranks are not very different.

> My questions are:
> 1) how would you recommend I deal with this
> issue within SPSS using the K-means quick cluster?
> (standardize or transform the data, etc.)

It is absolutely necessary to standardize the data, producing z-scores (this is done by the DESCRIPTIVES procedure, which includes an option to copy z-scores as variables onto the working file). Otherwise, any clustering would be dependent on the particular units of measurement used for the variables (e.g. you change HEIGHT from inches to centimeters and there go your clusters...)
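
To illustrate the point about units outside SPSS, here is a minimal Python sketch. The height values are made up for the example, and the use of the sample (n - 1) standard deviation is an assumption of this sketch. It shows that z-scores come out the same whether HEIGHT is recorded in inches or in centimeters.

```python
import statistics

def z_scores(values):
    """Standardize a list of numbers: subtract the mean, divide by the sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1); an assumption here
    return [(v - mean) / sd for v in values]

# Hypothetical heights, first in inches, then converted to centimeters
height_in = [60.0, 64.0, 68.0, 72.0, 76.0]
height_cm = [h * 2.54 for h in height_in]

z_in = z_scores(height_in)
z_cm = z_scores(height_cm)

# The raw values differ by a factor of 2.54, but the standardized values agree
same = all(abs(a - b) < 1e-9 for a, b in zip(z_in, z_cm))
print(same)
```

Since the distances used for clustering are computed on the standardized values, the change of unit no longer affects the result.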

If you don't know in advance how many clusters you want to create, try CLUSTER on the complete file or a sample, then choose a reasonable number of clusters (k), and finally apply k-means QUICK CLUSTER.
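
The two-step idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not what CLUSTER or QUICK CLUSTER do internally: the hierarchical step uses centroid linkage, k is picked at the largest jump in merge distance (one possible reading of "a reasonable number"), the k-means step starts from a farthest-point initialization, and the sample data are invented. All function names are hypothetical.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def merge_distances(points):
    """Agglomerative clustering with centroid linkage; returns each merge distance."""
    clusters = [[p] for p in points]
    dists = []
    while len(clusters) > 1:
        # Find the pair of clusters whose centroids are closest
        d, i, j = min(
            (math.dist(centroid(a), centroid(b)), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        dists.append(d)
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i] = clusters[i] + merged
    return dists

def choose_k(dists):
    """Number of clusters present just before the largest jump in merge distance."""
    n = len(dists) + 1
    jumps = [later - earlier for earlier, later in zip(dists, dists[1:])]
    m = jumps.index(max(jumps))  # jump between merge m and merge m + 1
    return n - (m + 1)           # clusters remaining after merge m

def k_means(points, k, iterations=10):
    """Plain k-means; initial centers chosen by a farthest-point rule (assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iterations):
        # Assign each point to its nearest center, then recompute the means
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers

# Invented sample of already standardized two-variable cases
sample = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
          (0.1, 5.0), (0.0, 5.2), (0.2, 4.9)]

k = choose_k(merge_distances(sample))
labels, centers = k_means(sample, k)
print(k, labels)
```

The hierarchical step compares every pair of clusters at every merge, which is why it is better run on a sample when the file is large; the k-means step is cheap enough for the complete file.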

> 2) what good background sources are recommended
> for cluster analysis, especially in the context
> of market segmentation work?

More than a source, a suggestion: if the variables are nominal, or they are ordinal and you have qualms about treating them as interval, try CHAID for segmentation.

> Thanks in advance

You're welcome.

Hector Maletta
Universidad del Salvador
Buenos Aires, Argentina

