|Date: ||Tue, 21 Aug 2007 18:23:29 -0700|
|Reply-To: ||David L Cassell <davidlcassell@MSN.COM>|
|Sender: ||"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>|
|From: ||David L Cassell <davidlcassell@MSN.COM>|
|Subject: ||Re: Standardization of variables in Cluster analysis|
>For doing cluster analysis, I am considering standardizing variables to
>control for differences of scale among variables. I am going to use a
>dissimilarity measure like Mahalanobis distance.
>If I use the M-distance, then by its definition I already weight the
>Euclidean distance by the standard deviation. Thus, I think I don't need
>to standardize variables when I use Mahalanobis distance in cluster
>analysis.
>What do you think? Please advise me.
>Thank you in advance.
I think that you need to think more about your problem.
Choosing whether to standardize your data or not is a fundamental issue
you have to address before starting with clustering (or factors or PCA
or whatever). Is it rational given the real-world background behind your
data? If so, why? If not, why not? If you cannot justify it in a couple of
sentences, are you sure that you are doing the right thing, and that people
will not jump all over your case when you do it and they do *not* like your
choice?
Why are you going to use something like Mahalanobis distance? It's
often a good choice, but why is it right for *your* data, when there are
many other choices? What precisely are your data like?
Even if you're good so far, what clustering methods are right for you?
How are you going to decide on your choice of results?
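On the narrow arithmetic point in the question, here is a rough sketch (plain
Python, not SAS) that checks whether z-scoring each variable changes the
distances. The data values and all function names are made up for
illustration, and the covariance is the total-sample covariance of the whole
data set, which is an assumption: it is not a within-cluster covariance, and
which one you want is part of the questions above.

```python
import math

# Made-up illustrative data: (height in cm, income in dollars).
points = [
    (150.0, 42000.0),
    (160.0, 58000.0),
    (170.0, 51000.0),
    (180.0, 75000.0),
    (165.0, 39000.0),
]

def column_means(pts):
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def covariance(pts):
    """Sample covariance terms (sxx, sxy, syy) with an n - 1 denominator."""
    n = len(pts)
    mx, my = column_means(pts)
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    return sxx, sxy, syy

def standardize(pts):
    """Z-score each variable separately (mean 0, sample sd 1)."""
    mx, my = column_means(pts)
    sxx, _, syy = covariance(pts)
    sx, sy = math.sqrt(sxx), math.sqrt(syy)
    return [((p[0] - mx) / sx, (p[1] - my) / sy) for p in pts]

def euclidean(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def mahalanobis(a, b, cov):
    """sqrt(d' S^-1 d) for 2-D points, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = a[0] - b[0], a[1] - b[1]
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

z = standardize(points)

# Distance between the first two observations, before and after standardizing.
maha_raw = mahalanobis(points[0], points[1], covariance(points))
maha_std = mahalanobis(z[0], z[1], covariance(z))
eucl_raw = euclidean(points[0], points[1])
eucl_std = euclidean(z[0], z[1])

print("Mahalanobis raw vs standardized:", maha_raw, maha_std)
print("Euclidean   raw vs standardized:", eucl_raw, eucl_std)
```

With the covariance re-estimated after rescaling, the two Mahalanobis values
should agree up to rounding, while the Euclidean values will not. That only
settles the arithmetic. It says nothing about whether this covariance, or
Mahalanobis distance at all, is the right choice for your data.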
David L. Cassell
3115 NW Norwood Pl.
Corvallis OR 97330