Date: Tue, 20 Sep 2005 12:28:51 -0700
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: Cluster trim parameter
In-Reply-To: <200509201832.j8KI4ZLn029564@malibu.cc.uga.edu>
Content-Type: text/plain; format=flowed
miguel_hoz@YAHOO.ES wrote:
>I realize that I have outliers problem in a cluster analysis so I am trying
>the trim=5 option to cut the 5% of the sample, but I get some errors... Do
>you have the solution to this...
>
>proc cluster data=cluster outtree=tree method=ward print=15 ccc pseudo
>standard trim=5;
>
>var var1 var2 var3 var4 var5;
>
>id area_name;
>
>run;
I'm not convinced that you have outlier problems. Well, not from where I
sit, way over here. :-)
Sure, you may have highly non-spherical data. But that doesn't mean your
points are outliers. They may simply belong to a different, very small
cluster (perhaps of size 1, even). You wouldn't want to lose that
information, since it tells you something important about that area. Or
you may have a clustering problem that is not suited to the aggregation
method you chose. Ward's method is *very* sensitive to:
[1] outliers
[2] small clusters
[3] non-normality.
And you seem to have all three conditions. So the problem may be that
you need to use a different METHOD= option.
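As a rough sketch of what I mean, here is your own step with only the
METHOD= option changed and TRIM= dropped. METHOD=AVERAGE is just one
candidate I picked for illustration, not a recommendation for your data;
the data set and variable names are copied from your post.

```sas
/* Sketch only: same data set, variables, and ID as in the quoted step. */
/* METHOD=AVERAGE is an assumed alternative to WARD, chosen to illustrate */
/* swapping the aggregation method; other METHOD= values may fit better.  */
proc cluster data=cluster outtree=tree method=average print=15 ccc pseudo
             standard;
   var var1 var2 var3 var4 var5;
   id area_name;
run;
```

Comparing the resulting tree with the one from Ward's method should show
whether the suspect points form their own small clusters. If you still
want to trim, my recollection of the PROC CLUSTER documentation is that
TRIM= also needs a K= or R= option so that densities can be estimated,
which may be where your errors come from. I can't tell without seeing
the log.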
HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330