Hi.
There has been all sorts of advice on the wisdom of keeping or removing
outliers, and some of it has touched on what actually constitutes an
outlier. The definition is fuzzy, but whatever one ends up doing with
them, exploratory data analysis should include outlier detection. Paige
Miller and Ron Fehd in particular mentioned SAS procedures to use for
this, and I'd like to add a bit. I am of the school that believes
outlier detection is best practiced as a "holistic" discipline, based on
all the variables in the analysis taken together, rather than on
each variable looked at separately. The ROBUSTREG and PLS PROCs
mentioned by Paige allow one to do this (I didn't check the program link
Ron provided). PROC IML also has two subroutines that take a holistic
approach to outlier detection: MCD (minimum covariance determinant) and
MVE (minimum volume ellipsoid). It's cool stuff, but has a bit of a
steep learning curve.
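
To make the "holistic" point concrete, here is a minimal sketch, in
Python rather than SAS and on made-up data, of how a point can look
ordinary on every variable taken alone and still stand out once the
variables are considered together. It uses the classical (non-robust)
Mahalanobis distance as a stand-in; MCD and MVE replace the ordinary
mean and covariance with robust estimates, which this sketch does not
attempt. The data, the function names, and the cutoffs (|z| > 2 and the
97.5% chi-square quantile) are all assumptions chosen for illustration.

```python
import math

# Hypothetical toy data: nine points close to the line y = x, plus one
# final point that breaks the correlation pattern.
data = [
    (-4.0, -4.2), (-3.0, -2.8), (-2.0, -2.2), (-1.0, -0.8), (0.0, 0.2),
    (1.0, 0.8), (2.0, 2.2), (3.0, 2.8), (4.0, 4.2),
    (2.0, -2.0),
]


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance (n - 1 denominator); cov(a, a) is the variance."""
    ma, mb = mean(a), mean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def univariate_flags(points, z_cutoff=2.0):
    """Indices whose z-score exceeds the cutoff on any single variable."""
    flagged = set()
    for column in zip(*points):
        m = mean(column)
        sd = math.sqrt(cov(column, column))
        for i, value in enumerate(column):
            if abs(value - m) / sd > z_cutoff:
                flagged.add(i)
    return sorted(flagged)


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the centroid."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxx, syy, sxy = cov(xs, xs), cov(ys, ys), cov(xs, ys)
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 covariance matrix
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d' * inverse(S) * d, written out for the 2x2 case
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def multivariate_flags(points, prob=0.975):
    """Indices whose squared distance exceeds a chi-square(2 df) quantile."""
    # With 2 degrees of freedom the chi-square quantile is -2 * ln(1 - prob).
    cutoff = -2.0 * math.log(1.0 - prob)
    return [i for i, d in enumerate(mahalanobis_sq(points)) if d > cutoff]


print("flagged one variable at a time:", univariate_flags(data))
print("flagged using both variables:  ", multivariate_flags(data))
```

On this toy data the per-variable screen flags nothing, while the
distance-based screen flags only the last point, (2, -2): its x and y
values are each unremarkable, but the combination runs against the
correlation in the rest of the data.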
-- TMK --
"The Macro Klutz"
On Wed, 14 May 2008 08:29:43 -0700, Eversmann <rifazrazeek@GMAIL.COM>
wrote:
>hi all,
>
>this is more of a stats question (than SAS only..)..
>
>i was wondering what's the best way to remove outliers (extreme values
>in your data)...
>
>at the moment i am using percentiles (p25, p75 etc.. in proc
>summary)... problem is what happens when you have only 2 or 3
>values... ? i am reading some sas material as to how this works..
>
>but in general .. what are the best ways of removing outliers... may
>be you higher level decision makers can post a few comments...
>
>many thanks