| Date: | Wed, 18 Jun 2003 09:30:13 -0400 |
| Reply-To: | Peter Flom <flom@NDRI.ORG> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Peter Flom <flom@NDRI.ORG> |
| Subject: | Re: Outliers |
Again, it depends on what you are trying to do. WHY are you running
PROC UNIVARIATE? What will you DO with the output? What will you DO
with any outliers?
What does SAS mean by POSSIBLE outliers? Well, if you MUST use a strict
cutoff, the default in SAS isn't bad. But then what? Let's say you
find a lot of outliers. One source of outliers is data entry errors,
coding problems, etc., so you check the data, and find that the values
are legitimate. What will you do then? Why are there outliers? Is your
data a sample from some population? If so, maybe the variable in
question is heavy tailed in the population (e.g., income), in which case
MAYBE you don't want the mean, but the median? But then, maybe you
STILL want the mean. It depends on what you are trying to do. Maybe the
SAS default is NOT what you want. Maybe you want to delete the
outliers, maybe not. Maybe you want to weight them somehow, or use
trimming, or winsorizing, or something else.
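To make the choices above concrete, here is a minimal sketch in Python (not SAS) that compares the mean, the median, a trimmed mean, and a winsorized mean on one heavy-tailed sample. The income figures are made up for illustration, and the helper names `trimmed_mean` and `winsorized_mean` are hypothetical, not anything SAS or PROC UNIVARIATE provides. Which of the four numbers is "right" still depends on what you are trying to do.

```python
from statistics import mean, median


def trimmed_mean(values, proportion):
    """Mean after dropping the lowest and highest `proportion` of the values.

    Assumes 0 <= proportion < 0.5 so that some values remain.
    """
    data = sorted(values)
    k = int(len(data) * proportion)  # number of values dropped from each tail
    return mean(data[k:len(data) - k])


def winsorized_mean(values, proportion):
    """Mean after pulling the k most extreme values in each tail in to the
    nearest value that is kept (instead of dropping them)."""
    data = sorted(values)
    k = int(len(data) * proportion)
    if k == 0:
        return mean(data)
    low, high = data[k], data[-k - 1]
    return mean(min(max(x, low), high) for x in data)


# Made-up incomes (in thousands) with one very large, legitimate value.
incomes = [22, 25, 27, 30, 31, 33, 35, 38, 40, 900]

print(mean(incomes))                   # 118.1
print(median(incomes))                 # 32.0
print(trimmed_mean(incomes, 0.10))     # 32.375
print(winsorized_mean(incomes, 0.10))  # 32.4
```

The single large value moves the mean far from the other three summaries, yet if the question is about total income, the mean may still be the number you want.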
You must provide context.
As someone said (I forget who), "There are no routine statistical
questions, only questionable statistical routines."
Peter
Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)