**Date:** Thu, 23 Oct 2003 16:50:22 -0700
**Reply-To:** cassell.david@EPAMAIL.EPA.GOV
**Sender:** "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
**From:** "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
**Subject:** Re: chi square analysis to identify the outliers
Karriere Sucher <sassysaser@YAHOO.COM> kindly replied:
> No, this is not a homework. I am not a statistician and I briefly remember
> that there is a so called chi-square test for outliers. I may be wrong and
> that is why I am asking. But this is not a homework problem.

That's quite reassuring. I was somewhat concerned.

There isn't a simple chi-squared test for univariate outliers. And
if there were, it would probably be assuming normally-distributed data,
which is always a problem with really small data sets.

I don't think that you can reasonably rule out your high value when you
can't really assume normality in your data. (You have already thought
about the fact that your data are restricted between 0 and 100, and a
true normal distribution wouldn't have that restriction.) With a standard
deviation of 30.4, all your points are within two standard deviations
of your sample mean.

I recommend that you try plotting the data. Try something simple,
like:

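/* PLOT requests stem-and-leaf, box, and normal probability plots; */
/* NORMAL requests tests for normality. */
proc univariate data=yourdatasetname plot normal;
   var yourvariable;
run;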

You'll see that with this small a data set and this much variability,
there are no serious outliers showing up on the boxplot. The q-q plot
doesn't look that bad for 10 points. None of the normality tests will
reject the assumption of normality with so few points arranged in a
nice mound shape.

I would certainly say that 8 out of 100 is a bad grade. But it doesn't
look like an outlier given the rest of the data. Sorry.

>> [1] You computed the CI incorrectly. 30.4 is *NOT* the standard error
>> of the mean that you need to use in your CI. The correct CI does not
>> get anywhere near 0 or 100.
>
> Can you enlighten on what the correct standard error is in this case and
> how to calculate it? Thanks a lot!

Okay, since this isn't a homework problem, I will. You forgot to divide
your standard deviation by the square root of n to get the standard
error of the mean. With your 10 points, the square root of n is about
3.16, so the real CI will be less than one-third the width of the one
you came up with.
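
As a rough sketch of that calculation, assuming the same placeholder names as in the PROC UNIVARIATE example above, n = 10, and a two-sided 95% interval based on the t distribution (the confidence level is an assumption, not something stated in this thread), you could let PROC MEANS report the standard error and the confidence limits, and check the arithmetic by hand in a DATA step:

```sas
/* Let SAS compute the standard error of the mean and the CI.      */
/* STDERR = std / sqrt(n); CLM = two-sided confidence limits for   */
/* the mean. ALPHA=0.05 (a 95% interval) is an assumed level.      */
proc means data=yourdatasetname n mean std stderr clm alpha=0.05;
   var yourvariable;
run;

/* Hand check using the numbers quoted in this thread:             */
/* standard deviation 30.4 and n = 10.                             */
data _null_;
   n         = 10;
   sd        = 30.4;
   se        = sd / sqrt(n);         /* standard error of the mean */
   tcrit     = tinv(0.975, n - 1);   /* t quantile, 9 df           */
   halfwidth = tcrit * se;           /* CI is mean +/- halfwidth   */
   put se= tcrit= halfwidth=;
run;
```

The DATA step writes its results to the log. The standard error comes out to about 9.6 rather than 30.4, and the interval is the sample mean plus or minus the half-width.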

HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician