Date: Thu, 23 Oct 2003 16:50:22 -0700
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: chi square analysis to identify the outliers
Content-type: text/plain; charset=US-ASCII
Karriere Sucher <sassysaser@YAHOO.COM> kindly replied:
> No, this is not a homework. I am not a statistician and I briefly
> that there is a so called chi-square test for outliers. I may be wrong
> that is why I am asking. But this is not a homework problem.
That's quite reassuring. I was somewhat concerned.
There isn't a simple chi-squared test for univariate outliers. And
if there were, it would probably be assuming normally-distributed data,
which is always a problem with really small data sets.
I don't think that you can reasonably rule out your high value when you
can't really assume normality in your data. (You have already thought
about the fact that your data are restricted between 0 and 100, and a
true normal distribution wouldn't have that restriction.) With a standard
deviation of 30.4, all your points are within two standard deviations
of your sample mean.
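If you want to check the two-standard-deviation statement yourself, one
option is to convert the values to z-scores. This is only a sketch: it
assumes the values sit in a numeric variable named score, which is a
made-up name, so substitute your own. The output data set name zscores is
also just a placeholder.

```sas
/* Assumption: SCORE is a hypothetical variable name for the grades. */
/* Rescale it to mean 0 and standard deviation 1 (z-scores).         */
proc standard data=yourdatasetname mean=0 std=1 out=zscores;
   var score;
run;

/* List any observation more than two standard deviations from the mean. */
proc print data=zscores;
   where abs(score) > 2;
run;
```

If every point is within two standard deviations of the mean, the WHERE
clause should select no observations.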
I recommend that you try plotting the data. Try something simple, like:
    proc univariate data=yourdatasetname plot normal;
    run;
You'll see that with this small a data set and this much variability,
there are no serious outliers showing up on the boxplot. The q-q plot
doesn't look that bad for 10 points. None of the normality tests will
reject the assumption of normality with so few points arranged in a
nice mound shape.
I would certainly say that 8 out of 100 is a bad grade. But it doesn't
look like an outlier given the rest of the data. Sorry.
>>  You computed the CI incorrectly. 30.4 is *NOT* the standard error
>> of the mean that you need to use in your CI. The correct CI does not
>> get anywhere near 0 or 100.
> Can you enlighten on what the correct standard error is in this case
> and how to calculate it? Thanks a lot!
Okay, since this isn't a homework problem, I will. You forgot to divide
your standard deviation by the square root of n to get the standard
error of the mean. With n = 10, that divisor is about 3.16, so the real
CI will be less than one-third the width of the one you came up with.
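To make the arithmetic concrete, here is a rough sketch. It uses the
standard deviation of 30.4 and the 10 points mentioned above, and it
assumes a 95% t-based interval, which may not be the multiplier you used.
The variable name score in the PROC MEANS step is hypothetical. The
standard error works out to 30.4 / sqrt(10), or about 9.6.

```sas
/* Standard error of the mean and 95% CI half-width, by hand. */
data _null_;
   s  = 30.4;                 /* sample standard deviation from the thread  */
   n  = 10;                   /* number of data points from the thread      */
   se = s / sqrt(n);          /* standard error of the mean                 */
   t  = tinv(0.975, n - 1);   /* assumed: two-sided 95% t quantile, 9 df    */
   half_width = t * se;       /* the CI is the sample mean +/- half_width   */
   put se= t= half_width=;    /* results are written to the SAS log         */
run;

/* Or let SAS do it. SCORE is a hypothetical variable name. */
proc means data=yourdatasetname n mean std stderr clm alpha=0.05;
   var score;
run;
```

The CLM keyword asks PROC MEANS for the two-sided confidence limits for
the mean, so the second step should agree with the hand calculation.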
David Cassell, CSC
Senior computing specialist