Date: Thu, 23 Oct 2003 13:57:41 -0700
Reply-To: Karriere Sucher <sassysaser@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Karriere Sucher <sassysaser@YAHOO.COM>
Subject: Re: chi square analysis to identify the outliers
In-Reply-To: <OFF515F526.9C4C65D1-ON88256DC7.007D173E@epamail.epa.gov>
Content-Type: text/plain; charset=us-ascii
>> Perhaps you could clarify why you need a chi-square test, and
>> what KIND of chi-square test you are talking about? This isn't
>> a homework problem, is it?
No, this is not homework. I am not a statistician, and I vaguely remember that there is a so-called chi-square test for outliers. I may be wrong, which is why I am asking.
>> [1] You computed the CI incorrectly. 30.4 is *NOT* the standard error
>> of the mean that you need to use in your CI. The correct CI does not
>> get anywhere near 0 or 100.
Can you enlighten me on what the correct standard error is in this case and how to calculate it? Thanks a lot!
"David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV> wrote:
Karriere Sucher wrote:
> I have a sample of math exam scores for 10 students. They are 8, 25, 35, 41, 50,
> 75, 75, 79, 92, 99. How to use SAS to conduct a chi square analysis to find out
> whether the student with the score 99 and the student with the score 8 are two
> outliers.
A chi-square to look for outliers in small samples? That doesn't
seem to be reasonable. Grubbs' test (which uses an F distribution
under the hood) is a good example of why you shouldn't be using
a parametric test on teeny samples: when n < 6, it is notorious
for leading people to reject *most* of the data, whether there are
real outliers or not.
Perhaps you could clarify why you need a chi-square test, and
what KIND of chi-square test you are talking about? This isn't
a homework problem, is it?
What underlying assumptions are you willing to make about the
distribution of grades? If you assume the grades should be
distributed normally, then you have some parametric tests available
to you. There are also some non-parametric tests available.
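Under the normality assumption, one parametric option is the Grubbs statistic mentioned above. A minimal sketch in Python rather than SAS, with illustrative variable names, shows only how the statistic itself is formed: the distance of the most extreme score from the mean, in units of the sample standard deviation. Whether either value counts as an outlier still depends on comparing it against a tabulated Grubbs critical value for n = 10, which this sketch does not supply.

```python
import statistics

# Scores from the original question.
scores = [8, 25, 35, 41, 50, 75, 75, 79, 92, 99]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample SD (n - 1 denominator)

# Grubbs-type statistics for the smallest and largest observations.
g_low = (mean - min(scores)) / sd
g_high = (max(scores) - mean) / sd

print(f"G for lowest score  ({min(scores)}): {g_low:.2f}")
print(f"G for highest score ({max(scores)}): {g_high:.2f}")
```

Neither score is even two standard deviations from the mean.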
> My second question is I calculated the mean to be 57.9 and the standard
> deviation is 30.4. Then 95% CI will roughly be the range of -3 ~ 119 (mean plus /
> minus two standard deviations). But in reality the score can only be in the range
> of 0 ~ 100. How to interpret the negative score and the score that is greater than 100.
[1] You computed the CI incorrectly. 30.4 is *NOT* the standard error
of the mean that you need to use in your CI. The correct CI does not
get anywhere near 0 or 100.
[2] If you do have a confidence interval that exceeds the reasonable
interval, then you view the part above 100 (or the part below 0) as
extraneous.
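A minimal sketch of the calculation behind [1], assuming the usual t-based interval for a mean and written in Python rather than SAS. The variable names are illustrative, and the t quantile is hardcoded from a standard t table (0.975 quantile, 9 degrees of freedom), which is an assumption of this sketch. The key step is that the standard error of the mean is the standard deviation divided by the square root of n, not the standard deviation itself.

```python
import math
import statistics

# Scores from the original question.
scores = [8, 25, 35, 41, 50, 75, 75, 79, 92, 99]

n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)   # sample SD, about 30.4
se = sd / math.sqrt(n)          # standard error of the mean

# Assumption: two-sided 95% interval, t quantile for n - 1 = 9 df
# taken from a standard t table rather than computed here.
T_CRIT = 2.262

lower = mean - T_CRIT * se
upper = mean + T_CRIT * se

print(f"mean = {mean:.1f}, sd = {sd:.1f}, se = {se:.2f}")
print(f"95% CI for the mean: {lower:.1f} to {upper:.1f}")
```

Under these assumptions the interval runs from roughly 36 to 80, well inside 0 to 100. The range of mean plus or minus two standard deviations describes the spread of individual scores, not the uncertainty in the mean.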
HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician