LISTSERV at the University of Georgia
Date:         Fri, 11 Feb 2005 14:14:26 -0800
Reply-To:     Markus Kemmelmeier <>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Markus Kemmelmeier <>
Subject:      Re: Statistics Question
Comments: To: David Hitchin <>
Content-Type: text/plain; charset="iso-8859-1"

To follow up on David Hitchin's comparison between ANOVA performed on continuous data and on dichotomous data: consistent with his own results, there is a paper in the literature suggesting that the two converge rather nicely "where cell frequencies are equal under the following conditions: (a) the proportion of responses in the smaller response category is equal to or greater than .2 and there are at least 20 degrees of freedom for error, or (b) the proportion of responses in the smaller response category is less than .2 and there are at least 40 degrees of freedom for error" (from Lunney, G. H. (1970). Using analysis of variance with a dichotomous dependent variable: An empirical study. Journal of Educational Measurement, 7, 263-269.)
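[Illustrative aside, not part of the original post: Lunney's condition (a) can be spot-checked by simulation. The group sizes and category proportion below are arbitrary choices that satisfy the condition (proportion .3 in the smaller category, three equal groups of 15, so 42 error degrees of freedom); under a true null, the ANOVA F test on the 0/1 outcome should then reject at roughly the nominal 5% rate.]

```python
import numpy as np
from scipy import stats

# Monte Carlo check of Lunney's condition (a): 0/1 outcome, smaller
# category proportion >= .2, at least 20 error df. All values here
# are illustrative, not taken from the discussion.
rng = np.random.default_rng(0)
n_per_group, n_groups, prop = 15, 3, 0.3   # 3 x 15 -> 42 error df
n_sims = 2000
rejections = 0
for _ in range(n_sims):
    groups = [rng.binomial(1, prop, n_per_group) for _ in range(n_groups)]
    rejections += stats.f_oneway(*groups).pvalue < 0.05
rate = rejections / n_sims
print(rate)  # should sit near the nominal 0.05 if Lunney's claim holds
```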


From: SPSSX(r) Discussion on behalf of David Hitchin
Sent: Fri 2/11/2005 4:31 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Statistics Question

Quoting Marta García-Granero <>:

> MK> Yours is a good example of the fact that ANOVA is not nearly
> MK> as robust against violations of normality as is often believed,
> MK> e.g., in my own field of social psychology. ANOVA is fairly
> MK> robust against violations of kurtosis, but is much more sensitive
> MK> toward violations of symmetry.
>
> I have read just the opposite. ANOVA is considered to be quite
> robust against violations of symmetry

I had a colleague who produced results from a rather large repeated measures design in which the dependent variable took only the values zero and one. He had analysed it using conventional ANOVA, but was doubtful about the calculated p-value.

I set this up for a randomisation test, and gave it more than an hour's worth of CPU time on a big machine. The p-value came out identical to the third decimal place.
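[Illustrative aside, not part of the original post: a randomisation test of the kind described permutes the group labels many times and compares the observed F statistic to its permutation distribution. The original design was repeated-measures; this sketch uses the much simpler between-subjects case, with made-up simulated 0/1 data, just to show the idea that the randomisation p-value and the parametric ANOVA p-value often land close together.]

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated two-group 0/1 outcome (hypothetical data, illustrative only)
y = np.concatenate([rng.binomial(1, 0.3, 20), rng.binomial(1, 0.6, 20)])
labels = np.repeat([0, 1], 20)

def f_stat(y, labels):
    """One-way ANOVA F statistic for the given group labels."""
    groups = [y[labels == g] for g in np.unique(labels)]
    return stats.f_oneway(*groups).statistic

observed = f_stat(y, labels)

# Randomisation test: re-label the observations at random many times
n_perm = 5000
perm_f = np.array([f_stat(y, rng.permutation(labels)) for _ in range(n_perm)])
p_randomisation = float(np.mean(perm_f >= observed))

# Parametric ANOVA p-value for comparison
p_anova = stats.f_oneway(y[labels == 0], y[labels == 1]).pvalue
print(p_randomisation, p_anova)
```

In this toy setup the two p-values typically agree closely, echoing the experience reported above, though of course a single example proves nothing in general.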

Now of course, while a zero-one distribution cannot produce normally distributed residuals, those residuals cannot be extremely asymmetrical either, and there can't be large outliers; and there is no guarantee that other, similar experiments would produce ANOVA p-values anywhere near as close to randomisation p-values.

I always begin by looking at the data, of course, and then a quick ANOVA may be sufficient - if the p-value is 0.0001 or 0.888 then there is little doubt about whether the results are significant at the 5% level. If they are hovering in the 2%-10% range, then it's worth thinking much more carefully about the analysis.

As Marta wrote, it's important to look for normality WITHIN each of the subgroups - you don't need normality in the sample as a whole.

In my view the Kolmogorov-Smirnov and Shapiro-Wilk tests, and Levene's test for homogeneity of variance, don't tell you much that you can't see far more clearly by plotting the data, where box-plots give you nearly all that you need to know. The p-values from the tests are as much or more related to sample size as to how non-normal the residuals are. You can get highly significant results from large samples in which the non-normality is so slight as to be no problem for conventional ANOVA tests.
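[Illustrative aside, not part of the original post: the sample-size point is easy to demonstrate. A t-distribution with 10 degrees of freedom is only mildly non-normal (slightly heavy tails, barely visible in a box-plot), yet Shapiro-Wilk flags it reliably once the sample is large; the distribution and sample sizes below are arbitrary illustrative choices.]

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# The same mild non-normality (t with 10 df) at two sample sizes:
# the test p-value tracks n, not the practical severity of the problem.
small = stats.t.rvs(df=10, size=40, random_state=rng)
large = stats.t.rvs(df=10, size=4000, random_state=rng)

p_small = stats.shapiro(small).pvalue
p_large = stats.shapiro(large).pvalue
print(p_small, p_large)  # the large sample is typically "highly non-normal"
```

The large sample is near-certain to be rejected even though the departure from normality is far too slight to trouble an ANOVA, which is exactly why a plot is more informative than the test.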

David Hitchin
