Date: Wed, 16 Feb 2005 23:57:50 -0600
Reply-To: Jeffrey Berman <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Jeffrey Berman <firstname.lastname@example.org>
Subject: Re: nested anovas and reading expected mean squares
Content-type: text/plain; charset="US-ASCII"
On 2/16/05 12:20 PM, LUCINDA M TEAR <email@example.com> wrote:
> Hello - I submitted this last week but no one responded. Thought I'd try
> again. Thank you for any help!
>> I am running a nested ANOVA with the following design:
>> Factor A - fixed - 5 levels
>> Factor B - Fixed - 3 levels
>> Factor C(B) - Random, nested within Factor B - 3 levels per level of
>> factor B
>> AB interaction
>> AC(B) interaction
>> Error (unequal sample sizes in each cell).
>> The way I calculate the EMS,
>> A would be a pseudo F test and I should test
>> B against C(B)
>> AC(B) against Error
>> AB against AC(B) and
>> C(B) against Error
>> If I run the model within Univariate GLM as
>> /DESIGN = A B C(B) A*B A*C(B)
>> /DESIGN = A B C*B A*B A*C*B
>> From the error terms specified at the end of the output, I think that SPSS
>> tested C(B) against AC(B) and A against AC(B):
>> Error for Intercept: .984 MS(C(B)) + .016 MS(Error)
>> Error for A: .973 MS(A * C(B)) + .027 MS(Error)
>> Error for B: .990 MS(C(B)) + .010 MS(Error)
>> Error for C(B): .977 MS(A * C(B)) + .023 MS(Error)
>> Error for AB: .981 MS(A * C(B)) + .019 MS(Error)
>> Error for AC(B): MS(Error)
>> I'm not sure whether I'm misreading the EMS table or whether part of the
>> problem with my EMS is that I didn't completely account for the different
>> sample sizes within each cell (I just called them all "n" while calculating
>> my EMS).
>> Can anyone give me any leads here? Are my EMS calculations wrong or am I
>> not understanding the output? I'm happy to provide my EMS calculations or
>> the entire data set.
I suspect you're running into the same issue I encountered when developing
examples with SPSS for my graduate statistics course. SPSS (and SAS) use a
different model for deriving expected mean squares than that found in many
analysis of variance textbooks. SPSS uses an "unconstrained parameters"
model, whereas the derivation in the textbooks is a "constrained parameters"
model. Differences between the models arise in situations in which some
factors are random and others are fixed.
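To make the difference concrete, here is a sketch of the expected mean squares usually given for the simplest mixed case. It assumes a balanced two-way design with A fixed (a levels), B random (b levels), and n observations per cell; the symbols are illustrative notation rather than anything SPSS prints, and the interaction variance component is parameterized differently in the two models, so it does not mean quite the same thing in each.

```latex
% Assumed setup: balanced two-way mixed model, A fixed, B random, n per cell.
% \theta_A = \sum_i \alpha_i^2 / (a - 1) is the fixed-effect term for A.
% Requires amsmath.
\begin{align*}
\intertext{Unconstrained parameters (the form GLM uses):}
E(MS_A)    &= \sigma^2 + n\,\sigma_{AB}^2 + nb\,\theta_A \\
E(MS_B)    &= \sigma^2 + n\,\sigma_{AB}^2 + na\,\sigma_B^2 \\
E(MS_{AB}) &= \sigma^2 + n\,\sigma_{AB}^2 \\
E(MS_E)    &= \sigma^2
\intertext{Constrained parameters (the form in many textbooks):}
E(MS_A)    &= \sigma^2 + n\,\sigma_{AB}^2 + nb\,\theta_A \\
E(MS_B)    &= \sigma^2 + na\,\sigma_B^2 \\
E(MS_{AB}) &= \sigma^2 + n\,\sigma_{AB}^2 \\
E(MS_E)    &= \sigma^2
\end{align*}
```

The only line that changes is E(MS_B): without the interaction component, the random factor is tested against error; with it, the random factor is tested against the interaction.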
There is a spirited debate among statisticians about which model is more
appropriate or general. For competing perspectives, see:
McLean, R. A., Sanders, W. L., & Stroup, W. W. (1991). A unified approach
to mixed linear models. American Statistician, 45, 54-64.
Voss, D. T. (1999). Resolving the mixed models controversy. American
Statistician, 53, 352-356.
Wolfinger, R., & Stroup, W. (2000). [Letter to the editor]. American
Statistician, 54, 228-229.
David Nichols of SPSS posted about this issue some time ago and I have
attached his message below.
Personally, I tend to be persuaded by Voss's arguments in favor of the
constrained parameters model, at least in cases in which there are no empty
cells in the design. However, this is not the model used when relying on
the RANDOM subcommand of GLM (or UNIANOVA).
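If you prefer the constrained-model tests, one option in a balanced design is to recompute the F ratios by hand from the mean squares in the output. The Python sketch below does this for the simplest case only, a balanced two-way design with A fixed and B random. The function name and the mean squares are hypothetical, and it does not cover unbalanced or nested designs such as yours.

```python
# Which mean square serves as the denominator for each effect in a
# balanced two-way mixed design (A fixed, B random) under each model.
DENOMINATORS = {
    "constrained": {"A": "AB", "B": "Error", "AB": "Error"},
    "unconstrained": {"A": "AB", "B": "AB", "AB": "Error"},
}


def f_ratios(mean_squares, model):
    """Return the F ratio for A, B, and AB under the chosen model.

    mean_squares: dict with keys "A", "B", "AB", "Error".
    model: "constrained" or "unconstrained".
    """
    table = DENOMINATORS[model]
    return {effect: mean_squares[effect] / mean_squares[den]
            for effect, den in table.items()}


# Hypothetical mean squares, for illustration only.
ms = {"A": 40.0, "B": 30.0, "AB": 10.0, "Error": 5.0}
constrained = f_ratios(ms, "constrained")
unconstrained = f_ratios(ms, "unconstrained")
print(constrained)
print(unconstrained)
```

Only the test of B differs between the two sets of ratios; the tests of A and of the interaction are the same under both models.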
University of Memphis
----- Forwarded message follows -----
From: firstname.lastname@example.org (David Nichols)
Subject: Expected mean squares and error terms in GLM
organization: SPSS, Inc.
I've had a few questions from users about expected mean squares and
error terms in GLM. In particular, with a two way design with A fixed
and B random, many people are expecting to see the A term tested
against A*B and B tested against the within cells term. In the model
used by GLM, the interaction term is automatically assumed to be
random, expected mean squares are calculated using Hartley's method
of synthesis, and the results are not as many people are used to
seeing. In this case, both A and B are tested against A*B. Here's
some information that people may find useful.
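With unbalanced data, the synthesized error terms come out as linear combinations of mean squares, for example ".973 MS(A * C(B)) + .027 MS(Error)" in the output quoted earlier in this thread. The Python sketch below shows how such a denominator and its approximate degrees of freedom could be computed. It assumes Satterthwaite's approximation for the degrees of freedom; the function name, the mean squares, and the error df are hypothetical, not values from the actual analysis.

```python
def synthesized_error(terms):
    """Combine mean squares into one error term.

    terms: list of (coefficient, mean_square, df) tuples.
    Returns (error_ms, approx_df), where approx_df is Satterthwaite's
    approximation: (sum c*MS)^2 / sum((c*MS)^2 / df).
    """
    error_ms = sum(c * ms for c, ms, _ in terms)
    denom = sum((c * ms) ** 2 / df for c, ms, df in terms)
    return error_ms, error_ms ** 2 / denom


# Coefficients are taken from the quoted "Error for A" line.
# Mean squares and the error df are made up for illustration.
# df for A*C(B) in the quoted design, assuming no empty cells:
# (5 - 1) * 3 * (3 - 1) = 24.
ms_acb, df_acb = 12.0, 24
ms_error, df_error = 4.0, 200

err_ms, err_df = synthesized_error([
    (0.973, ms_acb, df_acb),
    (0.027, ms_error, df_error),
])
print(err_ms, err_df)
```

Because almost all of the weight falls on MS(A * C(B)), the combined term behaves very much like that mean square with close to its degrees of freedom, which is why the output reads as if A were tested against A*C(B).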
It would appear that there's something of a split among statisticians in
how to handle models with random effects. Quoting from page 12 of the
SYSTAT DESIGN module documentation (1987):
There are two sets of distributional assumptions used to analyze
a two factor mixed model, differing in the way interactions are handled.
The first, used by SAS (1985, p. 469-470), can be traced to Mood (1950).
Interaction terms are assumed to be a set of i.i.d. normal random variables.
The second, used by DESIGN, is due to Anderson and Bancroft (1952). They
impose the constraint that the interactions sum to zero over the levels of
fixed factor within each level of the random factor.
According to Miller (1986, p. 144): "The matter was more or less
resolved by Cornfield and Tukey (1956)." Cornfield and Tukey derive
expected mean squares under a finite population model and obtain results
in agreement with Anderson and Bancroft.
On the other side, Searle (1971) states: "The model that leads to
[Mood's results] is the one customarily used for unbalanced data."
Statisticians have divided themselves along the following lines:
Mood (1950, p. 344)             Anderson and Bancroft (1952)
Hartley and Searle (1969)       Cornfield and Tukey (1956)
Hocking (1985, p. 330)          Graybill (1961, p. 398)
Milliken and Johnson (1984)     Miller (1986, p. 144)
Searle (1971, sec. 9.7)         Scheffe (1959, p. 269)
SAS                             Snedecor and Cochran (1967, p. 367)
The references are:
Cornfield, J., & Tukey, J. W. (1956). Average values of mean squares in
factorials. Annals of Mathematical Statistics, 27, 907-949.
Graybill, F. A. (1961). An introduction to linear statistical models
(Vol. 1). New York: McGraw-Hill.
Hartley, H. O., & Searle, S. R. (1969). On interaction variance components
in mixed models. Biometrics, 25, 573-576.
Hocking, R. R. (1985). The analysis of linear models. Monterey, CA:
Miller, R. G., Jr. (1986). Beyond ANOVA, basics of applied statistics.
New York: Wiley.
Milliken, G. A., & Johnson, D. E. (1984). Analysis of messy data, Volume 1:
Designed experiments. New York: Van Nostrand Reinhold.
Mood, A. M. (1950). Introduction to the theory of statistics. New York:
Scheffe, H. (1959). The analysis of variance. New York: Wiley.
Searle, S. R. (1971). Linear models. New York: Wiley.
Snedecor, G. W., & Cochran, W. G. (1967). Statistical methods (6th ed.).
Ames, IA: Iowa State University Press.
SPSS can be added to the left hand column. We're assuming i.i.d.
normally distributed random variables for any interaction terms containing
David Nichols Senior Support Statistician SPSS,
Phone: (312) 329-3684 Internet: email@example.com Fax: (312)
----- End of forwarded message -----