Date: Thu, 4 Feb 2010 16:27:12 -0500
Sender: "SPSSX(r) Discussion"
From: "Veitch, Jennifer"
Subject: GLM vs MIXED with different group sizes?

I have a field study in which I have 2 naturally-occurring groups exposed to different office lighting. The group sizes are quite different (e.g., N=20 and N=34). I have three DVs that are conceptually related -- all are aspects of satisfaction with the office environment.

(Ignore for the moment the need to demonstrate that there aren't confounding differences between the groups, and distributional assumptions about the DVs. Those aren't the problem I am trying to solve today.)

I can analyze this data set in two ways. I'm familiar with one of them (GLM), but think I should shift to the other (MIXED) because of the large difference in group sizes. I am having difficulty figuring out which to use, and how to read the MIXED output because it is unfamiliar to me.

The familiar way is to use GLM. Our usual practice is to test for multivariate effects (over all the 3 DVs), then to interpret significant univariate effects only if the multivariate test is statistically significant.

In my present data set, this looks like

GLM SAT_L_T0 SAT_AP_T0 SAT_VT_T0 BY LUMINR_T0.

And the portion of the output that is important to me is:

Multivariate Tests(b)
  LUMINR_T0   Wilks' Lambda = .802, F(3, 50) = 4.126(a), p = .011

Tests of Between-Subjects Effects
  SAT_L_T0    F(1, 52) = 10.227, p = .002 (a)
  SAT_AP_T0   F(1, 52) =  5.526, p = .023 (b)
  SAT_VT_T0   F(1, 52) =  9.884, p = .003 (c)

  a. R Squared = .164 (Adjusted R Squared = .148)
  b. R Squared = .096 (Adjusted R Squared = .079)
  c. R Squared = .160 (Adjusted R Squared = .144)

However, I know that this isn't the most accurate way to do this, because of the very unequal cell sizes. I would like to figure out how to use MIXED.

I think I would need to do this as a two-stage procedure, and furthermore will need to do some data restructuring to make it work. Do I understand this correctly?

Stage 1: The overall multivariate test.

VARSTOCASES
  /MAKE SAT FROM SAT_L_T0 SAT_AP_T0 SAT_VT_T0
  /INDEX = DV
  /KEEP = ID LUMINR_T0 FURN_T0 BLDG_FLR_T0.
VARIABLE LABEL SAT "SATISFACTION COMPONENT SCORE".
VARIABLE LABEL DV "DEPENDENT VARIABLE #".
VALUE LABELS DV 1 "SAT_L" 2 "SAT_AP" 3 "SAT_VT".

SAVE OUTFILE = 'case.sav'.

GET FILE = 'case.sav'.
MIXED SAT BY LUMINR_T0 DV
  /FIXED = LUMINR_T0 DV | NOINT SSTYPE(3)
  /METHOD = REML
  /PRINT = CPS DESCRIPTIVES SOLUTION.

If I run this analysis, then the important part of the output, which tells me that there is a statistically significant multivariate effect (over the three DVs) of LUMINR_T0, is:

Type III Tests of Fixed Effects(a)
  LUMINR_T0   F(1, 158) = 25.204, p = .000
  DV          F(2, 158) = 30.442, p = .000

  a. Dependent Variable: SATISFACTION COMPONENT SCORE.

(The effect of DV also tells me that the three DVs are different, but that's not interesting to me.) However, this is not the same test as the one I would run using GLM. I'm not clear on how I could get the same test (or if I can). If I accept that this is at least a reason to carry on to individual tests, then I can continue to Stage 2.
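One untested possibility for bringing Stage 1 closer to the multivariate test is to tell MIXED that the three scores in the restructured file come from the same person, and to test the group effect on all three DVs jointly. The syntax below is only a sketch under those assumptions: it reuses the ID, DV, and LUMINR_T0 variables from the restructured file, adds a REPEATED subcommand with an unstructured covariance matrix, and replaces the LUMINR_T0 main effect with the DV*LUMINR_T0 term. It is not guaranteed to reproduce the Wilks' Lambda F from GLM.

```spss
* Sketch only: run on the restructured (long) file.
* REPEATED treats the three DV rows per ID as correlated.
* COVTYPE(UN) leaves their variances and covariances unstructured.
MIXED SAT BY LUMINR_T0 DV
  /FIXED = DV DV*LUMINR_T0 | NOINT SSTYPE(3)
  /METHOD = REML
  /REPEATED = DV | SUBJECT(ID) COVTYPE(UN)
  /PRINT = CPS SOLUTION.
```

In this layout the DV*LUMINR_T0 line of the Type III table is the one to read as the joint test of the lighting groups across the three satisfaction scores.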

Stage 2: Three separate MIXED runs, one for each of the three DVs. This goes back to the old data file, before it was restructured. For instance, for the variable SAT_L_T0:

MIXED SAT_L_T0 BY LUMINR_T0
  /FIXED = LUMINR_T0 | SSTYPE(3)
  /METHOD = REML
  /PRINT = CPS DESCRIPTIVES SOLUTION.

Type III Tests of Fixed Effects(a)
  LUMINR_T0   F(1, 52) = 10.227, p = .002

  a. Dependent Variable: Lit SatT0.

(and so on, for the other two DVs)

I note that the final univariate test result from MIXED is the same as the one I obtained using GLM. GLM also gives me an effect size estimate (without need for secondary calculations). Using MIXED is a lot more work. Is it worthwhile? How robust is GLM to these different group sizes?
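For reference, the secondary calculation that MIXED would need is small. For a single fixed effect, the effect size can be recovered from the F statistic and its degrees of freedom as (F*df1)/(F*df1 + df2). The syntax below is a minimal sketch that types the F values in by hand. The variable names DVNAME, FVAL, DF1, DF2, and ETA2 are made up for illustration, and the SAT_AP_T0 and SAT_VT_T0 rows assume that the MIXED F values match the GLM ones, as the SAT_L_T0 row does.

```spss
* Sketch only: DATA LIST replaces the active dataset, so save your work first.
* Each row is one univariate test: DV name, F, numerator df, denominator df.
DATA LIST FREE / DVNAME (A9) FVAL DF1 DF2.
BEGIN DATA
SAT_L_T0 10.227 1 52
SAT_AP_T0 5.526 1 52
SAT_VT_T0 9.884 1 52
END DATA.

* Effect size from F and its degrees of freedom.
COMPUTE ETA2 = (FVAL * DF1) / (FVAL * DF1 + DF2).
FORMATS ETA2 (F5.3).
LIST.
```

The resulting values round to .164, .096, and .160, which agree with the R Squared values in the GLM footnotes.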