**Date:** Sat, 17 Sep 2005 17:04:56 +0200
**Reply-To:** Karl Koch <TheRanger@gmx.net>
**Sender:** "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
**From:** Karl Koch <TheRanger@gmx.net>
**Subject:** Linear Regression (2nd try)
Hello list,

I have designed, performed, and analysed a 3x3x2 full factorial
experiment. For the analysis I used a repeated-measures ANOVA.
All main effects and all interactions were statistically significant.

Now, for a second purpose, I need a different form of analysis. I need a
function that can predict future values of the dependent variable based on
the chosen factor levels. The dataset from the experiment mentioned above is
rich (over 2300 data points) and should have decent power. I basically
want a function that allows me to estimate Y based on my three factors (X1,
X2, and X3). My factor levels are either continuous or discrete (the discrete
ones will be converted to continuous).

I think regression analysis is the right tool for my purpose. However, I
have not worked with this analytical tool before. I have done some reading
already but not much of it in SPSS.

Therefore, I have some questions:

1) Based on the experimental design, is multiple linear regression the
right statistical method to use for this purpose (assuming that the
relations are linear)? Or are there other approaches in SPSS that could do
equally well in order to obtain a function that can best explain and predict
future values?

2) According to my ANOVA results, confirmed by my contrasts, all of my
two-way interactions and also my three-way interaction are significant.
The bottom line is that the factors in this dataset interact a lot. However,
the interaction effects do not appear to be very strong. Is this a problem for
regression analysis? If yes, how can it be tackled? Are there any techniques
that could be used to account for it?

3) Since I have three IVs (based on my DOE), could overfitting occur? How
could I compensate for that without removing IVs? Can overfitting occur by
putting in too many results?

4) Does anybody know a good tutorial for doing this kind of multiple linear
regression (with three IVs and one DV) in SPSS? Are there good books,
articles, or other online resources that specifically cover this?

Kind Regards,
Karl

