LISTSERV at the University of Georgia
Date:         Wed, 26 Jul 2006 14:15:22 -0300
Reply-To:     Hector Maletta <>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Hector Maletta <>
Subject:      Re: Addition of covariates in forward regression analyses
Comments: To: "Peters Gj (PSYCHOLOGY)" <>
In-Reply-To:  <>
Content-Type: text/plain; charset="us-ascii"

The criteria for inclusion (as explained in the SPSS documentation) are two: significance of the new variable (its contribution to explaining variance of the dependent variable) and tolerance (related to collinearity). The significance criterion is F (additional variance explained relative to total residual variance from the previous step). It can be fixed as an absolute F value (default is 3.84) or as a probability (default is 0.05). The tolerance indicator is the proportion of variance in the new variable that is NOT explained by the other variables in the equation. When the tolerance indicator is below a certain threshold (default 0.0001), the new variable is not accepted because it is close to an exact function of the other variables.

Hector

-----Original message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On behalf of Peters Gj (PSYCHOLOGY)
Sent: Wednesday, July 26, 2006 1:24 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Addition of covariates in forward regression analyses

Dear list,

[if this question is inappropriate (as it discusses a topic not limited to SPSS) please tell me; I could not find rules prohibiting this online]

In forward selection multiple linear regression, which of these factors influence whether a covariate is added to the model?

- the size of the regression weight the covariate would get
- the standard error of that regression weight
- the complete sample size

I suspect that both the size & standard error of the regression weight are of influence, and that the sample size influences the standard error of the regression weight.
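
[Editorial sketch, not part of the original message.] For a single candidate covariate, the F-to-enter statistic equals the square of the t-value of its regression weight, so both the weight and its standard error matter. The minimal Python sketch below illustrates this under stated assumptions: the function names are hypothetical, the numbers are made up, the threshold 3.84 is the default absolute F value mentioned in the reply, and the 1/sqrt(n) rescaling of the standard error is only a rough approximation.

```python
import math


def f_to_enter(b: float, se: float) -> float:
    """F statistic for adding one covariate: the squared t-value (b / se)."""
    t = b / se
    return t * t


def would_enter(b: float, se: float, f_in: float = 3.84) -> bool:
    """True if the candidate passes an absolute F-to-enter threshold."""
    return f_to_enter(b, se) >= f_in


def se_scaled(se_ref: float, n_ref: int, n_new: int) -> float:
    """Rough rescaling of a standard error from n_ref to n_new cases.

    Assumes the SE shrinks in proportion to 1 / sqrt(n) and that nothing
    else changes; this ignores degrees-of-freedom corrections.
    """
    return se_ref * math.sqrt(n_ref / n_new)


if __name__ == "__main__":
    b = 0.2          # illustrative regression weight (made-up value)
    se_200 = 0.12    # illustrative standard error at n = 200 (made-up value)
    se_500 = se_scaled(se_200, 200, 500)
    print("n=200: F =", round(f_to_enter(b, se_200), 2), would_enter(b, se_200))
    print("n=500: F =", round(f_to_enter(b, se_500), 2), would_enter(b, se_500))
```

With these made-up numbers the same weight fails the threshold at n=200 and passes it at n=500, purely because the standard error is smaller.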

If you don't want to know why I'm asking this, you can stop reading now :-) In any case thanks in advance :-)

Why I want to know this:

I am conducting several very exploratory regression analyses, regressing the same criterion on the same covariates in a number of different subsamples (persons with a different value on a certain variable; in this case, for example, ecstasy use status (non-users, users & ex-users)). I use the forward method to probe which covariates yield a significant addition to the model. The covariates are placed in six blocks (on the basis of theoretical proximity to the criterion; the idea is that more distal covariates only enter the model if they explain a significant portion of the criterion variance over and above the more proximal covariates already in the model). P to enter is .05. (peripheral question: am I correct in assuming that this is the p-value associated with the t-value of the beta of the relevant covariate?)

The subsamples are unequal in size (e.g., ranging from 200 to 500). I get the strong impression that the number of covariates in the final model depends on the sample size. This would imply that covariates with less 'impact' would be added to the model when the model is developed with a larger sample (e.g., with equal standard errors of the parameter weight, when a covariate increases 1 standard deviation, an increase of the criterion of 0.2 * Y's standard deviation could suffice (lead to inclusion) with n=500, but not with n=200).
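
[Editorial sketch, not part of the original message.] The dependence on sample size can be made concrete. For one candidate covariate, F = pr2 * df / (1 - pr2), where pr2 is the squared partial correlation between the candidate and the criterion and df = n - k - 1 is the residual degrees of freedom with k predictors in the model after entry. Solving for pr2 gives the smallest effect that can still enter. The Python sketch below assumes a fixed absolute F threshold of 3.84 as a stand-in for p = .05 (the exact critical F is slightly higher at these df); the function name and the choice k = 5 are hypothetical.

```python
def min_partial_r2(n: int, k: int, f_in: float = 3.84) -> float:
    """Smallest squared partial correlation that reaches F-to-enter.

    n    : number of cases in the subsample
    k    : number of predictors in the model once the candidate is entered
    f_in : absolute F-to-enter threshold (assumed here, not a p-value)
    """
    df = n - k - 1
    if df <= 0:
        raise ValueError("need more cases than predictors plus one")
    return f_in / (f_in + df)


if __name__ == "__main__":
    for n in (200, 500):
        print(n, round(min_partial_r2(n, k=5), 4))
```

Under these assumptions the minimum squared partial correlation is roughly 0.019 at n=200 and roughly 0.008 at n=500, so the larger subsample admits covariates with a noticeably smaller unique contribution.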

Is this correct? And if so, is there a way to 'correct' the p-to-enter for sample size, so that all final models comprise covariates with roughly equal relevance? (except for selecting sub-subsamples from all subsamples of the size of the smallest subsample)
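
[Editorial sketch, not part of the original message.] One possible way to approach such a correction, offered only as an assumption-laden illustration, is to fix a target minimum squared partial correlation (for example, the one implied by the smallest subsample) and convert it into an absolute F-to-enter value for each subsample via F = pr2 * df / (1 - pr2). The function names and k = 5 below are hypothetical, and because df changes as predictors enter, the value would strictly differ from step to step.

```python
def min_partial_r2(n: int, k: int, f_in: float = 3.84) -> float:
    """Smallest squared partial correlation that reaches F-to-enter."""
    df = n - k - 1
    return f_in / (f_in + df)


def equivalent_f_in(n: int, k: int, target_partial_r2: float) -> float:
    """Absolute F-to-enter that corresponds to a target squared partial r.

    Inverts F = pr2 * df / (1 - pr2) with df = n - k - 1.
    """
    if not 0.0 < target_partial_r2 < 1.0:
        raise ValueError("target_partial_r2 must lie strictly between 0 and 1")
    df = n - k - 1
    return target_partial_r2 * df / (1.0 - target_partial_r2)


if __name__ == "__main__":
    k = 5                                   # hypothetical model size
    target = min_partial_r2(200, k)         # threshold implied by smallest n
    for n in (200, 350, 500):
        print(n, round(equivalent_f_in(n, k, target), 2))
```

Under these assumptions the smallest subsample keeps F = 3.84, while the n=500 subsample would need an F-to-enter of about 9.78 to demand the same minimum unique contribution. Whether equalizing thresholds this way is appropriate for the comparison is a separate judgment call.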

My goal in the end is to cursorily compare the models in the different subsamples (no, sorry, I'm not going to use SEM; given the number of potential predictors, the sample sizes are too small). This is not very 'fair' if the model in one subsample has lower thresholds for 'inclusion' than the model in another.

If what I'm trying is completely insane/stupid/otherwise inadvisable, I'm of course eager to learn :-)

Thanks a lot in advance (if nothing else, for reading this far :-)),

Gjalt-Jorn
_____________________________________
Gjalt-Jorn Ygram Peters

Phd. Student
Department of Experimental Psychology
Faculty of Psychology
University of Maastricht
