LISTSERV at the University of Georgia
Date:         Thu, 24 Jul 2008 10:00:22 +0200
Reply-To:     Marta García-Granero <>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Marta García-Granero <>
Subject:      Re: Multiple Regression with Continuous and Categorical Variables
In-Reply-To:  <>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Briana H. Witteveen wrote:
> I know that to use categorical independent variables in multiple
> regression you must create dummy variables. How do you include the dummy
> variables in a multiple regression model that also includes several
> continuous independent variables? Is it possible to use dummy variables
> and continuous in a stepwise regression?

Briana:

Scott Millis has already given you a decalogue of reasons for avoiding stepwise regression. I could add one more to his collection: stepwise regression doesn't properly handle dummy-coded categorical variables. The final model might lack one or more of the dummy variables, rendering the effect of the categorical variable uninterpretable. Stepwise regression has been ironically called "unwise" regression (Leamer, 1985). Avoid it. Period.
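On the first part of the question (mixing dummy variables with continuous predictors), here is a minimal sketch in plain Python. It assumes a made-up 3-level variable "group" with reference level "a" and one continuous predictor "age"; the data, the variable names and the helper functions are all hypothetical. The point is that the k-1 dummies simply enter the model as extra columns next to the continuous predictors, and they go in (or come out) together as a set.

```python
# Sketch: enter a 3-level categorical predictor as two dummy variables
# alongside a continuous predictor, then fit ordinary least squares.
# All names and data are invented for illustration.

def dummy_code(values, reference):
    """Return (levels, rows): one 0/1 indicator per non-reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1.0 if v == level else 0.0 for level in levels] for v in values]
    return levels, rows

def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

age = [20.0, 30.0, 40.0, 25.0, 35.0, 45.0, 22.0, 33.0, 44.0]
group = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
# Noise-free outcome so the fit is easy to check:
# y = 1 + 2*age + 3*(group == "b") + 5*(group == "c")
y = [1 + 2 * x + {"a": 0, "b": 3, "c": 5}[g] for x, g in zip(age, group)]

levels, dummies = dummy_code(group, reference="a")
# Design matrix columns: intercept, age, group_b, group_c
X = [[1.0, x] + d for x, d in zip(age, dummies)]
names = ["intercept", "age"] + ["group_" + lv for lv in levels]
coefs = dict(zip(names, ols(X, y)))
print(coefs)
```

Each dummy slope is the adjusted difference between that level and the reference level, which is why dropping only one of the dummies changes what the remaining ones mean.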

Take a look at chapter 4 of "Applied Logistic Regression" by Hosmer & Lemeshow (1989). They give excellent guidelines for model development. Basically:

1) Univariate analysis

2) Select the variables that should be included in the next step:
   - Those that showed interesting results in univariate analysis (this doesn't necessarily mean "significant")
   - Those that your experience tells you might play an important role (as confounders and/or effect modifiers). In epidemiology and medical research, gender and age are typical examples.

3) Build a model with all the variables you selected in the previous step. Examine their adjusted effects and carefully remove those that look unimportant. Check how removing each variable affects the slopes of the rest. Large changes (above 10% is a good reference) show that the variable you removed plays a role in the model and should stay in it. If you suspect a variable is involved in interactions (see next step), it should never be removed (hierarchical rule). The resulting model is called the "main effects model".
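The 10% check in step 3 can be sketched as a small comparison of slopes before and after a removal. This is only an illustration: the coefficient values are invented, the function name is hypothetical, and taking the full-model slope as the baseline for the percentage is an assumption.

```python
def flag_changed_slopes(full, reduced, threshold=0.10):
    """Return {name: relative change} for slopes that shift by more than
    `threshold` when a variable is dropped from the full model.

    `full` and `reduced` map variable names to fitted slopes.
    Assumes no full-model slope is exactly zero.
    """
    flagged = {}
    for name, b_reduced in reduced.items():
        b_full = full[name]
        change = abs(b_reduced - b_full) / abs(b_full)
        if change > threshold:
            flagged[name] = change
    return flagged

# Hypothetical slopes from a model with and without "smoker"
full = {"age": 0.50, "dose": 1.20, "smoker": 0.80}
reduced = {"age": 0.58, "dose": 1.22}

flagged = flag_changed_slopes(full, reduced)
print(flagged)  # "age" moves by about 16%, "dose" by under 2%
```

Here the slope for "age" shifts by more than 10% once "smoker" is removed, so "smoker" would go back into the model even if its own effect looked unimportant.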

4) Examine the existence of interactions between variables. Keep only interaction terms that meet these conditions:
   - Statistically significant
   - Meaningful: if you can't explain the presence of the interaction from a solid theoretical point of view, discard it
   - Hierarchical rule: if an interaction term is present in a model, both main effects should be too. Stepwise regression tends to violate this rule, BTW
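The hierarchical rule is easy to check mechanically. The sketch below assumes model terms are written as strings with ":" marking an interaction (a notational choice made here, not something from the book), and the function name is hypothetical. It only checks that main effects are present; it does not check lower-order interactions under a three-way term.

```python
def missing_main_effects(terms):
    """Map each interaction term to the main effects absent from the model."""
    present = set(terms)
    problems = {}
    for term in terms:
        parts = term.split(":")
        if len(parts) > 1:  # this term is an interaction
            missing = [p for p in parts if p not in present]
            if missing:
                problems[term] = missing
    return problems

ok_model = ["age", "sex", "age:sex"]
bad_model = ["age", "age:sex"]  # what a stepwise procedure might leave behind

print(missing_main_effects(ok_model))   # {}
print(missing_main_effects(bad_model))  # {'age:sex': ['sex']}
```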

Your final model should then be validated (using an independent dataset).
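One simple way to carry out that validation is to apply the fitted coefficients, unchanged, to the independent dataset and look at the prediction error. The sketch below uses root mean squared error; the coefficients, the hold-out rows and the choice of RMSE as the yardstick are all assumptions made for illustration.

```python
import math

def predict(coefs, row):
    """Linear prediction: intercept plus slope * value for each predictor."""
    return coefs["intercept"] + sum(coefs[name] * value for name, value in row.items())

def rmse(coefs, rows, y):
    """Root mean squared prediction error on a dataset not used for fitting."""
    errors = [yi - predict(coefs, row) for row, yi in zip(rows, y)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical coefficients from the final model
coefs = {"intercept": 1.0, "age": 2.0}

# Hypothetical independent (hold-out) data
holdout_rows = [{"age": 1.0}, {"age": 2.0}]
holdout_y = [3.0, 6.0]

holdout_error = rmse(coefs, holdout_rows, holdout_y)
print(holdout_error)
```

A hold-out error much larger than the error on the data used for fitting suggests the model is overfitted.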

Quoting Campbell (Statistics at Square Two, 2001): 'Do not forget that models are simply an approximation to reality. "All models are wrong, but some are useful" '

HTH, Marta García-Granero

