Date: Fri, 9 Aug 2002 19:05:40 -0500
Sender: "SPSSX(r) Discussion"
From: "Parry, James"
Subject: Re: Dummy Variables in Regression
To: Jessica Kenty, "Fink, Steven"

Jessica,

Here's another possible way of looking at it. I believe you want to compare whites to the other groups. One way of doing this is to compare whites to the grand mean, that is, the overall mean across all the groups you are surveying. A common approach to this question is effect coding.

Effect coding compares the 'effect' of being in a category with the unweighted mean of the expected values for all categories (that is, the mean of the category means). The unweighted mean is often similar to the weighted mean, particularly when the groups are of similar size. Effect-coded variables are created by assigning the value -1 to the reference category instead of 0, coding the category you're interested in as 1, and leaving the other categories at 0.

With this coding, R-square is the same as with ordinary dummy coding; however, the intercept now takes on the value of the unweighted grand mean.
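To make the interpretation concrete, here is a sketch of what the coefficients estimate. It assumes the effect-coded variables are the only predictors in the model; with other covariates the same reading applies to adjusted means. The notation (k groups, observed group means, reference group r coded -1) is introduced here for illustration and is not SPSS output.

```latex
% k groups with observed group means \bar{y}_1, ..., \bar{y}_k.
% Group r is the reference group (coded -1 on every effect-coded variable e_j).
\begin{align*}
\hat{y} &= b_0 + \sum_{j \neq r} b_j e_j \\
b_0 &= \frac{1}{k} \sum_{j=1}^{k} \bar{y}_j
  && \text{(unweighted grand mean)} \\
b_j &= \bar{y}_j - b_0
  && \text{(effect of group } j,\ j \neq r\text{)} \\
\bar{y}_r - b_0 &= -\sum_{j \neq r} b_j
  && \text{(effect of the reference group)}
\end{align*}
```

So the reference group's own effect is not lost: it is the negative of the sum of the other coefficients.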

In your case you would probably do as follows:

Since you want to quantify the effect of being white with the beta, you need to include white as a predictor. You can then choose any other category as your reference group (coded -1), because with effect coding the intercept takes on the unweighted grand mean (across all categories) rather than the reference group's mean. As a general rule, however, you want your reference group to be fairly large, so that it has enough cases to give a decent estimate of the dv within the group.

* Example if you chose blacks as the reference group (untested code).

* The do-if is a safety measure so that cases with missing race are not coded by accident, so do if not missing race.

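* Start every effect-coded variable at 0 for cases with a valid race code.
* No variable is needed for blacks, the reference group.
do if ~missing(race).
compute ewhite = 0.
compute elatino = 0.
compute easian = 0.
compute eother = 0.
end if.

* Code each group of interest 1 on its own variable.
if (race = 1) ewhite = 1.
if (race = 3) elatino = 1.
if (race = 4) easian = 1.
if (race = 7) eother = 1.

* Code the reference group (blacks, race = 2) -1 on every variable.
if (race = 2) ewhite = -1.
if (race = 2) elatino = -1.
if (race = 2) easian = -1.
if (race = 2) eother = -1.
execute.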
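The same coding can also be written more compactly with RECODE. This is an alternative to the commands above, not something to run in addition, and like them it is an untested sketch. It assumes the variables do not exist yet, in which case any race value not listed (including missing) is left system-missing on the new variables.

```spss
* Alternative effect coding with blacks (race = 2) as the reference group.
recode race (1=1) (2=-1) (3,4,7=0) into ewhite.
recode race (3=1) (2=-1) (1,4,7=0) into elatino.
recode race (4=1) (2=-1) (1,3,7=0) into easian.
recode race (7=1) (2=-1) (1,3,4=0) into eother.
execute.
```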

*Here, each effect-coded variable takes three values: 1, 0, and -1. Include each of these variables in the model except the reference group's (in this case, with this coding, that group is blacks). The individual coefficients still measure the effect of being white, or of another race, on the dv, but the comparison point is now the unweighted grand mean, not blacks.
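A sketch of how the check and the model might look in syntax. The name dv is a placeholder for your dependent variable, and the four predictor names are the effect-coded variables created above; adjust both to your data.

```spss
* Check the coding first. Each race row should show the intended 1, 0, -1 pattern.
crosstabs /tables=race by ewhite elatino easian eother.

* Effect-coded regression. Here dv is a placeholder for your dependent variable.
regression
  /statistics coeff r anova
  /dependent dv
  /method=enter ewhite elatino easian eother.
```

With race as the only predictor, the constant in the coefficients table is the unweighted mean of the five group means, and the coefficient for ewhite is how far the white mean sits above or below it. The corresponding figure for blacks does not appear in the output; it is the negative of the sum of the four coefficients.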

*NOTE: This will also change some of the significance values for your betas. A group that was significantly different from the reference group under dummy coding may not be significantly different from the grand mean.

Any feedback from the list would be much appreciated. I would definitely check this scheme out and compare it to your needs. You may have other considerations that I have not thought of.

Hope this helps!

-----Original Message-----
From: Jessica Kenty [mailto:jkenty@lynx.dac.neu.edu]
Sent: Fri 8/9/2002 3:43 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Cc:
Subject: Dummy Variables in Regression

Hi All,

Quick question.

I am running a regression & I have to use race for an IV. The breakdown for race is as follows:

1 white
2 black
3 latino
4 asian
7 other

So if I create dummy variables for each of the above codes, resulting in 5 total race dummy variables - my question is how many of these MUST be entered into the regression equation. I believe you must enter 4 out of the 5, correct?

My main interest is in demonstrating how much more of my DV whites possess. Thus my reference group is whites - but I want to use the betas to talk about whites - Not blacks, latinos, asian, nor the "other" category. Thus if I put each of these dummies into the regression, excluding the "white" dummy, I will have a difficult time talking about whites.

Any advice would be great.

Jessica

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jessica Kenty
Research Assistant
Assets & Educational Inequality Project
Department of Sociology and Anthropology
Northeastern University
Boston, MA
