Here's another possible way of looking at it. I believe you want to compare whites to the other groups. One way of doing this is to compare whites to the grand mean, that is, the mean across all of the groups you are surveying. One common approach to answering this question is effect coding.
Effect coding compares the 'effect' of being in a category with the unweighted mean of the expected values for all categories. The unweighted mean (the simple average of the category means) is often close to the weighted mean, and equals it when the groups are the same size. Effect-coded variables are created by assigning the value -1 to the reference category instead of 0, coding the category you're interested in with 1, and leaving the other categories at 0.
With this coding, R-square does not change compared with ordinary 0/1 dummy coding; however, the intercept now takes on the value of the unweighted grand mean.
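To see why the intercept becomes the unweighted grand mean, here is a short sketch for a five-category race variable with blacks coded -1. The symbols are my own notation, not SPSS output labels: \mu_k stands for the mean of the DV in group k, and b_0, b_W, b_L, b_A, b_O for the intercept and the four coefficients.

```latex
\begin{align*}
\hat{y} &= b_0 + b_W\,e_{white} + b_L\,e_{latino} + b_A\,e_{asian} + b_O\,e_{other} \\
\mu_W &= b_0 + b_W, \quad \mu_L = b_0 + b_L, \quad \mu_A = b_0 + b_A, \quad \mu_O = b_0 + b_O \\
\mu_B &= b_0 - (b_W + b_L + b_A + b_O) \\
\mu_W + \mu_L + \mu_A + \mu_O + \mu_B &= 5\,b_0
  \quad\Longrightarrow\quad b_0 = \tfrac{1}{5}\sum_k \mu_k \\
b_W &= \mu_W - b_0
\end{align*}
```

So the coefficient for whites is the white mean minus the simple average of the five group means, and the implied effect for the reference group is minus the sum of the four coefficients.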
In your case you would probably do as follows:
Since you want to quantify the effect of being white with the beta, you need to include white as a predictor. Then choose any other category as your reference group (coded -1); whichever one you pick, the intercept takes on the unweighted grand mean across all categories. As a general rule, however, you want the reference group to be fairly large, so that it has enough cases to give a decent estimate of the DV within the group.
*Example if you chose blacks as the reference group (untested code).
*The do-if is a safety measure so that cases with missing race are not coded by accident.
do if ~missing(race).
*Start every case at 0 so that the other categories are not left system-missing.
compute ewhite = 0.
compute elatino = 0.
compute easian = 0.
compute eother = 0.
if race = 1 ewhite = 1.
if race = 2 ewhite = -1.
if race = 3 elatino = 1.
if race = 2 elatino = -1.
if race = 4 easian = 1.
if race = 2 easian = -1.
if race = 7 eother = 1.
if race = 2 eother = -1.
end if.
execute.
*Here, each effect-coded variable takes 3 values: 1, 0, -1. Enter all four of them (ewhite, elatino, easian, eother) in the model; the reference group (with this coding, blacks) has no variable of its own. The individual coefficients still measure the effect of being white, or of another race, on the DV, but the comparison is now with the unweighted grand mean, not with blacks.
*NOTE: This will also change some of the significance values for the betas. A group that was significantly different from the reference group may not be significantly different from the grand mean.
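A minimal sketch of the regression step, again untested. It assumes the four effect-coded variables above have been created and that your dependent variable is called dv, which is only a placeholder; substitute your own variable name.

```spss
* Enter all four effect-coded variables; blacks (coded -1) get no variable of their own.
regression
  /statistics coeff r anova
  /dependent dv
  /method=enter ewhite elatino easian eother.
* Group means and counts, for checking the constant by hand.
means tables=dv by race
  /cells mean count.
```

If the coding is right, the constant from the regression should match the simple average of the five group means in the MEANS table.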
Any feedback from the list would be much appreciated. I would definitely check this scheme against your needs; you may have other considerations that I have not thought of.
*Hope this helps!
From: Jessica Kenty [mailto:email@example.com]
Sent: Fri 8/9/2002 3:43 PM
Subject: Dummy Variables in Regression
I am running a regression & I have to use race for an IV. The breakdown for
race is as follows:
So if I create dummy variables for each of the above codes, resulting in 5
total race dummy variables - my question is how many of these MUST be
entered into the regression equation. I believe you must enter 4 out of the
My main interest is in demonstrating how much more of my DV whites possess.
Thus my reference group is whites - but I want to use the betas to talk
about whites - Not blacks, latinos, asian, nor the "other" category. Thus
if I put each of these dummies into the regression, excluding the "white"
dummy, I will have a difficult time talking about whites.
Any advice would be great.
Assets & Educational Inequality Project
Department of Sociology and Anthropology