Date: Mon, 16 Jun 2003 13:49:33 -0300
Reply-To: Alexandre Cechin <firstname.lastname@example.org>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Alexandre Cechin <email@example.com>
Subject: Re: a question about dummy variable
Content-Type: text/plain; charset=iso-8859-1; format=flowed
So it's possible to use all dummy categories if the constant is omitted?
I understand the reasons for omitting one of the dummy categories, and the
interpretation of the constant.
But it really becomes hard, when you have many independent variables in the
model, to isolate and interpret the effect of one of the omitted dummies into
>From: David Hitchin <D.H.Hitchin@sussex.ac.uk>
>Reply-To: David Hitchin <D.H.Hitchin@sussex.ac.uk>
>Subject: Re: a question about dummy variable
>Date: Mon, 16 Jun 2003 10:55:26 +0100
>--On 16 June 2003 12:35 +0300 Safa Gurcan
>>I have a question about the interpretation of dummy variable coefficients.
>>I have a categorical variable with 6 categories, and I created 5 dummy
>>variables. In the regression, the model is y =
>>b0+b1X1+b2D1+b3D2+b4D3+b5D4+b6D5 (X1 is a measured variable).
>>I have trouble interpreting the sixth category. Where is the coefficient
>>of the 6th category? Could you give me some clues about how to interpret
>>the sixth category?
>The simple answer is that for the 6th category, the value is obtained from
>the constant term.
>If you look at the equation above, you will see that if D1,D2,..,D5 are all
>zero, then the case must come from category 6. The coefficient of D1, for
>example, then states that if the case comes from category 1, that
>coefficient is the amount by which the expected value of the case differs
>from that of a category 6 case with the same X1.
>When you are coding dummy variables for a model with a constant term, you
>have to omit one dummy from the set, since it gives no more information. If
>a case is not in categories 1-5 it MUST come from category 6.
>Technically you can consider the constant term in the equation as always
>having a value of 1. If a set of other variables also add up to exactly 1
>for every case, then you have perfect multicollinearity and the regression
>has an infinite set of solutions - which is just as much use as no
>solutions at all.
>If you explicitly want to see a coefficient for every category, then you
>include all six dummies, D1 to D6, in the regression, but you OMIT the
>constant term (and then be careful with the interpretation, because the
>statistics are slightly different for a regression without a constant term).
>The simplest dummy variable that you are likely to set up would relate to
>gender, say 0 for male and 1 for female. It is customary to make gender
>just 1 variable, but you could of course have a variable for male (yes/no)
>and one for female (yes/no). We routinely code gender in a single column,
>and interpret its coefficient as the difference in expected value between
>the gender coded 1 and the gender coded 0.
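
A minimal sketch of the two codings discussed in the quoted reply may help
to compare them. It is written in Python with exact fractions rather than in
SPSS syntax. The data, the reduction to three categories instead of six, and
the helper names ols and dummy are all assumptions made for illustration;
nothing here comes from the original posters.

```python
from fractions import Fraction


def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y exactly by Gauss-Jordan
    elimination. Raises ValueError when X'X is singular, i.e. when the
    columns are perfectly collinear and no unique solution exists."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(Fraction(r[i]) * r[j] for r in rows) for j in range(k)]
         + [sum(Fraction(r[i]) * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("perfect multicollinearity: no unique solution")
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for r in range(k):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]


def dummy(c, k):
    """1 if a case in category c belongs to category k, else 0."""
    return 1 if c == k else 0


# Hypothetical data: three categories and one measured variable x1.
cat = [1, 1, 2, 2, 3, 3]
x1 = [1, 2, 1, 3, 2, 4]
y = [3, 5, 6, 9, 4, 7]

# (a) Constant plus dummies for categories 1 and 2; category 3 is omitted
#     and becomes the reference category.
ref_rows = [[1, x, dummy(c, 1), dummy(c, 2)] for c, x in zip(cat, x1)]
b0, b1, b2, b3 = ols(ref_rows, y)

# (b) No constant, one dummy for every category.
all_rows = [[x, dummy(c, 1), dummy(c, 2), dummy(c, 3)]
            for c, x in zip(cat, x1)]
slope, a1, a2, a3 = ols(all_rows, y)

# (c) Constant AND every dummy: the dummies sum to the constant column,
#     so the fit has no unique solution.
both_rows = [[1, x, dummy(c, 1), dummy(c, 2), dummy(c, 3)]
             for c, x in zip(cat, x1)]
try:
    ols(both_rows, y)
    collinear = False
except ValueError:
    collinear = True

print("reference coding:", b0, b1, b2, b3)
print("all dummies, no constant:", slope, a1, a2, a3)
print("constant plus all dummies is collinear:", collinear)
```

In coding (b) each dummy coefficient is that category's own intercept. It
equals the constant plus the matching dummy coefficient from coding (a), and
the omitted category's intercept is the constant itself, so the two models
give the same fitted values and the same slope for x1. What changes is the
reading of the tests: in (a) each dummy is tested against the reference
category, in (b) each intercept is tested against zero, and many packages
compute R-squared about zero rather than about the mean when the constant is
omitted.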