Date: Tue, 17 Oct 2006 13:55:47 -0700
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: score chi-square in forward variable selection in SAS
Content-Type: text/plain; format=flowed
>I am coding logistic regression with variable selection. I used the
>SAS manual as a guide, but one problem I have is that I cannot
>compute the right score chi-square to choose the next variable to enter
>the model in forward selection.
>I used for the score the formula U'(γ0) I^-1(γ0) U(γ0), where U is
>the vector of first partial derivatives of the log likelihood with
>respect to the parameter vector γ, and I is the information matrix.
>If I already have variables v1, v2, …, vt in the model, I build a model
>with variables v1, v2, …, vt, vt+1, which gives me γ. To measure the
>significance of vt+1, I set γt+1 = 0, compute the score, and
>compare it to a chi-square distribution with one degree of freedom. Does
>anybody see something wrong with this?
>My scores are very large compared to what I obtain with SAS, and the
>order of entry of the variables is not the same.
>Thanks in advance for your help.
>PS: I am working on binary problems
I have complained about this lots of times, but bear with me.
Forward selection / backward selection / stepwise selection methods
are poor choices for model-building. Please don't use them at all.
I know that does not answer your question, but it's the most
important point I have to make. Please use better model
building strategies instead, and you'll be happier with your
results. Start with regressors that are based on the theory
in your field, and the expertise of the subject-matter experts.
They know more than some dumb computer that may or may
not have good data points to work with.
As for your specific problem, it would really help if you would
provide some context. Saying that you did not get the same
answer as SAS did does not help us. I cannot tell if you used
the wrong formula, or if you substituted an incorrect value, or
if you did the computations wrong, or what. If you can show
us a ten- or fifteen-record data set (with code to read it in)
that illustrates your problem, and the formula you used, and
how your result differs from the SAS output, then someone may be
able to spot what went wrong.
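For reference, here is a minimal sketch of the score statistic
U'(γ0) I^-1(γ0) U(γ0) from the question, written in Python rather
than SAS so that it runs standalone. It assumes the usual convention
for a score test: γ0 is the estimate from the model that contains
only v1, ..., vt (fit without the candidate), with a zero appended
for the candidate's coefficient, and U and I are then evaluated for
the larger model at that point. If the larger model is fit first and
its last coefficient is then set to zero, the remaining coefficients
are no longer the restricted estimates, and the statistic will
generally come out different. The helper names (solve,
score_and_info, fit_logit, score_chisq) and the ten-record data set
are made up for illustration, and nothing here describes how
PROC LOGISTIC is actually implemented.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def score_and_info(X, y, beta):
    """Score vector U and information matrix I of the binary-logit
    log likelihood, evaluated at beta."""
    p = len(beta)
    U = [0.0] * p
    I = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        mu = 1.0 / (1.0 + math.exp(-eta))
        w = mu * (1.0 - mu)
        for j in range(p):
            U[j] += (yi - mu) * xi[j]
            for k in range(p):
                I[j][k] += w * xi[j] * xi[k]
    return U, I

def fit_logit(X, y, iters=25):
    """Newton-Raphson estimate; rows of X must include the intercept."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        U, I = score_and_info(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(I, U))]
    return beta

def score_chisq(X_current, x_new, y):
    """Score chi-square (1 df) for adding x_new to the current model."""
    beta0 = fit_logit(X_current, y)      # fit WITHOUT the candidate
    gamma0 = beta0 + [0.0]               # candidate coefficient = 0
    X_full = [row + [v] for row, v in zip(X_current, x_new)]
    U, I = score_and_info(X_full, y, gamma0)
    return sum(u * s for u, s in zip(U, solve(I, U)))

# Hypothetical ten-record data set: intercept-only current model,
# one candidate variable x.
y = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
X_current = [[1.0] for _ in y]
score = score_chisq(X_current, x, y)
print(score)
```

In forward selection the same call would be repeated for each
candidate not yet in the model, and the candidate with the largest
score chi-square compared against the entry criterion.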
David L. Cassell
3115 NW Norwood Pl.
Corvallis OR 97330