Date: Thu, 4 Nov 2004 13:37:48 -0500
Reply-To: "Thompson, Carol" <CThompson@anteon.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Thompson, Carol" <CThompson@anteon.com>
Subject: Re: Question RE: Logistic Regression Results
Content-Type: text/plain; charset="us-ascii"
Keith,
There are two situations that can affect the logistic regression as you
described: complete and quasi-complete separation. The first one will
likely kick out the predictor because the predictor perfectly predicts
the dependent variable, e.g., if x < 3.5, y = 1 and if x >= 3.5, y = 0.
Quasi-complete separation
occurs when there is complete separation except for a single value of
the predictor for which both values of the dependent variable occur. In
either case, a finite MLE does not exist for the affected coefficient.
Including the predictor with quasi-complete separation makes the validity
of the model fit questionable, much as a very large coefficient with a
large standard error would in an ordinary regression analysis. Some authors
suggest combining predictor categories, if possible. If that is not
possible, I would suggest excluding the predictor and annotating your
table of coefficients to indicate the situation that exists. This is
what I did last year where we had several such situations. In your
description of the results, you can note that the joint distribution
of the data does not allow the predictor to be formally included in the
model; however, you can say whether the direction of the association
implied by that distribution follows or counters that in the other models.
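To illustrate why the estimate runs away under quasi-complete separation, here is a minimal Python sketch (not SPSS output). It uses the co-residence counts from the original message (122 of 138 co-resident cases with Y4 = 0, so 16 with Y4 = 1). The number of non-co-resident cases is not reported, so the 100 used here is purely an assumption, as are the function and variable names. The sketch holds the fitted log-odds for the co-resident group at its observed value and pushes the intercept toward minus infinity.

```python
import math

def log_likelihood(b0, b1, n0_zero, n1_zero, n1_one):
    """Log-likelihood of logit P(y=1) = b0 + b1*x for a 0/1 predictor x,
    when every x = 0 case has y = 0 (the quasi-complete pattern)."""
    p0 = 1.0 / (1.0 + math.exp(-b0))          # P(y=1 | x=0)
    p1 = 1.0 / (1.0 + math.exp(-(b0 + b1)))   # P(y=1 | x=1)
    return (n0_zero * math.log1p(-p0)         # x=0 cases, all y=0
            + n1_zero * math.log1p(-p1)       # x=1 cases with y=0
            + n1_one * math.log(p1))          # x=1 cases with y=1

N0_ZERO = 100              # ASSUMED: count of x=0 cases is not given in the post
N1_ZERO, N1_ONE = 122, 16  # from the post: 122 of 138 co-resident cases are 0

# Observed log-odds of y=1 in the x=1 group; b0 + b1 is held at this value.
eta1 = math.log(N1_ONE / N1_ZERO)

results = []
for b0 in (-2.0, -5.0, -10.0, -15.0, -20.0):
    b1 = eta1 - b0
    ll = log_likelihood(b0, b1, N0_ZERO, N1_ZERO, N1_ONE)
    results.append((b0, b1, math.exp(b1), ll))
    print(f"b0={b0:6.1f}  b1={b1:7.3f}  "
          f"odds ratio={math.exp(b1):14.1f}  logL={ll:.6f}")
```

Under these assumptions, each more negative intercept gives a slightly higher log-likelihood and a larger odds ratio, so there is no finite maximum. The fitting routine simply stops at its iteration limit, and the odds ratio it prints is wherever it happened to stop.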
I'm not an expert in logistic regression, but this is what I learned and
used for one of my projects. Hope this helps some.
Carol
Carol B. Thompson
Sr. Programmer/Analyst
Anteon Corporation
4220 S. Maryland Parkway, Suite 408B
Las Vegas, NV 89119
Ph: (702) 731-5550 x 207
Fax: (702) 731-4027
-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Keith Dooley
Sent: Thursday, November 04, 2004 7:57 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Question RE: Logistic Regression Results
Hello all,
I conducted a series of logistic regression analyses predicting
variables Y1 thru Y4 (various forms of elder abuse, where 0=not present
and 1=present) using the same set of variables, some continuous and
others discrete, as predictors. My problem is this: in the output for
variable Y4, one of the dichotomous predictor variables (co-residence,
where 0=no and 1=yes) produces an untenable odds ratio (45475059) and a
confidence interval for this value is not computed. Also, under the
Model Summary, the output contains a statement that "..maximum
iterations have been reached. Final solution cannot be found." In all
other logistic regression results (Y1 thru Y3), this problem does not
occur.
I investigated the joint distribution of co-residence and variable Y4,
and it happens that for value 0 of co-residence there is no variability
in Y4 (all values are 0), and for value 1 of co-residence there is very
little variability in Y4 (122 of 138 cases are 0).
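A joint distribution like this can be checked before fitting by cross-tabulating the predictor against the outcome and looking for empty cells. The following is a minimal Python sketch rather than SPSS syntax. The helper names are hypothetical, and the 100 cases with co-residence = 0 are an assumed figure, since only the co-residence = 1 counts are given above.

```python
from collections import Counter

def crosstab(x, y):
    """Count cases in each (predictor, outcome) cell."""
    return Counter(zip(x, y))

def empty_cells(table, x_levels=(0, 1), y_levels=(0, 1)):
    """Return the (predictor, outcome) cells that contain no cases."""
    return [(xv, yv) for xv in x_levels for yv in y_levels
            if table[(xv, yv)] == 0]

# ASSUMED: 100 cases with co-residence = 0 (count not reported), all Y4 = 0.
# From the message: 138 cases with co-residence = 1, of which 122 have Y4 = 0.
coresidence = [0] * 100 + [1] * 138
y4 = [0] * 100 + [0] * 122 + [1] * 16

table = crosstab(coresidence, y4)
empty = empty_cells(table)
print("cell counts:", dict(table))
print("empty cells (co-residence, Y4):", empty)
```

An empty cell in a 2 x 2 table of this kind means the sample odds ratio involves a division by zero, which is consistent with the very large estimate and the missing confidence interval.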
I'm wondering if the joint distribution of these variables is causing
the model not to reach a solution. The more important question is: How
should I explain (in the process of reporting these results in a
manuscript) the "wacky" results for Y4? Should I exclude the problematic
variable from the one regression model, and how do I justify using that
variable in all of the other models but not this final one?
Any advice would be appreciated.
Keith Dooley
UGA Psychology