LISTSERV at the University of Georgia
Date:         Wed, 26 Jul 2006 08:54:37 -0300
Reply-To:     Hector Maletta <>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Hector Maletta <>
Subject:      Re: Logistic regression, few "respondents", and weighting in SPSS
Comments: To: Marc <>
In-Reply-To:  <>
Content-Type: text/plain; charset="us-ascii"

You can of course inflate the weight of your event cases, but it is not a good idea at all. When the probability of an event is low (1700 out of 60000, about 2.8%), that's tough luck, but you cannot change it without distorting your data.

About lack of fit: lack of fit itself (Nagelkerke R-square too low, etc.) is one thing; the classification table failing to predict most of the events is another. The latter happens because by default SPSS predicts an event only when its probability according to the logistic regression is over 0.50, which seldom happens when the event is rare. It will probably predict "no event" (0) in all or nearly all cases, missing the cases where the event actually happened.

On the other hand, 1700 events (among 60000 cases, to be precise) are numerous enough for the results to be statistically significant. That is, whatever you find is unlikely to be a sample fluke and (at the 95% confidence level) can be taken to reflect what happens at population level.

About weighting see my paper in the tutorials section of (go to macros or syntax and then to tutorials).

Hector
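To make the cutoff point concrete, here is a minimal Python sketch (not SPSS syntax). It is only an illustration under stated assumptions: the intercept of -3.9 and slope of 1.0 are made-up values chosen to give an event rate of roughly 3%, the single predictor is simulated, and the true model probabilities stand in for the fitted ones. The function names are hypothetical.

```python
import math
import random


def simulate_rare_events(n=60000, intercept=-3.9, slope=1.0, seed=0):
    """Return (probability, outcome) pairs from an ASSUMED logistic model.

    The intercept and slope are illustrative values, not estimates
    from the data discussed in this thread.
    """
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # one simulated standard-normal predictor
        p = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
        y = 1 if rng.random() < p else 0  # draw the 0/1 outcome
        cases.append((p, y))
    return cases


def classify(cases, cutoff):
    """Return (number of predicted events, sensitivity) at a given cutoff."""
    predicted = sum(1 for p, _ in cases if p > cutoff)
    events = sum(y for _, y in cases)
    hits = sum(1 for p, y in cases if y == 1 and p > cutoff)
    sensitivity = hits / events if events else 0.0
    return predicted, sensitivity


cases = simulate_rare_events()
base_rate = sum(y for _, y in cases) / len(cases)

# Compare the default 0.50 cutoff with a cutoff equal to the observed event rate.
for cutoff in (0.5, base_rate):
    predicted, sensitivity = classify(cases, cutoff)
    print(f"cutoff={cutoff:.3f}  predicted events={predicted}  "
          f"sensitivity={sensitivity:.3f}")
```

With the 0.50 cutoff hardly any case is classified as an event, even though the model is "correct" by construction; lowering the cutoff toward the event rate catches most events at the price of many false positives. In SPSS the cutoff can, as far as I know, be changed with the CUT keyword on the /CRITERIA subcommand of LOGISTIC REGRESSION, which leaves the data and the estimates untouched.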

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Marc
Sent: Wednesday, July 26, 2006 3:39 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Logistic regression, few "respondents", and weighting in SPSS

Dear all, I have a question concerning logistic regression and SPSS.

We are facing a situation where we have very few "respondents" (value 1 in the field to predict) in a logistic regression: only 1700 out of 60000. I think this situation is called "sparsity", isn't it?

When we run a logistic regression, we get a poor fit. As I see it, this is because of the sparse dataset. I was told that one way to solve this kind of problem in LR is to weight the responding cases, to "artificially" raise their representation in the dataset.

I've looked that up in the classical "bibles" of logistic regression (Menard, Lemeshow, Jaccard), but haven't found any discussion of sparsity, or situations with few respondents.

In SPSS, I know there is a weight feature. Does it work with logistic regression? Is it really a "technique" to (artificially) get a better fit?

What modelling techniques are better suited for sparse datasets, in your opinion?

Thank you so much for helping out!
