Date: Thu, 27 Jul 2006 07:57:31 +1000
Reply-To: Jason Burke <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Jason Burke <firstname.lastname@example.org>
Subject: Re: Logistic regression, few "respondents", and weighting in SPSS
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Have you considered splitting the data into train / test partitions, then
combining all of the respondents in your training partition with a
random sample of the non-respondents in the same partition? Fit the
model on that combined sample, then apply it against the test partition.
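
A rough sketch of that partitioning step, in Python rather than SPSS syntax and using only the standard library. The 60000 / 1700 counts come from the original question. The function name, the "responded" field, the 30% test fraction and the 1:1 sampling ratio are my own assumptions for illustration, and the model fit itself is left out.

```python
import random


def split_and_downsample(rows, target_key="responded", test_fraction=0.3,
                         nonrespondents_per_respondent=1.0, seed=0):
    """Split rows into train/test, then downsample non-respondents in train only.

    Hypothetical helper: the names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)

    # Hold out the test partition first; it keeps its natural response rate.
    n_test = int(round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]
    train = shuffled[n_test:]

    # Keep every respondent in the training partition.
    respondents = [r for r in train if r[target_key] == 1]
    nonrespondents = [r for r in train if r[target_key] == 0]

    # Draw a random sample of non-respondents from the same partition.
    n_keep = min(len(nonrespondents),
                 int(round(len(respondents) * nonrespondents_per_respondent)))
    sampled = rng.sample(nonrespondents, n_keep)

    balanced = respondents + sampled
    rng.shuffle(balanced)
    return balanced, test


# Synthetic stand-in for the data described in the thread:
# 60000 cases, 1700 of which have 1 in the dependent variable.
rows = [{"id": i, "responded": 1 if i < 1700 else 0} for i in range(60000)]
train_balanced, test_rows = split_and_downsample(rows)

n_resp_train = sum(r["responded"] for r in train_balanced)
n_resp_test = sum(r["responded"] for r in test_rows)
print(len(train_balanced), n_resp_train, len(test_rows), n_resp_test)
```

Because the test partition is left at its original response rate, the evaluation reflects the real proportions. Note that a model fitted on the downsampled training set will tend to overstate the predicted probabilities unless the intercept is adjusted for the sampling rate; the ranking of cases is not affected by that.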
On 7/26/06, Spousta Jan <JSpousta@csas.cz> wrote:
> Hi Marc,
> >In SPSS, I know there is a weight feature. Does it work with logistic
> regression ?
> Yes, it works well together.
> >Is it really a "technique" to (artificially) have a better fit ?
> Yes, you get a "better" fit, but in a sense it is rather self-deception.
> In reality, the fit is still bad and you cannot rely on the results.
> >What modelling techniques are better suited for sparse datasets, in
> your opinion ?
> SPSS has its exact tests, they are devised for sparse data. Of course,
> they cannot create significant results where there is nothing.
> Moreover, I do not understand your phrase "1700 on 60000" (sorry for my
> bad English). If it means that you have 60,000 respondents and that 1700
> of them have 1 in the dependent variable and the rest have 0, then
> the case is not about sparsity. 1700 is enough for most practical
> purposes and you can use logistic regression without desperation. If its
> result is not significant, then it simply means that your "dependent"
> variable does not depend on the selected predictors.
> Hope this helps
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
> Sent: Wednesday, July 26, 2006 8:39 AM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: Logistic regression, few "respondents", and weighting in SPSS
> Dear all,
> I'm coming with a question concerning logistic regression and SPSS.
> We're in front of a situation here where we have very few "respondents"
> (1 in the field to predict) in a logistic regression. Only 1700 on
> 60000. I think it's a situation called "sparsity", isn't it ?
> When doing a logistic regression, we have a low fit. As I see it, it's
> because of this sparse dataset. I was told that a way to solve that kind
> of problem in LR, is to weight the responding cases, to "artificially"
> raise their representativity in the dataset.
> I've looked that up in the classical "bibles" of logistic regression
> (Menard, Lemeshow, Jaccard), but haven't found any discussion of
> sparsity, or situations with few respondents.
> In SPSS, I know there is a weight feature. Does it work with logistic
> regression ? Is it really a "technique" to (artificially) have a better
> fit ?
> What modelling techniques are better suited for sparse datasets, in your
> opinion ?
> Thank you so much for helping out !