Date: Wed, 27 May 1998 16:26:28 -0300
Reply-To: hmaletta@overnet.com.ar
Sender: "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From: "Hector E. Maletta" <hmaletta@OVERNET.COM.AR>
Subject: Re: Question About Logistic Regression - Help
Content-Type: text/plain; charset=us-ascii
Karen Scheltema wrote:
>
> Does anyone know if this is the case in SPSS?
>
> ____________________Reply Separator____________________
> Subject: Re: Question About Logistic Regression - Help
> Author: <bsc@home.com>
> Date: 5/27/98 10:17 AM
>
> As a further warning about using proc logistic on weighted data, not
> only are the measures of predictive discrimination (i.e. concordance
> statistics) incorrect, but so is the Hosmer-Lemeshow goodness-of-fit
> test. To get the right values for predictive discrimination you must use
> a freq option rather than a weight option.
This is not just the case with logistic regression. SPSS always
computes statistical significance from the weighted data, so it treats
the expanded (weighted) total as if it were the sample size, whereas the
actual sample size is ordinarily much smaller. This yields artificially
good significance levels. Simply using unweighted data is often not an
adequate way around the problem, because cases may have different
sampling probabilities that call for different weights.
In such cases one should use weights that just 'weight' cases
differentially but do not expand the sample size to the size of the
entire population. If n is the sample size, the weighted total should be
n, not N, the estimated size of the population concerned. Ordinary
weighting factors (defined as the reciprocal of sampling probabilities)
perform both functions, weighting and expanding, at one stroke.
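
To illustrate the two functions of ordinary weights, here is a minimal
Python sketch. The sampling probabilities, the group sizes and the
function name are all hypothetical, made up for illustration. It assumes
a statistic whose standard error scales as 1/sqrt(case count), such as a
simple proportion; for such a statistic, taking the expanded total as
the case count shrinks the standard error by a factor of sqrt(N/n).

```python
import math

def expansion_weights(probs):
    """Ordinary weights: the reciprocal of each case's sampling probability."""
    return [1.0 / p for p in probs]

# Hypothetical sample: 60 cases drawn with probability 1/100
# and 40 cases drawn with probability 1/400.
probs = [1 / 100] * 60 + [1 / 400] * 40
weights = expansion_weights(probs)

n = len(weights)      # actual sample size
N_hat = sum(weights)  # expanded total, i.e. the estimated population size

# Factor by which a 1/sqrt(count) standard error shrinks when the
# expanded total is mistaken for the number of cases.
inflation = math.sqrt(N_hat / n)

print(n, round(N_hat), round(inflation, 2))
```

With these made-up figures the weighted total is far larger than the
100 cases actually observed, which is the source of the overly
optimistic significance levels.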
To get modified weights that only give cases differential weight without
expanding, create a new weighting variable by multiplying the old weights
by n/N, where n = sample size and N = estimated population size (the
latter being the weighted total obtained with the original expanding
weights).
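
The rescaling just described can be sketched in Python as follows. This
is only an illustration under stated assumptions: the function names,
the weights and the 0/1 outcome values are hypothetical, and N is taken
to be the sum of the old weights.

```python
def rescale_weights(weights):
    """Multiply each weight by n/N so that the new weights sum to n."""
    n = len(weights)      # sample size
    N_hat = sum(weights)  # estimated population size from the old weights
    factor = n / N_hat
    return [w * factor for w in weights]

def weighted_mean(values, weights):
    """Weighted mean of values; unaffected by a constant rescaling of weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical expanding weights and a made-up 0/1 outcome.
old = [100.0] * 60 + [400.0] * 40
values = [1.0] * 30 + [0.0] * 30 + [1.0] * 10 + [0.0] * 30

new = rescale_weights(old)

print(round(sum(new), 6))  # weighted total now equals the sample size
print(weighted_mean(values, old), weighted_mean(values, new))
```

The relative weight of each case is unchanged, so weighted means and
proportions stay the same; only the weighted case count drops from N to
n.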
Of course, this is not completely orthodox (Mr Nichols will explain to
you that SPSS regards all samples as simple random samples), but in my
view it is safe enough, though I'm eager to hear other opinions.
Hector Maletta
Universidad del Salvador
Buenos Aires, Argentina