Date: Fri, 23 Sep 2011 14:58:06 -0500
Reply-To: "Swank, Paul R" <Paul.R.Swank@uth.tmc.edu>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Swank, Paul R" <Paul.R.Swank@uth.tmc.edu>
Subject: Re: What am I losing using a logistic regression
In-Reply-To: <BLU143W32A1353D503C5835AF9B44970F0@phx.gbl>
Content-Type: multipart/alternative;
How about a negative binomial distribution or perhaps a zero inflated negative binomial if the number of zero responses is too large?
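A quick informal way to see whether a zero-inflated model is worth considering is to compare the observed share of zeros with the share a plain Poisson at the same mean would predict. The sketch below uses hypothetical, illustrative counts (not data from this thread); under Poisson(lambda), P(0) = exp(-lambda).

```python
import math

# Hypothetical count data (illustrative only): mostly zeros, a long right tail.
counts = ([0] * 60 + [1] * 12 + [2] * 8 + [3] * 5 + [5] * 4 +
          [8] * 3 + [12] * 3 + [20] * 2 + [31] * 2 + [40] * 1)

n = len(counts)
mean = sum(counts) / n

# Under a plain Poisson(mean), the expected proportion of zeros is exp(-mean).
expected_zero = math.exp(-mean)
observed_zero = counts.count(0) / n

print(f"mean = {mean:.2f}")
print(f"observed P(0) = {observed_zero:.2f}, Poisson-implied P(0) = {expected_zero:.2f}")
# An observed zero share far above the Poisson baseline is the usual informal
# signal that a zero-inflated or hurdle model may fit better.
```

This is only a screening heuristic; a formal comparison (e.g. Vuong test, or AIC between the NB and ZINB fits) would be done in the modeling software itself.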
Dr. Paul R. Swank,
Children's Learning Institute
Professor, Department of Pediatrics, Medical School
Adjunct Professor, School of Public Health
University of Texas Health Science Center-Houston
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Rich Ulrich
Sent: Friday, September 23, 2011 1:02 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: What am I losing using a logistic regression
I agree with Hector, that zero is very often important, and it makes
sense to at least consider taking it alone, "none" versus "some".
And also, that it is wasteful to dichotomize. However, "mostly-zero"
with scores running to 40 is not a very likely Poisson.
And that reminds me that sometimes there is a reasonable distribution
for the rest, once zero is excluded. Does the density decrease as
scores increase, or is there some other shape to what is left?
Given a variable that is merely highly skewed, it is my tendency to
look for a reasonable transformation that yields something close to
equal-intervals in the latent quality being assessed. Is zero
reasonable as a step below 1, or is there something special about
zero?
It could be better to use a second variable to describe nonlinearity.
In this case, the simple procedure might be this: do one analysis
for none/some and a second analysis that *excludes* the data with
zero, and uses either the 1-40 score, or a transformation of it.
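The two-part procedure above can be sketched in a few lines. The scores below are hypothetical stand-ins for the kind of mostly-zero variable being discussed: part 1 builds the none/some indicator (the outcome for a logistic regression), part 2 keeps only the positive scores and applies a log transform as one possible way to pull in the skew.

```python
import math

# Illustrative scores (hypothetical), mostly zero with a tail running to 40.
scores = [0, 0, 0, 0, 1, 0, 3, 0, 0, 7, 0, 2, 0, 15, 0, 0, 40, 5, 0, 1]

# Part 1: none vs. some -- a binary outcome, e.g. for logistic regression.
any_score = [1 if s > 0 else 0 for s in scores]

# Part 2: only the positive scores, log-transformed to reduce the skew
# (log is safe here because zero has already been set aside in part 1).
positives = [math.log(s) for s in scores if s > 0]

print(f"{sum(any_score)} of {len(scores)} cases are nonzero")
print(f"positive-part range after log: {min(positives):.2f} to {max(positives):.2f}")
```

The same split is what a hurdle model does in one step; doing it as two explicit analyses just makes each piece easy to inspect separately.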

Rich Ulrich
________________________________
Date: Fri, 23 Sep 2011 13:46:50 -0300
From: hmaletta@fibertel.com.ar
Subject: Re: What am I losing using a logistic regression
To: SPSSX-L@LISTSERV.UGA.EDU
Dichotomizing necessarily involves losing information. Now in your case what you appear to have is a sort of Poisson distribution, where the most frequent event is zero, then rapidly decreasing numbers in the range 1-5, and even fewer at higher values. Thus you may want to use Poisson regression.
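The shape Hector describes (mass concentrated at zero, then dropping quickly over 1-5) is exactly what a Poisson with a small rate produces. A minimal sketch, with a hypothetical rate chosen so that zero dominates:

```python
import math

def poisson_pmf(k, lam):
    # P(X = k) for a Poisson distribution with rate lam.
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 0.8  # hypothetical rate, chosen only so that zero is the modal value
probs = [poisson_pmf(k, lam) for k in range(6)]
print([round(p, 3) for p in probs])
# The mass is largest at 0 and falls off rapidly over 1..5.
```

Note Rich's caveat upthread still applies: a Poisson with a rate small enough to make zero this common puts essentially no mass anywhere near 40, which is why the negative binomial or a zero-inflated variant may fit better.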
On the other hand, if you must dichotomize, why not dichotomize at "zero" versus "1 or more"? That seems more reasonable to me, without knowing the actual content of your research. The value 5 does not seem to have any intrinsic characteristic to make it the critical value, especially because most of those below 5 are actually zero.
Hector
[snip, previous]
