```
Date: Fri, 23 Sep 2011 14:58:06 -0500
Reply-To: "Swank, Paul R"
Sender: "SPSSX(r) Discussion"
From: "Swank, Paul R"
Subject: Re: What am I losing using a logistic regression
In-Reply-To:
Content-Type: multipart/alternative;

How about a negative binomial distribution or perhaps a zero inflated negative binomial if the number of zero responses is too large?

Dr. Paul R. Swank,
Children's Learning Institute
Professor, Department of Pediatrics, Medical School
Adjunct Professor, School of Public Health
University of Texas Health Science Center-Houston

From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Rich Ulrich
Sent: Friday, September 23, 2011 1:02 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: What am I losing using a logistic regression

I agree with Hector, that zero is very often important, and it makes sense to at least consider taking it alone, "none" versus "some". And also, that it is wasteful to dichotomize.

However, "mostly-zero" with scores running to 40 is not a very likely Poisson. And that reminds me that sometimes there is a reasonable distribution for the rest, once zero is excluded. Does the density decrease as scores increase, or is there some other shape to what is left?

Given a variable that is merely highly skewed, it is my tendency to look for a reasonable transformation that yields something close to equal-intervals in the latent quality being assessed. Is zero reasonable as a step below 1, or is there something special about zero? It could be better to use a second variable to describe non-linearity.

In this case, the simple procedure might be this -- to do one analysis for none/some and a second analysis that *excludes* the data with zero, and uses either the 1-40 score, or a transformation of it.

-- Rich Ulrich

________________________________
Date: Fri, 23 Sep 2011 13:46:50 -0300
From: hmaletta@fibertel.com.ar
Subject: Re: What am I losing using a logistic regression
To: SPSSX-L@LISTSERV.UGA.EDU

Dichotomizing necessarily involves losing information. Now in your case what you appear to have is a sort of Poisson distribution, where the most frequent event is zero, then rapidly decreasing numbers in the range 1-5, and even fewer at higher values. Thus you may want to use Poisson regression.

On the other hand, if you must dichotomize, why not dichotomize at "zero" and "1 or more"? Seems more reasonable to me, without knowing the actual content of your research. The value 5 does not seem to have any intrinsic characteristic to make it the critical value, especially because most of those below 5 are actually zero.

Hector

[snip, previous]
```
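
The two suggestions in the thread (checking whether a Poisson is plausible, then analysing none/some separately from the non-zero scores) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `scores` data are made up to resemble the "mostly zero, running to 40" description, the function names `poisson_check` and `two_part_split` are hypothetical, and the log transform is just one candidate for the "transformation" Rich mentions, not something the posters specified.

```python
import math
from statistics import mean, variance


def poisson_check(counts):
    """Compare the data with two basic Poisson properties.

    Under a Poisson model the variance equals the mean, and the
    expected share of zeros is exp(-mean).
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        "dispersion": v / m,             # about 1 if Poisson is plausible
        "observed_zero": observed_zero,
        "expected_zero": math.exp(-m),   # zero share implied by Poisson
    }


def two_part_split(counts):
    """Build the two outcomes for a two-part analysis.

    Part 1: a none/some indicator for every case.
    Part 2: the log of the score, for non-zero cases only.
    """
    any_response = [1 if c > 0 else 0 for c in counts]
    positive_log = [math.log(c) for c in counts if c > 0]
    return any_response, positive_log


# Hypothetical data: 100 cases, mostly zero, a few scores up to 40.
scores = [0] * 70 + [1] * 10 + [2] * 6 + [3] * 4 + [5] * 3 + [8] * 3 + [15] * 2 + [40] * 2

summary = poisson_check(scores)
any_response, positive_log = two_part_split(scores)

print(summary)
print(sum(any_response), "non-zero cases of", len(any_response))
```

A dispersion ratio far above 1, together with many more zeros than `exp(-mean)` predicts, points toward the negative binomial or zero-inflated models Paul Swank suggests. The none/some indicator would then go into a logistic regression, and the transformed non-zero scores into a separate model.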
