Dale also raises some good points (as usual).
I've interspersed some comments:
> However, practical significance can be very
>difficult to assess.
>
Indeed. But I think dealing with that difficulty is part
of defining a problem well; if you don't know what results
would be important, then you haven't really defined your problem.
Of course, results aren't simply "important" or "not important"; they
vary along a continuum. Still, I think this is a useful exercise.
>Now, suppose that we compare ZIP and negative binomial
>models. The ZIP might fit the zero response probability better
>than does the NB distribution. However, the NB distribution
>may perform better in the tail area than does the ZIP. Which
>distribution is better may not have a clear answer based only
>on visual examination of the estimated probabilities compared
>to the observed probabilities.
>
This is certainly true. But if we find that this is the case
(fit varies by response), then we have more substantive thinking
to do: perhaps neither the NB nor the ZIP is correct? Or
perhaps one type of response is more important?
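To make the "fit varies by response" idea concrete, here is a minimal sketch that puts observed proportions next to ZIP and NB probabilities cell by cell, with the tail pooled into one cell. Everything specific in it is an assumption for illustration: the toy counts, the parameter values (which are picked by hand, not fitted), and the helper names `zip_pmf`, `nb_pmf`, `cell_probs` and `observed_props`. The NB is written in the mean/dispersion form, with variance mu + alpha * mu**2.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: extra mass pi at zero, Poisson(lam) otherwise."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def nb_pmf(k, mu, alpha):
    """Negative binomial with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha
    p = r / (r + mu)
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def cell_probs(pmf, tail_start):
    """Probabilities for 0..tail_start-1 plus one pooled cell for >= tail_start."""
    probs = [pmf(k) for k in range(tail_start)]
    probs.append(max(0.0, 1.0 - sum(probs)))
    return probs

def observed_props(counts, tail_start):
    """Observed proportions in the same cells as cell_probs."""
    cells = [0] * (tail_start + 1)
    for y in counts:
        cells[min(y, tail_start)] += 1
    return [c / len(counts) for c in cells]

# Hypothetical toy counts and hypothetical (hand-picked, NOT fitted) parameters.
counts = [0] * 45 + [1] * 15 + [2] * 14 + [3] * 10 + [4] * 7 + [5] * 4 + [7] * 3 + [10] * 2
TAIL = 6
obs = observed_props(counts, TAIL)
zip_fit = cell_probs(lambda k: zip_pmf(k, 0.35, 2.4), TAIL)
nb_fit = cell_probs(lambda k: nb_pmf(k, 1.6, 1.2), TAIL)

for i, (o, z, nb) in enumerate(zip(obs, zip_fit, nb_fit)):
    label = f">={TAIL}" if i == TAIL else str(i)
    print(f"{label:>4}  obs={o:.3f}  ZIP={z:.3f} ({z - o:+.3f})  NB={nb:.3f} ({nb - o:+.3f})")
```

Looking at the signed differences by cell (zero cell versus the pooled tail) is one way to see where each distribution does well or badly, before deciding which cells matter most substantively.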
>It may be necessary to construct a statistical test which lays
>out explicit criteria for assessing model fit. Likelihood
>ratio tests can be employed to compare nested distributions.
>For instance, the Poisson model is nested within both the ZIP
>and negative binomial models. Likelihood ratio tests can be
>employed to assess whether the unconstrained parameters in the
>ZIP and negative binomial models result in improved fit over
>a Poisson model which constrains parameters. (Note that the
>parameters which are constrained in going from a ZIP to a
>Poisson model are different from the parameters which are
>constrained to go from a NB to a Poisson dist.) Likewise, a
>likelihood ratio test can be employed to test whether a ZINB
>model performs better than does a simple NB model or a ZIP
>model.
>
Indeed, all these tests can be done. But I worry about them.
What assumptions do they rely on? I believe (but correct me if I
am wrong) that these tests are only asymptotically correct. But where is
the asymptote? N = 100? 500? 1000?
Perhaps people who work in fields where huge N is usual needn't worry;
I don't work in such fields myself (at least, not often).
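One rough way to look for the asymptote is simulation: generate data from the simpler model at a given N and see how often the likelihood ratio test rejects at the nominal level. The sketch below does this for the simplest case I could write down, an intercept-only ZIP against an intercept-only Poisson, with the ZIP mixing proportion constrained to be non-negative. The function names, the true mean of 2, the sample sizes and the number of replications are all assumptions chosen for illustration. One caveat: under the Poisson null the mixing proportion sits on the boundary of its parameter space, which is one reason the usual chi-square(1) reference need not hold here, so a rejection rate away from 0.05 would not be surprising.

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def truncated_lambda(mean_pos):
    """Solve lam / (1 - exp(-lam)) = mean_pos by bisection (needs mean_pos > 1)."""
    lo, hi = 1e-9, mean_pos
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / -math.expm1(-mid) < mean_pos:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lr_zip_vs_poisson(y):
    """LR statistic for intercept-only ZIP against Poisson, with pi >= 0."""
    n = len(y)
    pos = [v for v in y if v > 0]
    n0 = n - len(pos)
    if n0 == 0 or not pos or sum(pos) == len(pos):
        return 0.0  # no excess zeros can be estimated from such a sample
    lam0 = sum(y) / n                          # Poisson MLE
    lam1 = truncated_lambda(sum(pos) / len(pos))  # ZIP rate from positive counts
    p0 = n0 / n
    pi = (p0 - math.exp(-lam1)) / -math.expm1(-lam1)
    if pi <= 0.0:
        return 0.0  # constrained ZIP estimate collapses to the Poisson fit
    # The log(y!) terms are identical in both models and cancel in the difference.
    ll_pois = sum(y) * math.log(lam0) - n * lam0
    ll_zip = (n0 * math.log(p0) + len(pos) * (math.log(1.0 - pi) - lam1)
              + sum(pos) * math.log(lam1))
    return max(0.0, 2.0 * (ll_zip - ll_pois))

def rejection_rate(n, lam, reps, seed, crit=3.841):
    """Share of Poisson samples whose LR exceeds the chi-square(1) 5% cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = [poisson_draw(lam, rng) for _ in range(n)]
        if lr_zip_vs_poisson(y) > crit:
            hits += 1
    return hits / reps

# Hypothetical settings: true model is Poisson with mean 2.
for n in (25, 100, 500):
    print(n, rejection_rate(n, lam=2.0, reps=1000, seed=12345))
```

The same idea extends to NB versus Poisson or ZINB versus NB, but those need numerical maximization, so they are left out of this sketch.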
>Of course, there is a certain bit of arbitrariness to tests
>of statistical significance. There is nothing magic about a
>p-value of 0.05 which is so often employed to declare significance.
>However, I do believe that there is less which is arbitrary in
>a test of statistical significance than there is in visual
>examination of probability plots.
>
I'd state this a little differently: I'd say that the arbitrariness
in a statistical test is precisely defined, whereas the arbitrariness
in visual inspection is much murkier.
Dale's final point was about COUNTREG, which I definitely need to explore.
Interesting discussion!
Peter
Peter L. Flom, PhD
Statistical Consultant
www DOT peterflomconsulting DOT com