| Date: | Fri, 25 Jul 2003 14:31:58 +1000 |
| Reply-To: | paulandpen@optusnet.com.au |
| Sender: | "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU> |
| From: | Paul Dickson <paulandpen@optusnet.com.au> |
| Subject: | Re: What does "over-fitting" mean? |
Hi there Christina,
I agree with the comments made (and yes, they were a little harsh in my opinion,
but unfortunately true for stats-oriented journals and their readers). Readers with
enough stats knowledge could hammer you for using this type of analysis with this
sample size, provided they have the time and could be bothered (I would not, but
some others out there might). I would therefore recommend falling back on
descriptives, rethinking where you publish by aiming for practitioner/non-stats
journals, and going from there.
I would also look at previously published research in the same area that has
used any of the same variables that you have, and compare your findings on a
variable-by-variable basis (you don't have the sample size to develop a
multivariate model from your data). Use tables (means, medians, proportions,
etc.) of your results and other people's to comment on the findings:
similarities, differences, possible reasons for them, and how this could
contribute to the field as a whole (recommendations for follow-up and future
funding should also be factored in here). This broad information could
help you provide useful clinical/theoretical information and make comparisons
across studies that, while not statistically meaningful, are still incredibly
meaningful to practitioners and to the theory in a broad sense. Also see if there
are norms etc. for your data, and look for national data (incidence/prevalence)
that you could use to contextualise your findings a little more.
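To put rough numbers on the sample-size point, here is a minimal sketch of the
events-per-variable arithmetic from Steve's reply quoted below, using the figures
from this thread (73 cases, 22 events, 14 variables screened). The function names
are made up for illustration, and the 10 and 15 targets are rules of thumb, not
hard limits.

```python
def events_per_variable(n_events, n_screened):
    """Events of interest per candidate predictor that was screened."""
    return n_events / n_screened


def required_sample(n_screened, epv_target, n_total, n_events):
    """Events and total cases needed to reach a target events-per-variable
    ratio, assuming the observed event rate (n_events / n_total) holds."""
    events_needed = n_screened * epv_target
    # Integer ceiling of events_needed * n_total / n_events
    total_needed = -(-events_needed * n_total // n_events)
    return events_needed, total_needed


# Figures from this thread
n_total, n_events, n_screened = 73, 22, 14

print(round(events_per_variable(n_events, n_screened), 1))  # 1.6

for target in (10, 15):  # rule-of-thumb targets
    print(target, required_sample(n_screened, target, n_total, n_events))
```

Rounding up gives about 465 and 697 total cases, in line with the 464 to 697
range in Steve's reply.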
Finally, I am convinced there is plenty of crap published out there that makes little
contribution to any real theory or clinical/practical meaning (no effect size
reported) simply because the sample size used in the analysis was so big that
even tiny differences were picked up statistically (if only we all had data sets like
this). All significant stats really do at the end of the day is substantiate the
analysis statistically (according to commonly agreed and valid rules of thumb,
which vary so much that sometimes they are confusing); they say nothing
broader about the meaning of the results from a theoretical and clinical point of
view. That final comment may also be a bit harsh, but I wonder how "true" it is!
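As a rough illustration of the large-sample point, the sketch below takes one
tiny standardized difference and computes its two-sided p-value at two sample
sizes. It assumes a simple two-sample z-test with equal group sizes and a known
common SD. The effect size of 0.02 and the group sizes are made-up values, not
figures from any study.

```python
import math


def two_sided_p_from_d(d, n_per_group):
    """Two-sided p-value of a two-sample z-test for a standardized mean
    difference d, assuming equal group sizes and a known common SD."""
    z = abs(d) * math.sqrt(n_per_group / 2)
    # 2 * (1 - Phi(z)) written with the complementary error function
    return math.erfc(z / math.sqrt(2))


d = 0.02  # hypothetical, very small effect size (Cohen's d)

for n in (50, 100_000):  # hypothetical group sizes
    print(n, two_sided_p_from_d(d, n))
```

The effect size is the same in both runs; only the p-value changes. It is far
from significant at 50 per group and well below 0.001 at 100,000 per group.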
Lastly, I hope one day we find ways to model our important groupings of
variables on very, very small sample sizes and still produce meaningful results!
Cheers Paul
> Christina Cutshaw <ccutsha1@jhem.jhmi.edu> wrote:
>
> Steve,
>
> Thank you for your comments. I will evaluate my options in light of
> your suggestions.
>
> Best,
>
> Chris
>
> "Simon, Steve, PhD" wrote:
>
> > Chris Cutshaw writes:
> >
> > > I am conducting binary logistic regression analyses with a
> > > sample size of 73, of which 22 have the outcome of interest (e.g. are
> > > "very successful" versus somewhat/not very successful). I have
> > > fourteen variables of interest which I examined in a univariate
> > > logistic regression with the dependent variable. Of these
> > > fourteen, six have a likelihood-ratio chi-square of p<0.25. Hosmer &
> > > Lemeshow suggest that all variables with a p<0.25 be examined in the
> > > multivariable modeling. I have heard that there should be about 10 cases
> > > with the outcome of interest per independent variable to avoid
> > > "overfitting."
> > >
> > > 1) Does this mean my final model should contain no more than 2
> > > variables? 2) Can I look at all six variables using a forward
> > > stepwise procedure, for example, as long as the final model has only
> > > two or three variables? Or should I create several different
> > > two- or three-variable models and see which combinations yield
> > > significant results and compare them in some way?
> > >
> > > What does "overfitting" actually mean?
> >
> > I apologize if some of the comments here appear harsh. You are going to
> > have to seriously lower your expectations. That may be disheartening,
> > but better to face the bad news now rather than later.
> >
> > Overfitting means that some of the relationships that appear
> > statistically significant are actually just noise. You will find that a
> > model with overfitting does not replicate well and does a lousy job of
> > predicting future responses.
> >
> > The rule of 10 observations per variable (I've also heard 15) is
> > referring to the number of variables screened, not the number in the
> > final model. Since you looked at 14 variables, you really needed 140 to
> > 210 events of interest (equivalent to 464 to 697 total observations) to
> > be sure that your model is not overfitting the data.
> >
> > What to do, what to do?
> >
> > If you are trying to publish these results, you have to hope that the
> > reviewers are all asleep at the switch. Instead of a ratio of 10 or 15
> > to one, your ratio is 1.6 to one. All 14 variables are part of the
> > initial screen, so you can't say that you only looked at six variables.
> >
> > Of course, you were unfortunate enough to have the IRB asleep at the
> > switch, because they should never have approved such an ambitious data
> > analysis on such a skimpy data set. So maybe the reviewers will be the
> > same way.
> >
> > I wouldn't count on it, though. If you want to improve your chances of
> > publishing the results, there are several things you can do.
> >
> > First, I realize that the answer is almost always "NO" but I still have
> > to ask--is there any possibility that you could collect more data? In
> > theory, collecting more data after the study has ended is a protocol
> > deviation (be sure to tell your IRB). And there is some possibility of
> > temporal trends that might interfere with your logistic model. But both
> > of these "sins" are less serious than overfitting your data.
> >
> > Second, you could slap the "exploratory" label on your research. Put in
> > a lot of qualifiers like "Although these results are intriguing, the
> > small sample size means that these results may not replicate well with a
> > larger data set." This is a cop-out in my opinion. I've fallen back on
> > this when I've seen ratios of four to one or three to one, but you don't
> > even come close to those ratios.
> >
> > Third, ask a colleague who has not looked at the data to help. Show
> > him/her the list of 14 independent variables and ask which two should be
> > the highest priority, based on biological mechanisms, knowledge of
> > previous research, intuition, etc., BUT NOT LOOKING AT THE EXISTING
> > DATA. Then do a serious logistic regression model with those two
> > variables, and treat the other twelve variables in a purely exploratory
> > mode.
> >
> > Fourth, admit to yourself that you are trying to squeeze blood from a
> > turnip. A sample of 73 with only 22 events of interest is just not big
> > enough to allow for a decent multivariable logistic regression model.
> > You can't look for the effect of A, adjusted for B, C, and D, so don't
> > even try. Report each individual univariate logistic regression model
> > and leave it at that.
> >
> > Fifth (and most radical of all), give up all thoughts of logistic
> > regression and p-values altogether. Who made a rule that says that every
> > research publication has to have p-values? Submit a publication with a
> > graphical summary of your data. Boxplots and/or bar charts would work
> > very nicely here. Explain that your data set is too small to entertain
> > any serious logistic regression models. If you're unlucky, then the
> > reviewers may ask you to put in some p-values anyway. Then you could
> > switch to the previous option.
> >
> > Sixth, there are some newer approaches to statistical modeling that are
> > less prone to overfitting. Perhaps the one you are most likely to see is
> > CART (Classification and Regression Trees). These models can't make a
> > silk purse out of a sow's ear, but they do have some cross-validation
> > checks that make them slightly better than stepwise approaches.
> >
> > If you asked people on this list how many of them have published results
> > when they knew that the sample size was way too small, almost every hand
> > would go up, I suspect. I've done it more times than I want to admit.
> > Just be sure to scale back your expectations, limit the complexity of
> > any models, and be honest about the limitations of your sample size.
> >
> > Good luck!
> >
> > Steve Simon, ssimon@cmh.edu, Standard Disclaimer.
> > The STATS web page has moved to
> > http://www.childrens-mercy.org/stats.
> >
> > P.S. I've adapted this question for one of my web pages. Take a look at
> >
> > http://www.childrens-mercy.org/stats/model/overfit.asp