Date: Fri, 2 Sep 2005 09:12:43 -0400
Reply-To: Peter Flom <flom@NDRI.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Flom <flom@NDRI.ORG>
Subject: Re: On Hosmer-Lemeshow, etc. and Model Selection
Content-Type: text/plain; charset=US-ASCII
Bora et al.
I've snipped and answered what I can......lots of e-mails on this. I
can't wait to see what the
stats-gurus make of this (I know David and Dale are both on the West
Me, I am no guru.
[BY: Yes, we categorise continuous variables too. What are the
Increased type 2 error, sometimes also increased type 1 error, less
Think of it this way. Suppose you are trying to predict heart attacks.
One of your IVs is going to be age.
If you categorize it into, say,
< 18, 18-24, 25-34, ..., 65-74, 75+
then you are saying that the risk of heart attack for a 55-year-old is
the same as for a 64-year-old, but that this risk changes at age 65 and
then stays constant to age 74...
SOMETIMES categorizing makes sense - but rarely.
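
To make the step-function point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the risk curve is made up (a logistic function of age, not fitted to any data), and the bin edges are chosen only so that 55-64 and 65-74 fall into separate categories, as in the example above.

```python
import math

# Hypothetical lower bounds of the age categories (assumption, not from any data set).
BIN_EDGES = [18, 25, 35, 45, 55, 65, 75]
AGES = range(0, 100)  # integer ages used to average within each category


def true_risk(age):
    """Made-up smooth risk curve: logistic in age. Illustration only."""
    return 1.0 / (1.0 + math.exp(-(age - 70) / 10.0))


def age_bin(age):
    """Index of the age category: 0 for < 18, ..., 7 for 75+."""
    return sum(age >= edge for edge in BIN_EDGES)


# A categorized model can only assign one value per category;
# here that value is the mean of the smooth curve over the category.
BIN_MEAN = {}
for b in range(len(BIN_EDGES) + 1):
    members = [a for a in AGES if age_bin(a) == b]
    BIN_MEAN[b] = sum(true_risk(a) for a in members) / len(members)


def binned_risk(age):
    """Risk implied by the categorized version of age."""
    return BIN_MEAN[age_bin(age)]


if __name__ == "__main__":
    for age in (55, 64, 65, 74):
        print(age, round(true_risk(age), 3), round(binned_risk(age), 3))
```

Under these assumptions the smooth curve gives a 64-year-old a higher risk than a 55-year-old, while the categorized version gives them identical risk and then jumps between 64 and 65.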
[BY: I've used "bagging" (bootstrap aggregating) at times and made use of
information criteria (AIC, etc.) for model selection. But, apparently,
there does not exist a coherent methodology for selecting the "best" model,
and a wide range of conflicting practices and approaches exist.]
True. But, from what I can see, they are debating the minutiae, and any
of the approaches may yield some valuable insight.
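
For readers unfamiliar with the information criteria mentioned above, a small Python sketch of AIC, defined as 2k - 2 ln(L) where k is the number of estimated parameters and L the maximized likelihood. The two log-likelihoods and parameter counts below are hypothetical numbers chosen for illustration, not results from any real model.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2*ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates (made-up values, for illustration only):
# a small model and a larger one that fits slightly better.
small_model = aic(log_likelihood=-120.0, n_params=3)
large_model = aic(log_likelihood=-118.5, n_params=6)

if __name__ == "__main__":
    print("small:", small_model, "large:", large_model)
```

With these made-up numbers the larger model's gain in fit does not offset its three extra parameters, so AIC would favor the smaller model. Other criteria penalize parameters differently and may rank candidates differently.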
Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)