Date: Thu, 1 Oct 2009 18:32:14 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Model Comparison using AIC
In-Reply-To: <81F8139F381BE844AE05CA6525FF2AAE129B60@tpwd-mx9.tpwd.state.tx.us>
Content-Type: text/plain; charset="us-ascii"
As often happens, a basic idea in one field has a long history and rich literature in another field. The idea of model averaging has a literature in machine learning, data mining, and predictive analytics that dates back a couple of decades at least. See
<http://machine-learning.martinsewell.com/ensembles/ensemble-learning.pdf>
for a very extensive summary.
An explanatory model should, I'd say, fit well to outcomes within a sample. Assuming only one sample, a less complex model with only a slightly worse AIC may help focus an analysis on a critical question.
A predictive model should at least predict accurately in a sample and in any other samples that may be available (including samples whose outcomes are unknown at the time of model development). For that, the ROC AUC, perhaps with a loss function, would seem a better measure of predictive accuracy. The many "model fusion" methods tend to improve predictive accuracy, though much debate centers on why and under what conditions.
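To make the ROC AUC idea concrete, here is a minimal sketch in Python (not SAS). It computes the AUC as the share of positive/negative pairs in which the positive case gets the higher score, with ties counted as one half. The function name roc_auc and all labels and scores are made up for illustration; nothing here comes from the original poster's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 0/1 outcomes; scores: predicted probabilities or any ranking score.
    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative outcome")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical scores from one model, evaluated on two samples.
dev_labels = [0, 0, 1, 1]
dev_scores = [0.1, 0.4, 0.35, 0.8]
holdout_labels = [1, 0, 1, 0, 0]
holdout_scores = [0.9, 0.2, 0.3, 0.6, 0.4]

auc_dev = roc_auc(dev_labels, dev_scores)
auc_holdout = roc_auc(holdout_labels, holdout_scores)
print(auc_dev, auc_holdout)
```

Comparing the two numbers is the point: a model that ranks well in the development sample but noticeably worse in a holdout sample is not predicting accurately in the sense described above.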
I wouldn't rely solely on any one measure of predictive accuracy, or of model fit for that matter. Averaging to improve a single accuracy or fit criterion leads us into the same kind of trap as the infamous step-wise method(s).
S
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Warren Schlechte
Sent: Thursday, October 01, 2009 1:21 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Model Comparison using AIC
I agree that usually, we consider model averaging over a single sample. But the poster's question appeared to suggest a cross-validation of the model. I wondered if combining the two might lead to more robust parameter estimation.
It was just a thought.
Warren
-----Original Message-----
From: Peter Flom [mailto:peterflomconsulting@mindspring.com]
Sent: Thu 10/1/2009 11:45 AM
To: Warren Schlechte; SAS-L@LISTSERV.UGA.EDU
Subject: Re: Model Comparison using AIC
Warren Schlechte <Warren.Schlechte@TPWD.STATE.TX.US> wrote
>Using different samples suggests a cross-validation approach. Model
>averaging seems to be what is desired, and based on my reading, Burnham and
>Anderson (1998) use an Akaike weighting within the model averaging to
>get estimates of parameters.
>
>So, what I guess I'm saying is maybe the AIC can be used within a model
>averaging realm, which seems to be what is going on here.
>
>Obviously, I await responses from Dale and others.
>
Burnham and Anderson average various models on the same sample, not various samples
on the same model.
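For readers who have not seen the Akaike weighting referred to here, a minimal sketch in Python (not SAS) follows. Each model fitted to the same sample gets weight exp(-delta/2), where delta is its AIC minus the smallest AIC, and the weights are normalized to sum to one. The AIC values and slope estimates are invented for illustration, and the averaging step assumes the parameter appears in every candidate model.

```python
import math


def akaike_weights(aic_values):
    """Akaike weights for models fitted to the SAME sample."""
    best = min(aic_values)
    # Relative likelihood of each model given the data.
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


def model_averaged_estimate(estimates, weights):
    """Weighted average of one parameter's estimates across models."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical AICs and estimates of one slope from three candidate models.
aic = [100.0, 102.0, 110.0]
slope = [1.20, 1.05, 0.70]

weights = akaike_weights(aic)
averaged = model_averaged_estimate(slope, weights)
print(weights, averaged)
```

Because the weights depend only on AIC differences, they are meaningful only when every model is fitted to the same data, which is why the scheme does not carry over directly to one model fitted to several samples.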
If you have several samples, then why not combine them to get greater power? You could, if
desired, include SAMPLE as covariate.
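As a rough illustration of pooling with SAMPLE as a covariate, here is a small Python sketch (not SAS) for the simplest case: one predictor and a separate intercept per sample. In that case the pooled slope from a regression with sample indicators equals the slope obtained after centering x and y within each sample. The data and function names are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Slope from a simple regression of y on x, ignoring SAMPLE."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_with_sample_covariate(samples):
    """Pooled slope of y on x with a separate intercept for each sample.

    samples: dict mapping sample name -> (x values, y values).
    Centering within sample is equivalent to adding sample indicators.
    """
    xc, yc = [], []
    for x, y in samples.values():
        mx, my = mean(x), mean(y)
        xc += [a - mx for a in x]
        yc += [b - my for b in y]
    return sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)


# Hypothetical data: same slope in both samples, different intercepts.
samples = {
    "A": ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    "B": ([4.0, 5.0, 6.0], [5.0, 7.0, 9.0]),
}
all_x = [a for x, _ in samples.values() for a in x]
all_y = [b for _, y in samples.values() for b in y]

naive = ols_slope(all_x, all_y)                   # ignores SAMPLE
adjusted = slope_with_sample_covariate(samples)   # SAMPLE as covariate
print(naive, adjusted)
```

In this made-up example the naive pooled slope is pulled away from the common within-sample slope by the difference in intercepts, while the adjusted slope recovers it.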
Peter
Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom