LISTSERV at the University of Georgia
Date:         Wed, 9 Jan 2008 09:33:28 -0800
Reply-To:     Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:      Re: AIC mystery in MIXED
In-Reply-To:  <200801091554.m09Bl6BV027598@malibu.cc.uga.edu>
Content-Type: text/plain; charset=iso-8859-1

--- Ryan Utz <rutz@AL.UMCES.EDU> wrote:

> Hi all,
>
> I'm having issues using/interpreting AIC scores in proc MIXED. I'm
> trying to compare simple linear relationships with power function
> relationships (both models have been shown to be consistently valid
> in related datasets). When I go to interpret AIC (or AICc, etc)
> scores, however, power relationships always emerge as the better
> model, even when it clearly isn't the case. As an example, I provided
> my actual data for an extremely simple model at the bottom of this
> email (I'm testing much more complex models, but the example below
> illustrates the problem). To test the power relationship, I've
> log-transformed both X and Y. Running the code below shows that MIXED
> suggests the power relationship is better (it has a lower AIC score),
> but if you run a simple linear regression, clearly the
> non-transformed data (thus a linear relationship) is superior. This
> is true even when both models have the exact same number of
> parameters.
>
> Is there something I'm doing wrong here, either in execution or
> interpretation? I'd like to use AIC scores to help choose a model,
> but because of this issue I'm very hesitant.
>
> Thanks ahead of time for any advice,
>
> Ryan Utz
> University of Maryland Center for Environmental Science
>
>
> data test;
> input density length; cards;
> 0.099266504 82.8125
> 0.048193642 85.05405405
> 0.114893617 84.34210526
> 0.257685811 70.515625
> 0.044660194 86.92857143
> 0.244736842 76.37647059
> 0.020619946 89.5
> 0.058555133 93.6
> 0.125817923 84.08888889
>
> data test2; set test;
> lndensity = log(density);
> lnlength= log (length); run;
>
> title Linear Relationship;
> proc mixed data=test2;
> model length=density; run;
>
> title Power Relationship;
> proc mixed data=test2;
> model lnlength=lndensity; run;
>
> /*Simple regression for comparison*/
>
> Title Linear relationship-simple regression;
> proc glm data=test2;
> model length=density; run;
>
> Title Linear relationship-Power function;
> proc glm data=test2;
> model lnlength=lndensity; run;
>

Ryan,

A couple of comments. AIC can be used to compare models only when the same response is employed. In these data, you are using first length and then lnlength as response variables. You just cannot use a likelihood-based statistic to compare across different response variables.
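
One way to see why the two AIC values sit on different scales is a change-of-variables sketch. It assumes maximum likelihood fits and writes y_i for length, z_i = ln(y_i) for lnlength, and k for the number of parameters; these symbols are introduced here for illustration only. The density of y differs from the density of z by the factor 1/y, so the likelihood that the lnlength model implies for the original lengths is not the likelihood that PROC MIXED reports for lnlength:

```latex
\begin{align*}
f_Y(y_i) &= f_Z(\ln y_i)\,\frac{1}{y_i}, \\
\log L_Y &= \log L_Z - \sum_{i=1}^{n} \ln y_i, \\
\mathrm{AIC}_Y &= -2\log L_Y + 2k
               = \mathrm{AIC}_Z + 2\sum_{i=1}^{n} \ln y_i .
\end{align*}
```

For the nine lengths posted, the last term is roughly 80, so under ML the AIC reported for the lnlength model sits lower by about that amount no matter how well either model fits. The REML values that PROC MIXED reports by default will differ in detail, but the same scale problem applies.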

Now, a couple of other notes about the use of AIC to compare models. First, the use of any likelihood-based statistic is restricted to comparisons where exactly the same response VALUES are employed in all models. This means that not only must the response variable be the same for all models, but also that no missing predictor values may change the number of observations used to fit the different models. This has no bearing on the data that you presented, but I think it is important to make clear what the requirements are for comparing models employing an AIC criterion.
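
A minimal sketch of one way to enforce that requirement, assuming the candidate models use lnlength as the response and density and lndensity as predictors. The data set name test2_complete is made up for illustration:

```sas
/* Keep only observations where the response and every candidate */
/* predictor are non-missing, so that all models are fit to      */
/* exactly the same rows.                                        */
data test2_complete;
  set test2;
  if nmiss(lnlength, density, lndensity) = 0;
run;
```

Each PROC MIXED call being compared would then use data=test2_complete, and the "Number of Observations Used" reported in the output should match across the models.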

Second, when AIC is employed to assess which model fixed effects are most reasonable, then maximum likelihood estimation must be employed rather than restricted maximum likelihood estimation (the PROC MIXED default), because the restricted likelihood itself depends on which fixed effects are in the model. Thus, if you had lnlength as your response and you wanted to assess whether density or lndensity was the better predictor, then you would want to fit the two models as shown below:

proc mixed data=test2 METHOD=ML;
  model lnlength=density;
run;

proc mixed data=test2 METHOD=ML;
  model lnlength=lndensity;
run;
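
If it helps to see the two sets of fit statistics side by side, something along these lines could be used. It is only a sketch: the data set names fit_density, fit_lndensity and fit_compare and the variable predictor are made up for illustration, and it assumes the goal is the density versus lndensity comparison described above with lnlength as the response.

```sas
/* Capture the Fit Statistics table from each ML fit */
ods output FitStatistics=fit_density;
proc mixed data=test2 method=ml;
  model lnlength = density;
run;

ods output FitStatistics=fit_lndensity;
proc mixed data=test2 method=ml;
  model lnlength = lndensity;
run;

/* Stack the two tables and label which predictor each row came from */
data fit_compare;
  length predictor $ 12;
  set fit_density(in=from_density) fit_lndensity;
  if from_density then predictor = 'density';
  else predictor = 'lndensity';
run;

proc print data=fit_compare noobs;
run;
```

Because both models share the same response, the same observations, and ML estimation, the AIC rows in the printed table can be compared directly.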

With respect to your comment that the model employing "non-transformed data (thus a linear relationship) is superior", I certainly cannot tell that from the data that you showed. For either transformation, the same two observations present as model outliers. Such a claim might be supported in a larger data set. And it may be that the non-transformed data is slightly better here, but not so much better as to be readily apparent.

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
