LISTSERV at the University of Georgia
Date:         Tue, 23 Mar 2010 08:54:13 -0700
Reply-To:     Shawn Haskell <shawn.haskell@STATE.VT.US>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Shawn Haskell <shawn.haskell@STATE.VT.US>
Organization: http://groups.google.com
Subject:      Re: Model selection in Proc Mixed
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset=ISO-8859-1

On Mar 22, 1:45 pm, wyldsoul <wylds...@gmail.com> wrote:
> Hello,
> I have a dataset and a set of apriori models and I am going to use
> model selection and AIC to rank the models. My models have fixed and
> random effects. I have two random class variables, year and unit, and
> a suite of continuous variables. Below is a simplified sample
> dataset. One thing I have to consider is that some, but not all
> experimental units were sampled each year.
> From research with SAS so far, I have found that the default
> estimator used in proc mixed is REML, and that REML only considers the
> random effects. Since the formula that calculates each AIC value
> includes a bias correction term based on the number of parameters, it
> seems that the REML method would be inappropriate for models including
> fixed effects. In order to consider the fixed effects, I need to
> specify the ML method. I have found that the ML method counts each
> unique observation in a class variable as a separate parameter. For
> example each year is counted as a separate parameter in the model.
> This would seem to inflate the bias correction term for AIC, as it
> uses the number of parameters for the calculation. I would welcome
> any suggestions on the best way to proceed with this analysis. I am
> wondering whether or not SAS is the best environment to perform model
> selection, and I plan on calculating AIC values manually as a check.
> Any recommendations or insight on how best to proceed with this
> analysis are welcome.
>
> Thanks
>
> y   year  unit  x3  x4  x5  x6
> 43  2005  A     23  37  19   7
> 34  2005  B     14  48  28  31
> 50  2005  C     19  24  48  48
>  4  2005  D     47   9  46  20
> 28  2005  E     37  36   6  12
>  7  2005  F      9  27  22  19
> 40  2005  G     31   9  15  32
> 45  2006  A     17   4  29   6
> 24  2006  C     29  23   7  38
> 37  2006  D      9  26  34  32
> 18  2006  F     11  45  50  18
> 18  2006  G     27  10  16  42
> 17  2007  B      6  34   7  29
> 49  2007  C     14   2  17  26
> 27  2007  D     12  13  31  46
> 18  2007  E      4  22  46  44
> 28  2007  F     50  45   5  16
>  5  2007  G     47  23  16  16
> 22  2007  H     29   5  29  36
> 40  2007  I      9  45  15  32

I'm no expert on Proc MIXED, but it seems like you are taking the right approach. Yes, each level of a class variable, minus one, should be counted as a parameter (K) when you calculate AIC (-2LL + 2K) from your ML or LL output, and then AICc, which adds a further small-sample correction to guard against overfitting models to the data. I think you should calculate your own AICc values in Excel or whatever other program you use; don't just trust SAS to give you what you think you are getting.
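For anyone doing that hand check outside Excel, here is a minimal Python sketch of the arithmetic. It assumes the usual small-sample form AICc = AIC + 2K(K+1)/(n-K-1). The function names and the example numbers are made up for illustration; they are not output from any Proc MIXED run.

```python
def aic(neg2ll, k):
    """AIC from a -2 log-likelihood value and K estimated parameters."""
    return neg2ll + 2 * k


def aicc(neg2ll, k, n):
    """AICc: AIC plus the small-sample correction 2K(K+1)/(n-K-1).

    n is whatever sample size you decide is appropriate; that choice
    is yours to justify, the formula does not make it for you.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > K + 1")
    return aic(neg2ll, k) + (2 * k * (k + 1)) / (n - k - 1)


if __name__ == "__main__":
    # Hypothetical inputs: -2LL = 100.0, K = 5, n = 20.
    print("AIC  =", aic(100.0, 5))
    print("AICc =", round(aicc(100.0, 5, 20), 3))
```

Comparing these values against the fit statistics SAS prints is one way to see which K and n the procedure actually used.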

I recall that the bigger issue I had with Proc MIXED (or PHREG) was with the estimate of sample size used for calculating AICc. At least with family-group data in PHREG, I recall that I was not satisfied with what SAS used as a sample size; I thought it was too liberal. Given those semiparametric, partial-likelihood models, I used a conservative estimate of sample size: the number of mortality events. Maybe my memory fails me for Proc MIXED. Can someone explain how sample size is calculated in Proc MIXED? Is it adequately parsimonious? I think I used the number of individual animals as the estimate of sample size for calculating AICc from the ML (or LL) given by Proc MIXED. Thanks. Shawn
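To show why the choice of sample size matters, here is a small Python sketch of the AICc correction term, 2K(K+1)/(n-K-1), under two candidate values of n taken from the sample dataset quoted above: 20 rows, or 9 distinct units (A through I). K = 4 is an arbitrary value picked for illustration, and the function name is hypothetical. This says nothing about which n Proc MIXED itself uses.

```python
def aicc_correction(k, n):
    """Small-sample term added to AIC to obtain AICc."""
    if n - k - 1 <= 0:
        raise ValueError("correction is undefined unless n > K + 1")
    return 2 * k * (k + 1) / (n - k - 1)


K = 4                 # hypothetical parameter count, for illustration only
N_OBSERVATIONS = 20   # rows in the quoted sample dataset
N_UNITS = 9           # distinct units (A..I) in the quoted sample dataset

for label, n in (("n = observations", N_OBSERVATIONS), ("n = units", N_UNITS)):
    print(f"{label:18s} n={n:2d}  correction={aicc_correction(K, n):.3f}")
```

The more conservative n gives a much larger penalty for the same K, so model rankings can change depending on which one you use.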

