LISTSERV at the University of Georgia
Date:         Mon, 15 Dec 2003 12:18:36 -0800
Reply-To:     Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:      Re: NLMIXED model unstable?
In-Reply-To:  <Pine.GSO.4.10.10312151132390.26073-100000@nber5.nber.org>
Content-Type: text/plain; charset=us-ascii

Susan,

Your model below has 80 parameters (including the intercept)! I am not at all surprised that there is instability when you add more terms to your model. You may very well end up creating some terms which are highly linearly correlated. 80 parameters! How will you ever interpret such a model? How will you ever explain such a model in any peer-reviewed publication? You will use up all of the allowed journal space just identifying the variables which you are employing.
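As a rough sketch of how one might look for that kind of near-collinearity before refitting: the predictor names below are the six mental health indicators from the quoted model, while the data set name MYDATA is a placeholder, since the quoted code does not name its input data set. PROC REG is used here only for its VIF and COLLIN diagnostics, which are computed from the predictors, so treating the ordinal EVGFP as a numeric response is an expedient for this check and not a proposed model.

```sas
/* MYDATA is a hypothetical data set name (an assumption, not from the thread). */
/* Predictors: the six mental health indicators from the quoted model.          */
proc reg data=mydata;
   model evgfp = depres1 anxious1 sleep1 depanx1 depslp1 anxsl1
         / vif collin;   /* variance inflation factors and condition indices */
run;
quit;
```

Large variance inflation factors or condition indices would point to terms that carry nearly the same information. The same check could be repeated for each block of predictors in the model.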

I subscribe to the belief that a model is of absolutely no value unless it is parsimonious. Perhaps if one were modelling some chemical behavior which can be observed under very well controlled conditions, then a model with 80 parameters might be interpretable. I don't know what your response is (or, for that matter, what a lot of your predictor variables are), but from the predictor variables which I can interpret, you are modelling some medical condition or some behavior given respondent characteristics. Neither of these is measured with the high degree of accuracy needed to support a large, complex model. Under the circumstances, I would be looking for a small model which explains the bulk of the variation in the response.
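For illustration only, a small main-effects ordered probit does not need custom likelihood code. The sketch below assumes a data set named MYDATA (a placeholder) and picks a handful of predictors from the quoted model arbitrarily; which terms belong in a small model is a substantive decision that this sketch does not make.

```sas
/* MYDATA and the choice of predictors are assumptions for illustration. */
proc logistic data=mydata;
   /* EVGFP has five ordered levels, so PROC LOGISTIC fits a cumulative
      model; LINK=PROBIT requests the probit link instead of the default
      logit link. */
   model evgfp = sex pain1 depres1 tired1 / link=probit;
run;
```

The parameterization here need not match the NLMIXED code (in particular, the quoted code negates the linear predictor), so coefficient signs should be checked before comparing the two fits.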

Dale

--- Susan Stewart <sstewart@NBER.ORG> wrote:
> We have a complicated ordered probit model that we are running using the
> PROC NLMIXED code that you generously provided to us in October. However,
> we are concerned about the stability of the model.
>
> Our model includes 21 interaction terms between sets of variables, for
> example we estimate a single b for the interaction between 6 variables
> representing mental health and 16 variables representing physical function
> (these include both main effects and interaction terms). This is something
> we came up with to avoid having to test hundreds of smaller interactions.
> Do you think it is OK to do this?
>
> Here is an excerpt of that code (the full model is farther below):
>
> b178* ((b4*sfcn1 + b5*pac1n1 + b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14 +
> b10*pac15 + b11*pac16 + b12*drvbus + b14*pac45 + b15*pac46 + b16*pac56 +
> b18*dbpac11 + b19*dbpac41 + b20*dbpac51 + b21*dbpac61) * (b40*depres1
> +b41*anxious1 +b42*sleep1 +b43*depanx1 +b44*depslp1 +b45*anxsl1))
>
> (All of these are 0/1 variables. Our sample size is 1420, so we should
> have plenty of DF. The sample sizes for all terms are reasonable; the
> lowest are above 75 and most are much higher.)
>
> The version of our model that we originally wanted includes 58 main
> effects and interactions, and 21 interaction terms between sets of
> variables. It runs fairly well, with reasonably low standard errors (above
> 1.0 for some terms but not above 3.0). However, if one or more other such
> terms are added or removed, the model often crashes, with very high SEs
> for most terms (higher than 400 for some).
>
> As another way of testing the stability of the original model, we tried
> specifying different start values, for example half to 4 and half to -4.
> The results are almost the same when we do this, as long as I specify a
> starting value for the intercept as well (for example 6). If I don't
> specify a starting value for the intercept, the model runs terribly, with
> no SE calculated for most of the terms involving pain.
>
> When I run it with starting values of 10 and -10, it seems to
> automatically switch to a different convergence criterion (FCONV rather
> than GCONV) and give crazy output, with SEs of zero for about half of the
> terms, only 150 iterations, and a -2LL of 66590 (The original model with
> GCONV did 244 iterations and had -2LL of 3073).
>
> The model also yields slightly different results depending on how the
> data is sorted, but hopefully this is just rounding error.
>
> We realize that this is a rather large question but appreciate any bits of
> advice you might have on the validity and stability of our model, possible
> explanations for our strange output, and other ways of testing whether or
> not we have a good model.
>
> Thank you,
> Susan
>
> Here is our code. Let me know if sample output or any other
> information would also be helpful.
>
>
> proc sort; by record;
> proc nlmixed maxiter = 1000;
>
> parms
>
> a1=1 a2=1 a3=1
>
> b0 = 6
> b200=4 b201=-4 b202=4 b203=-4 b204=4 b205=-4 b206=4 b207=-4 b208=4 b209=-4
> b210=4 b211=-4 b212=4 b213=-4 b214=4
>
> b1=4 b2=-4 b3=4 b4=-4 b5=4 b6=-4 b7=4 b8=-4 b9=4 b10=-4 b11=4 b12=-4 b14=4
> b15=-4 b16=4 b18=-4 b19=4 b20=-4 b21=4 b22=-4 b23=4 b24=-4 b25=4 b26=-4
> b35=4 b36=-4 b37=4 b40=-4 b41=4 b42=-4 b43=4 b44=-4 b45=4 b50=-4 b60=4
> b61=-4 b62=4 b63=-4 b64=4 b65=-4 b70=4 b71=-4 b72=4
>
> b170=-4 b171=4 b172=-4 b173=4 b174=-4 b175=4 b176=-4 b177=4 b178=-4 b179=4
> b180=-4 b181=4 b182=-4 b183=4 b184=-4 b185=4 b186=-4 b187=4 b188=-4 b189=4
> b190=-4
>
> /*
> b0=10
> b200=10 b201=-10 b202=10 b203=-10 b204=10 b205=-10 b206=10 b207=-10
> b208=10 b209=-10
> b210=10 b211=-10 b212=10 b213=-10 b214=10
>
> b1=10 b2=-10 b3=10 b4=-10 b5=10 b6=-10 b7=10 b8=-10 b9=10 b10=-10 b11=10
> b12=-10 b14=10
> b15=-10 b16=10 b18=-10 b19=10 b20=-10 b21=10 b22=-10 b23=10 b24=-10 b25=10
> b26=-10
> b35=10 b36=-10 b37=10 b40=-10 b41=10 b42=-10 b43=10 b44=-10 b45=10 b50=-10
> b60=10
> b61=-10 b62=10 b63=-10 b64=10 b65=-10 b70=10 b71=-10 b72=10
>
> b170=-10 b171=10 b172=-10 b173=10 b174=-10 b175=10 b176=-10 b177=10
> b178=-10 b179=10
> b180=-10 b181=10 b182=-10 b183=10 b184=-10 b185=10 b186=-10 b187=10
> b188=-10 b189=10
> b190=-10
> */
> ;
>
> array eta {4};
> eta1 =
> -(b0 + b200*age50_54 + b201*age55_59 + b202*age60_64 + b203*age65_69 +
>   b204*age70_74 + b205*age75_79 + b206*age80up + b207*sex + b208*sex5054 +
>   b209*sex5559 + b210*sex6064 + b211*sex6569 + b212*sex7074 + b213*sex7579 +
>   b214*sex80up
>
>   + b1*sac0or11 + b2*sac21 + b3*s0or1_21
>
>   + b4*sfcn1 + b5*pac1n1 + b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14 +
>   b10*pac15 + b11*pac16 + b12*drvbus + b14*pac45 + b15*pac46 +
>   b16*pac56 + b18*dbpac11 + b19*dbpac41 + b20*dbpac51 + b21*dbpac61
>
>   +b22*pain1 +b23*rash1 +b24*ubsex1 +b25*painus1 +b26*painrh1
>
>   +b35*sick1 +b36*cough1 +b37*sickcgh1
>
>   +b40*depres1 +b41*anxious1 +b42*sleep1 +b43*depanx1 +b44*depslp1
>   +b45*anxsl1
>
>   +b50*tired1
>
>   +b60*sexprob1 +b61*ent1 +b62*appear1 +b63*meddiet1 +b64*limb1
>   +b65*hedach1
>
>   +b70*eyepain1 +b71*glasses1 +b72*talk1
>
>   + b170*((b1*sac0or11 + b2*sac21 + b3*s0or1_21)* (b4*sfcn1 + b5*pac1n1 +
>     b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14 + b10*pac15 + b11*pac16 +
>     b12*drvbus + b14*pac45 + b15*pac46 + b16*pac56
>     + b18*dbpac11 + b19*dbpac41 + b20*dbpac51 + b21*dbpac61))
>
>   + b171*((b1*sac0or11 + b2*sac21 + b3*s0or1_21)*(b22*pain1
>     +b23*rash1 +b24*ubsex1 +b25*painus1 +b26*painrh1))
>
>   + b172*((b1*sac0or11 + b2*sac21 + b3*s0or1_21)*
>     (b35*sick1 +b36*cough1 +b37*sickcgh1))
>
>   + b173*((b1*sac0or11 + b2*sac21 + b3*s0or1_21)* (b40*depres1
>     +b41*anxious1 +b42*sleep1 +b43*depanx1 +b44*depslp1 +b45*anxsl1))
>
>   + b174*((b1*sac0or11 + b2*sac21 + b3*s0or1_21)* tired1)
>
>   + b175*((b1*sac0or11 + b2*sac21 + b3*s0or1_21)*
>     (b70*eyepain1 +b71*glasses1 +b72*talk1))
>
>   + b176* ((b4*sfcn1 + b5*pac1n1 + b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14
>     + b10*pac15 + b11*pac16 + b12*drvbus + b14*pac45 + b15*pac46
>     + b16*pac56 + b18*dbpac11 + b19*dbpac41 + b20*dbpac51 +
>     b21*dbpac61) * (b22*pain1 +b23*rash1 +b24*ubsex1 +b25*painus1
>     +b26*painrh1))
>
>   + b177* ((b4*sfcn1 + b5*pac1n1 + b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14
>     + b10*pac15 + b11*pac16 + b12*drvbus + b14*pac45 + b15*pac46
>     + b16*pac56 + b18*dbpac11 + b19*dbpac41 + b20*dbpac51 +
>     b21*dbpac61) *
>     (b35*sick1 +b36*cough1 +b37*sickcgh1))
>
>   + b178* ((b4*sfcn1 + b5*pac1n1 + b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14
>     + b10*pac15 + b11*pac16 + b12*drvbus + b14*pac45 + b15*pac46
>     + b16*pac56 + b18*dbpac11 + b19*dbpac41 + b20*dbpac51 +
>     b21*dbpac61) *
>     (b40*depres1 +b41*anxious1 +b42*sleep1 +b43*depanx1 +b44*depslp1
>     +b45*anxsl1))
>
>   + b179* ((b4*sfcn1 + b5*pac1n1 + b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14
>     + b10*pac15 + b11*pac16 + b12*drvbus + b14*pac45 + b15*pac46
>     + b16*pac56 + b18*dbpac11 + b19*dbpac41 + b20*dbpac51 +
>     b21*dbpac61) *
>     tired1)
>
>   + b180* ((b4*sfcn1 + b5*pac1n1 + b6*pac41 + b7*pac51 + b8*pac61 + b9*pac14
>     + b10*pac15 + b11*pac16 + b12*drvbus + b14*pac45 + b15*pac46
>     + b16*pac56 + b18*dbpac11 + b19*dbpac41 + b20*dbpac51 +
>     b21*dbpac61) *
>     (b70*eyepain1 +b71*glasses1 +b72*talk1))
>
>   +b181* ((b22*pain1 +b23*rash1 +b24*ubsex1 +b25*painus1 +b26*painrh1)
>     * (b35*sick1 +b36*cough1 +b37*sickcgh1))
>
>   +b182* ((b22*pain1 +b23*rash1 +b24*ubsex1 +b25*painus1 +b26*painrh1)*
>     (b40*depres1 +b41*anxious1 +b42*sleep1 +b43*depanx1 +b44*depslp1
>     +b45*anxsl1))
>
>   +b183* ((b22*pain1 +b23*rash1 +b24*ubsex1 +b25*painus1 +b26*painrh1)*
>     tired1)
>
>   +b184* ((b22*pain1 +b23*rash1 +b24*ubsex1 +b25*painus1 +b26*painrh1)*
>     (b70*eyepain1 +b71*glasses1 +b72*talk1))
>
>   +b185 *((b35*sick1 +b36*cough1 +b37*sickcgh1)
>     * (b40*depres1 +b41*anxious1 +b42*sleep1 +b43*depanx1
>     +b44*depslp1 +b45*anxsl1))
>
>   +b186 *((b35*sick1 +b36*cough1 +b37*sickcgh1) * tired1)
>
>   +b187 *((b35*sick1 +b36*cough1 +b37*sickcgh1) *
>     (b70*eyepain1 +b71*glasses1 +b72*talk1))
>
>   +b188* ((b40*depres1 +b41*anxious1 +b42*sleep1 +b43*depanx1
>     +b44*depslp1 +b45*anxsl1) * tired1)
>
>   +b189* ((b40*depres1 +b41*anxious1 +b42*sleep1 +b43*depanx1
>     +b44*depslp1 +b45*anxsl1) *
>     (b70*eyepain1 +b71*glasses1 +b72*talk1))
>
>   +b190* ((b70*eyepain1 +b71*glasses1 +b72*talk1) * tired1));
>
> eta2 = a1 + eta1;
> eta3 = a2 + eta2;
> eta4 = a3 + eta3;
>
> if EVGFP=1 then do;
>    p_YLEy = cdf('Normal', eta1);
>    p_YLEym1 = 0;
> end; else
> if EVGFP=5 then do;
>    p_YLEy = 1;
>    p_YLEym1 = cdf('Normal', eta4);
> end;
> else do;
>    p_YLEy = cdf('Normal', eta{EVGFP} );
>    p_YLEym1 = cdf('Normal', eta{EVGFP-1});
> end;
> p_YEQy = max(p_YLEy - p_YLEym1, 1E-12);
>
> loglike = log(p_YEQy);
>
> model EVGFP ~ general(loglike);
>
> predict eta1 OUT=Q2 DER; run;
>

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
