LISTSERV at the University of Georgia
Date:         Mon, 31 Mar 2008 14:52:50 -0400
Reply-To:     Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:      Re: Rules for GLMSELECT
Comments: To: Peter Flom <peterflomconsulting@mindspring.com>
In-Reply-To:  <26144118.1206987953461.JavaMail.root@mswamui-thinleaf.atl.sa.earthlink.net>
Content-Type: text/plain; charset="us-ascii"

Peter: I meant that one should re-estimate whatever final model GLMSELECT specifies (after taking advantage of shrinkage of parameter estimates) using a standard regression procedure. With more observations GLMSELECT will have a larger set of alternative models to consider. I would have serious doubts about a model specification that selects predictors that seem no more likely to have a true association with the DV than any of the ones rejected. Blind selection of a few predictors risks selecting those that fit well to one sample and not to other samples.

Underloading predictors results in biased parameter estimates (although perhaps with narrower confidence intervals than less biased estimates). Re-estimating a model using a standard regression program supports analysis of residuals that may uncover bias in predictions.

S
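The risk noted above, that blind selection picks predictors that fit one sample and not others, can be illustrated with a small simulation. This is only a sketch in plain Python, not SAS and not the algorithm GLMSELECT uses: it picks the single predictor with the largest absolute correlation in one sample and then checks that same predictor in a fresh sample. The function names, sample sizes, and seed are all illustrative assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_obs=30, n_predictors=50, seed=2008):
    """Pick the noise predictor that best fits sample A; score it on sample B."""
    rng = random.Random(seed)

    def draw_sample():
        # The outcome and every predictor are independent noise, so no
        # predictor has a true association with the outcome.
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        xs = [[rng.gauss(0, 1) for _ in range(n_obs)]
              for _ in range(n_predictors)]
        return y, xs

    y_a, xs_a = draw_sample()
    y_b, xs_b = draw_sample()  # fresh sample from the same process

    # "Blind selection": keep whichever predictor correlates best in sample A.
    r_a = [pearson(x, y_a) for x in xs_a]
    best = max(range(n_predictors), key=lambda j: abs(r_a[j]))

    # Correlation of the selected predictor with the outcome in sample B.
    r_b = pearson(xs_b[best], y_b)
    return best, r_a, r_b


best, r_a, r_b = simulate()
print(f"selected x{best}: r = {r_a[best]:+.2f} in sample A, "
      f"{r_b:+.2f} in sample B")
```

Because every predictor here is noise, any strong correlation in sample A is a product of the selection step alone, so comparing it with the sample B value shows how much of the apparent fit carries over.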

-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Peter Flom
Sent: Monday, March 31, 2008 2:26 PM
To: Sigurd Hermansen; SAS-L@LISTSERV.UGA.EDU
Subject: Re: Rules for GLMSELECT

I have to disagree with Sig, here.

I think the point of GLMSELECT, or at least a large part of the point, is that it penalizes you for having too few observations for your number of variables by selecting a simple model. For continuous DV, the runs that David Cassell and I did for our paper show that if you greatly overload your model with variables, then STEPWISE in its various flavors will mess up, but GLMSELECT will not.

Peter

-----Original Message-----
>From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
>Sent: Mar 31, 2008 1:40 PM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Re: Rules for GLMSELECT
>
>Martin:
>Since GLMSELECT should be used only for exploratory modelling, and
>whatever model you select should be estimated using PROC LOGISTIC, PROC
>GENMOD, PROC MIXED, or another regression procedure, the same rules
>should apply whether or not you are using PROC GLMSELECT to help
>specify a model. I do think that automated exploration of predictive
>model specifications would require substantially more observations than
>specification of a model for the purpose of testing a specific
>hypothesis. S
>
>
>
>-----Original Message-----
>From: owner-sas-l@listserv.uga.edu
>[mailto:owner-sas-l@listserv.uga.edu]
>On Behalf Of martholt
>Sent: Monday, March 31, 2008 12:56 PM
>To: sas-l@uga.edu
>Subject: Rules for GLMSELECT
>
>
>When using logistic regression, the general advice is to have no fewer
>than 10 cases per variable. Do any such rules exist for GLMSELECT, or
>could you please point me to a document that discusses this.
>
>Thank you,
>
>Martin Holt

Statistical Consultant www DOT peterflom DOT com

