LISTSERV at the University of Georgia
Date:         Mon, 3 Oct 2005 13:47:15 -0400
Reply-To:     Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:      Re: complete separation in logistic regression
Comments: To: Robin High <>
Content-Type: text/plain; charset="us-ascii"

Robin: Thanks for the suggestions. Matthew Zack also suggested testing how the FL Macro compares with PROC LOGISTIC exact.

I asked about the FL Macro because it accommodates continuous covariates as predictors. I'll take a look at how it works with the valuable test data that you supplied. I also have a reduced version of the model that includes binary predictors only (although it generates enough misclassifications to fall outside the 'almost complete separation' class of models). I'll try the exact logistic procedure on it.

While the FL Macro reduces parameter estimates to something closer to those in a typical model, its predicted values of the outcome have a higher misclassification rate for test data than the model that suffers, when applied to my data, from complete separation. I don't know how to interpret that finding.

I appreciate your help with this statistical modelling problem. Glad to know that someone else has looked at the results it produces. Sig

-----Original Message-----
From: [] On Behalf Of Robin High
Sent: Monday, October 03, 2005 11:56 AM
To: Sigurd Hermansen
Cc: SAS-L@LISTSERV.UGA.EDU
Subject: Re: complete separation in logistic regression

> Anyone have any experience with logistic regression under conditions
> of complete separation. Heinze and Schemper have publications and SAS
> Macro available at
>
> I'd have a special interest in hearing from anyone who has used the
> algorithm and would know something about its performance
> characteristics. Would also like to hear whether SAS statistical
> PROC's handle complete or almost complete separation, or if someone
> has adapted NLMIXED or other procedures to the problem.

Hi Sig,

One way to examine how it works is to run a few test programs comparing it to the output from PROC LOGISTIC with the EXACT statement, as shown in the example below. It is important to recognize the input data coding scheme in order to get equivalent results; that is the reason for the "descending" option and the reference category treatment for time in the CLASS statement of LOGISTIC.

The estimated odds ratios tend to vary widely when there is a '0' in one of the cells. They are much closer when at least 1 observation is present in every cell (just add a new observation 1 0 1 to the test data).
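To see why the empty cell matters, the ordinary sample odds ratio can be worked out by hand. In the test data, time=1 has 7 responses and 3 non-responses, and time=0 has 0 responses and 10 non-responses, so the odds ratio (7*10)/(3*0) is not finite. With the extra observation 1 0 1 the time=0 row becomes 1 response and 10 non-responses, and the odds ratio is (7*10)/(3*1), about 23.3. The sketch below checks that arithmetic with PROC FREQ. The data set name `cells` and the variable `count` are made up for illustration, and this is only the plain sample odds ratio, not the exact or Firth estimate.

```sas
/* Cell counts from the test data, plus the one added observation
   (time=0, rsp=1). The names CELLS and COUNT are illustrative only. */
data cells;
   input time rsp count;
   datalines;
1 1 7
1 0 3
0 1 1
0 0 10
;
run;

/* ORDER=DATA keeps time=1 and rsp=1 in the first row and column,
   so the reported odds ratio is (7*10)/(3*1). */
proc freq data=cells order=data;
   weight count;
   tables time*rsp / relrisk nopercent nocol;
run;
```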

The Firth macro merges files without BY statements, so be aware of that if you have invoked:

OPTIONS mergeNoBy = error ;
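One way to cope with this, sketched here as an assumption rather than anything the macro documentation prescribes, is to save the current MERGENOBY setting, relax it while the macro runs, and restore it afterwards. The macro variable name `_saved_mnb` is hypothetical.

```sas
/* Save the current setting as text, e.g. MERGENOBY=ERROR */
%let _saved_mnb = %sysfunc(getoption(mergenoby, keyword));

/* Allow MERGE without BY while the macro runs */
options mergenoby=nowarn;

/* ... call the FL macro here ... */

/* Restore whatever setting was in effect before */
options &_saved_mnb;
```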

I applied this procedure recently to a set of data with complete separation and found helpful results. However, with a confidence interval whose upper bound on the odds ratio is close to "infinity", I'm not sure what technique, if any, would produce anything better.

I don't believe NLMIXED would be relevant in this case, though I continually learn new things about it and am continually amazed at what it can do.

Robin High
Univ. of Oregon

TITLE1 'Compare Exact and Firth Logistic Regression';

Data one;
  Input group time rsp @@;
  Cards;
1 1 1  1 1 0  1 1 1  1 1 0  1 1 1
1 1 1  1 1 1  1 1 1  1 1 0  1 1 1
1 0 0  1 0 0  1 0 0  1 0 0  1 0 0
1 0 0  1 0 0  1 0 0  1 0 0  1 0 0
;

PROC TABULATE data=one NOseps ;
  class time rsp;
  table time, (rsp all='Tot')*n=' '*f=5.0 / rts=10 misstext='0';
  TITLE2 'Data Summary';
run;

/*
 ---------------------------
 |        |    rsp    |     |
 |        |-----------|     |
 |        |  0  |  1  | Tot |
 |--------+-----+-----+-----|
 |time    |     |     |     |
 |0       |   10|    0|   10|
 |1       |    3|    7|   10|
 ---------------------------
*/



proc logistic data=one order=data descending;
  CLASS time(ref=last) / param=ref;
  MODEL rsp = time / risklimits expb ;
  EXACT time / estimate=both;
  TITLE1 'LOGISTIC: Compare MLE and Exact Calculations';
run;

* read in the firth macro;

%INCLUDE 'c:\sas\logistic\';

* There are other options to choose from when you call it.
  The macro assumes the binomial response is dummy coded (rsp=0/1) and
  that classification data (e.g., gender), if present, are 'dummy' coded
  as well. Apply PROC GLMMOD first if you need to recode data ;

%fl(data=one, y=rsp, varlist= time,
    maxit=50, epsilon=0.0001, noint=0, outest=_est,
    print=1, pl=1, plint=0, alpha=0.05, odds=1,
    notes=0, standard=1);
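For the PROC GLMMOD recoding step mentioned in the comment above, a minimal sketch might look like the following. The data set `raw` and its character variable `gender` are hypothetical, as are the output data set names. GLMMOD writes the less-than-full-rank GLM coding (an intercept column plus one 0/1 column per class level), so presumably only one of the `gender` columns would be passed to the macro in `varlist=`.

```sas
/* Hypothetical input: a 0/1 response and a character class variable */
data raw;
   input rsp gender $;
   datalines;
1 F
0 M
1 M
0 F
;
run;

/* Write out the design matrix that GLM-type procedures would build */
proc glmmod data=raw outdesign=design outparm=parms noprint;
   class gender;
   model rsp = gender;
run;

/* PARMS maps each design column (Col1, Col2, ...) to its effect and level;
   DESIGN holds the response plus the 0/1 columns */
proc print data=parms;  run;
proc print data=design; run;
```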
