Date: Mon, 26 Jan 1998 12:30:49 -0500
Reply-To: Alison Canchola <alisonc@ITSA.UCSF.EDU>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Alison Canchola <alisonc@ITSA.UCSF.EDU>
Subject: logistic regression with clustering
I have a logistic regression problem where I need to take clustering into
account (each subject was seen more than once, and so has more than one
observation). I have been using the GEE macro and proc genmod for the
first time. Just as a check, I ran my model through both GEE and proc
genmod (as well as Stata), hoping to verify my results. GEE and Stata gave
similar results, but proc genmod did not.
My macro call and (abbreviated) results from the GEE macro, version 2.03,
were as follows:
%gee(data=cd4age2,
     yvar=oclf,
     xvar=one agevm t4,
     id=studyid,
     link=3,
     vari=3);
run;
So, I am using binomial variance (vari=3), the logit link (link=3) and the
(default) constant scale parameter.
Estimate with model-based s.e., z-score and p-value:

          Estimate   SE-model   z-model   p-model
ONE      -0.186732      0.589     -0.32    0.7514
AGEVM    -0.027383      0.015     -1.77    0.0769
T4       -0.000211      0.000     -0.81    0.4161
Estimate with robust s.e., z-score and p-value:

          Estimate   SE-Robust   z-Robust   p-Robust
ONE      -0.186732       0.860      -0.22     0.8281
AGEVM    -0.027383       0.022      -1.23     0.2195
T4       -0.000211       0.000      -0.61     0.5388
For proc genmod, the code and (abbreviated) results were:
proc genmod data=cd4age2;
   class studyid;
   model oclf = agevm t4 / dist=binomial;
   repeated subject=studyid;
run;
Analysis Of Initial Parameter Estimates

Parameter    DF   Estimate   Std Err   ChiSquare   Pr>Chi
INTERCEPT     1    -0.1867    0.5819      0.1030   0.7483
AGEVM         1    -0.0274    0.0153      3.2089   0.0732
T4            1    -0.0002    0.0003      0.6783   0.4102
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                              Empirical 95% Confidence Limits
Parameter    Estimate   Std Err      Lower      Upper         Z   Pr>|Z|
INTERCEPT     -0.5756    0.6533    -1.8560     0.7048    -.8811   0.3783
AGEVM         -0.0248    0.0140    -0.0522     0.0027    -1.768   0.0770
T4             0.0001    0.0003    -0.0004     0.0006    0.3326   0.7394
Scale          1.0832         .          .          .         .        .
In both the GEE macro and Stata, the parameter estimates reported with
cluster-adjusted standard errors were identical to the estimates from
ordinary logistic regression (with unadjusted standard errors). Proc genmod's
initial parameter estimates match those from the GEE macro, but the
estimates then change in the section "Analysis Of GEE Parameter Estimates,
Empirical Standard Error Estimates."
Why do the parameter estimates change in proc genmod, and why does proc
genmod include a scale parameter? I also tried including the option
'noscale' in proc genmod, but got the same results.
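
For reference, here is a minimal sketch of how a run with 'noscale' might be
written, with the link and the working correlation spelled out rather than
left to defaults. The link=logit and type=ind options are assumptions added
for illustration; they were not in the call shown above, and this is not
necessarily the exact code that was run:

```sas
/* Sketch only: link=logit and type=ind are illustrative assumptions,  */
/* not options taken from the original call.                           */
proc genmod data=cd4age2;
   class studyid;
   /* noscale is an option of the MODEL statement */
   model oclf = agevm t4 / dist=binomial link=logit noscale;
   /* type=ind requests an independence working correlation */
   repeated subject=studyid / type=ind;
run;
```

Writing the options out this way makes it easier to line the genmod
specification up against the macro's link=3 and vari=3 settings when
comparing the two sets of output.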
Alison Canchola
University of California, San Francisco
Dept. of Stomatology
(415) 476-9169
alisonc@itsa.ucsf.edu