```Date: Thu, 1 Aug 2002 14:32:08 -0700 Reply-To: Dale McLerran Sender: "SAS(r) Discussion" From: Dale McLerran Subject: Zero-Inflated Poisson models Content-Type: text/plain; charset=us-ascii This one is for the statisticians. I was asked off-list for any direction on fitting a zero-inflated poisson (ZIP) model. I had addressed the topic in a post some time back, but in reviewing that post I felt there were a number of points which could be improved on. Specifically, I did not allow the mean model or the zero inflate probability to be functions of other variables in the data. Also, I used an identity link for the poisson mean rather than a log link. That could be computationally dangerous as well as inefficient. So, just for the record, I wanted to demonstrate fitting a ZIP model employing NLMIXED, allowing the mean and zero inflate probability to be functions of some other variables. I also specify a log link. Below I generate some data and then show code for fitting the ZIP model. /* ZIP model data generation */ data test; /* parameters for zero inflation probability */ bp_0 = -.5; bp_1 = 0.3; /* parameters for poisson mean */ bll_0 = .4; bll_1 = .2; seed = 1234579; do i=1 to 1000; /* generate variables affecting probability and mean eta's */ call ranbin(seed,1,.5,gender); call rannor(seed,x); /* probability of zero inflation a function of gender only */ eta_prob = bp_0 + bp_1*gender; p_0 = exp(eta_prob) / (1 + exp(eta_prob)); call ranbin(seed,1,p_0,zero_inflate); /* poisson mean a function of x only */ eta_lambda = bll_0 + bll_1*x; lambda = exp(eta_lambda); /* Generate response */ if zero_inflate then y=0; else call ranpoi(seed,lambda,y); output; end; keep y gender x; run; /* Now fit the ZIP model employing NLMIXED */ proc nlmixed data=test; parms bp_0=0 bp_1=0 bll_0=0 bll_1=0; eta_prob = bp_0 + bp_1*gender + null_2*x; p_0 = exp(eta_prob)/(1 + exp(eta_prob)); eta_lambda = bll_0 + bll_1*x + null_1*gender; lambda = exp(eta_lambda); /* likelihood structure for ZIP. Note that we cannot generate */ /* the log likelihood for the zero response values directly. We */ /* must generate the likelihood and then take the logarithm due */ /* to the additive nature of the zero likelihood. */ if y=0 then prob = p_0 + (1-p_0)*exp(-lambda); if y=0 then loglike = log(prob); else loglike = log(1-p_0) + y*log(lambda) -lambda - lgamma(y+1); /* fit the model */ model y ~ general(loglike); /* test whether the (1-alpha)*100 CI for the estimated parameters */ /* covers the true parameter values. */ contrast "bp_0=-.5" bp_0+.5; contrast "bp_1=.3" bp_1-.3; contrast "bll_0=.4" bll_0-.4; contrast "bll_1=.2" bll_1-.2; run; Note that the model which is fit is overparameterized in that there is in truth no effect of the variable x on the zero inflate probability and there is no effect of gender on the poisson mean. However, I model the eta's for both the probability and the mean as functions of both gender and x. The Parameter Estimates table tests these two effects against their true (0) values. There is no need to specify contrasts for these parameters. If you run the above simulation, you will observe that a 95% confidence interval covers the true parameter values for all of the estimates. Convergence occurs quite readily. Hope that this is of interest to some. Dale ===== --------------------------------------- Dale McLerran Fred Hutchinson Cancer Research Center mailto: dmclerra@fhcrc.org Ph: (206) 667-2926 Fax: (206) 667-5977 --------------------------------------- __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com ```

Back to: Top of message | Previous page | Main SAS-L page