```
Date: Tue, 2 Nov 2010 14:42:33 -0400
Reply-To: peterflomconsulting@mindspring.com
Sender: "SAS(r) Discussion"
From: Peter Flom
Subject: Re: huge (>999.99) odds ratios: cause?
Comments: To: Jordan H
In-Reply-To:
Content-Type: text/plain; charset="us-ascii"

Jordan H wrote
<<<
Hello, all.

First, a little background. I've been asked to help with a project in which the goal is to develop a model that predicts high cost pharmacy expenditures based on a variety of variables, such as co-morbidities, demographics, etc. To do this, a multivariate regression model was used.

My client is also interested in trying to model poor prediction within the multiple regression model. To do this, they saved the residuals from PROC REG, made an indicator variable for those observations with residuals greater than 1.75, and ran a PROC LOGISTIC with the new indicator variable as the response variable and the original independent variables, plus additional cost variables, as predictors.

The model converges and most coefficients/odds ratios look reasonable, but some appear to be errors (odds ratios of >999.99, confidence intervals of <0.001 to >999.99). We've checked things like multicollinearity, but that doesn't seem to be an issue.

Does anyone have an idea as to what could be going on?

Thank you for your consideration!
>>>

First, thanks for providing context.

Second, I don't think that's the right way to look at poor prediction. Instead, I would look at the particular cases that have very high residuals and see what they have in common, if anything. The residuals should not be related to the IVs. If they are, something is wrong with the model. If you wanted to model the residuals on the IVs, I would do it in OLS regression, one variable at a time, and look at lots of plots. In fact, I might ONLY look at plots. A residual of 1.75 isn't some magic value.

Third, if you do decide to go this way, crazy ORs and CIs are usually the result of zero cells or near-zero cells in the crosstabs. So, I'd look variable by variable at crosstabs (if the IV is categorical), or at parallel box plots (if the IV is continuous).

HTH

Peter

Peter Flom PhD.
Peter Flom Consulting LLC
5 Penn Plaza, Ste 2342
NY NY 10001
www.statisticalanalysisconsulting.com
www.IAmLearningDisabled.com
```
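The zero-cell point in the reply can be made concrete with a small sketch. It is written in Python with only the standard library so that it is self-contained; in SAS the analogous check would be a PROC FREQ crosstab of each categorical predictor against the indicator. The data are made up for illustration, and the helper names `crosstab` and `odds_ratio` are hypothetical, not part of any library. When one cell of the 2x2 table is empty, the sample odds ratio is infinite (or zero), which is the kind of situation that would show up as ">999.99" and an interval running from "<0.001" to ">999.99".

```python
from collections import Counter
import math


def crosstab(xs, ys):
    """Count (x, y) pairs for a binary predictor x and a binary outcome y."""
    counts = Counter(zip(xs, ys))
    # Counter returns 0 for missing keys, so empty cells appear explicitly.
    return {(x, y): counts[(x, y)] for x in (0, 1) for y in (0, 1)}


def odds_ratio(table):
    """Sample odds ratio (a*d)/(b*c) from a 2x2 table of counts."""
    a, b = table[(1, 1)], table[(1, 0)]
    c, d = table[(0, 1)], table[(0, 0)]
    if b * c == 0:
        # An empty off-diagonal cell makes the ratio infinite, or undefined
        # if the numerator is zero as well.
        return math.inf if a * d > 0 else math.nan
    return (a * d) / (b * c)


# Made-up data: every observation with x = 1 has a large residual (y = 1),
# so the (x = 1, y = 0) cell is empty.
x_sparse = [1] * 5 + [0] * 95
y_sparse = [1] * 5 + [1] * 10 + [0] * 85
sparse_table = crosstab(x_sparse, y_sparse)

# Made-up data with all four cells populated, for comparison.
x_full = [1] * 20 + [0] * 80
y_full = [1] * 8 + [0] * 12 + [1] * 10 + [0] * 70
balanced_table = crosstab(x_full, y_full)

print("sparse table:", sparse_table, "OR =", odds_ratio(sparse_table))
print("balanced table:", balanced_table, "OR =", odds_ratio(balanced_table))
```

Running the same count for each categorical predictor, one at a time, is a quick way to spot which variables have empty or nearly empty cells before interpreting their odds ratios.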
