LISTSERV at the University of Georgia
Date:         Tue, 2 Nov 2010 14:42:33 -0400
Reply-To:     peterflomconsulting@mindspring.com
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Flom <peterflomconsulting@MINDSPRING.COM>
Subject:      Re: huge (>999.99) odds ratios: cause?
Comments: To: Jordan H <jihool3670@GMAIL.COM>
In-Reply-To:  <AANLkTikrK25XqCVWCzO3WVO786YiARrhGy83pizDYdFa@mail.gmail.com>
Content-Type: text/plain; charset="us-ascii"

Jordan H wrote

<<< Hello, all.

First, a little background. I've been asked to help with a project in which the goal is to develop a model that predicts high-cost pharmacy expenditures based on a variety of variables, such as co-morbidities, demographics, etc. To do this, a multivariate regression model was used. My client is also interested in trying to model poor prediction within the multiple regression model. To do this, they saved the residuals from PROC REG, made an indicator variable for those observations with residuals greater than 1.75, and ran a PROC LOGISTIC with the new indicator variable as the response variable and the original independent variables, plus additional cost variables, as predictors.

The model converges and most coefficients/odds ratios look reasonable, but some appear to be errors (odds ratios of >999.99, confidence intervals of <0.001 to >999.99). We've checked things like multicollinearity, but that doesn't seem to be an issue.

Does anyone have an idea as to what could be going on?

Thank you for your consideration! >>>

First, thanks for providing context.

Second, I don't think that's the right way to look at poor prediction. Instead, I would look at the particular cases that have very high residuals and see what they have in common, if anything. The residuals should not be related to the IVs. If they are, something is wrong with the model. If you wanted to model the residuals on the IVs, I would do it in OLS regression, one variable at a time, and look at lots of plots. In fact, I might ONLY look at plots. A residual of 1.75 isn't some magic value.

Third, if you do decide to go this way, crazy ORs and CIs are usually the result of zero cells or near zero cells in the crosstabs. So, I'd look variable by variable at crosstabs (if the IV is categorical), or at parallel box plots (if the IV is continuous).
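To make the zero-cell point concrete, here is a minimal sketch in plain Python (not SAS; in SAS the table itself would normally come from PROC FREQ). The data, the variable names flag and drug_class, and the helper functions are all made up for illustration; none of them come from the original question. The sketch tabulates the residual indicator against one categorical IV, lists any empty cells, and shows how the raw odds ratio becomes unbounded when a level has no non-events.

```python
from collections import Counter

def crosstab(outcome, predictor):
    """Count observations in each (predictor level, outcome level) cell."""
    counts = Counter(zip(predictor, outcome))
    rows = sorted(set(predictor))
    cols = sorted(set(outcome))
    return {r: {c: counts.get((r, c), 0) for c in cols} for r in rows}

def sparse_cells(table, min_count=1):
    """Return (predictor level, outcome level, count) for cells below min_count."""
    return [(r, c, n)
            for r, row in table.items()
            for c, n in row.items()
            if n < min_count]

def odds_ratio(table, level, reference, event=1, non_event=0):
    """Raw odds ratio of `level` versus `reference` from the 2x2 sub-table."""
    a, b = table[level][event], table[level][non_event]
    c, d = table[reference][event], table[reference][non_event]
    if b * c == 0:
        # Zero cell in the denominator: the ratio is unbounded (or undefined).
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)

# Hypothetical data: flag = 1 marks a large residual; drug_class is a categorical IV.
drug_class = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
flag       = [0,   0,   0,   1,   1,   1,   1,   0,   0,   1,   1]

table = crosstab(flag, drug_class)
empty = sparse_cells(table)            # every "B" case is flagged, so (B, 0) is empty
or_b = odds_ratio(table, "B", "A")     # unbounded because of the empty cell
or_c = odds_ratio(table, "C", "A")     # finite, since all four cells are populated

for level, row in table.items():
    print(level, row)
print("empty cells:", empty)
print("OR B vs A:", or_b, "| OR C vs A:", or_c)
```

An empty cell like the one for level B is the kind of pattern that tends to show up in PROC LOGISTIC output as an odds ratio of >999.99 with a confidence interval running from <0.001 to >999.99. Raising min_count in sparse_cells would also catch the near-zero cells.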

HTH

Peter

Peter Flom PhD. Peter Flom Consulting LLC 5 Penn Plaza, Ste 2342 NY NY 10001 www.statisticalanalysisconsulting.com www.IAmLearningDisabled.com

