Date: Fri, 19 Jan 2007 17:43:48 +0200
Reply-To: BoraYavuz@HSBC.COM.TR
Sender: "SAS(r) Discussion"
From: Bora Yavuz
Subject: Calibration tests
In-Reply-To: <200701191447.l0JBmIYV013863@mailgw.cc.uga.edu>
Content-type: text/plain; charset=US-ASCII

Alok,

The intuitive way of showing that the model does not perform well on some (or all!) of the score brackets is to plot the actual log(odds) and the expected log(odds) for each score bracket. Non-overlapping lines (i.e., differences in slope and/or the y-axis intercept) indicate poor model performance.

One approach to adjusting the expected probabilities so that they better reflect the actual ones is to regress the actual (observed) 0/1 vector on the score the model produces. You can use logistic regression again. From the output of this regression you can figure out how to change the slope of the expected log(odds) curve so that it better mimics the actual log(odds) curve. If you deem this still not enough to attain a good match between the two lines, you can adjust the intercept as well.

Note that this "calibration" does not affect the scorecard characteristics -- it just "fudges" the output score they produce. So I doubt it will improve the predictive performance (i.e., discriminatory power, Gini) of the model -- rather, it will make the predictions somewhat more realistic. If you think your model's predictive performance has degraded "a lot", you should consider redevelopment.

HTH,
Bora Y.

------------------------------

Date: Thu, 18 Jan 2007 21:38:47 -0800
From: Alok
Subject: Calibration tests

Hi all,

I am working on a probability-of-default calibration assignment. Let me explain the entire problem. We calculate probability of default from existing risk scores -- essentially, we generate a probability of default corresponding to each score. The higher the risk score, the lower the probability of default should be.

We use single-variable logistic regression to establish the relationship between the existing scores and the probability of default. The problem is that the model does not perform very well at the lower and higher probabilities, because the population at these end-point probabilities is very small compared to the population in the middle score ranges.

I need to calibrate the model as well as test its stability. Can any of you suggest some techniques/approaches that would help me say decisively that the model is not performing as expected at the low and high values? I would also appreciate some help in calibrating the existing model (once it is shown to be underperforming) so that its performance improves.

Thanks a lot in advance.

Alok

------------------------------
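Bora's first suggestion -- comparing the actual and expected log(odds) bracket by bracket -- can be sketched as follows. The thread is SAS-oriented, but the arithmetic is language-neutral; this is a pure-Python illustration, and the scores, default flags, model PDs, and bracket edges below are all made up for the example:

```python
import math

def log_odds(p):
    """Convert a probability to log(odds)."""
    return math.log(p / (1.0 - p))

def bracket_log_odds(scores, defaults, expected_pd, edges):
    """For each score bracket [lo, hi), return
    (lo, hi, actual log(odds) of default, mean expected log(odds))."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = [i for i, s in enumerate(scores) if lo <= s < hi]
        n = len(idx)
        actual_pd = sum(defaults[i] for i in idx) / n
        mean_expected = sum(log_odds(expected_pd[i]) for i in idx) / n
        out.append((lo, hi, log_odds(actual_pd), mean_expected))
    return out

# Illustrative data: risk scores, observed 0/1 defaults, model PDs
scores      = [610, 620, 640, 660, 680, 700, 720, 740]
defaults    = [  1,   0,   1,   0,   0,   0,   1,   0]
expected_pd = [0.30, 0.28, 0.25, 0.22, 0.20, 0.15, 0.12, 0.10]

table = bracket_log_odds(scores, defaults, expected_pd, edges=[600, 700, 800])
for lo, hi, act, exp in table:
    print(f"[{lo}, {hi}): actual {act:+.3f}  expected {exp:+.3f}")
```

Plotting the two columns against the bracket midpoints gives exactly the pair of lines Bora describes; where they separate (in slope or in intercept), the model is miscalibrated in that score range.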
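Bora's second suggestion -- regressing the observed 0/1 vector on the model's score with logistic regression, then using the fitted slope and intercept to recalibrate -- might look like the sketch below. It uses a pure-Python Newton-Raphson fit rather than any particular package; the data are illustrative, and the score is assumed here to already be in log(odds) units so that the fitted `a` and `b` are directly the slope change and intercept shift Bora mentions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, iters=25):
    """Maximum-likelihood fit of P(y=1) = sigmoid(a*x + b) by
    Newton-Raphson; returns (a, b) = (slope, intercept) of the
    calibration line in log(odds) space."""
    a = b = 0.0
    for _ in range(iters):
        ga = gb = haa = hab = hbb = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a * xi + b)
            w = p * (1.0 - p)        # Fisher information weight
            ga += (yi - p) * xi      # gradient w.r.t. slope
            gb += (yi - p)           # gradient w.r.t. intercept
            haa += w * xi * xi
            hab += w * xi
            hbb += w
        det = haa * hbb - hab * hab  # solve the 2x2 Newton step
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return a, b

# Illustrative: x = model output in log(odds) units,
# y = observed default indicator
x = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
y = [   0,    0,    1,   0,   1,   0,   1,   1]
a, b = fit_logistic(x, y)
calibrated_pd = [sigmoid(a * xi + b) for xi in x]
```

`sigmoid(a*x + b)` rescales the model's log(odds) line by `a` and shifts it by `b` -- the two-step slope-then-intercept adjustment Bora describes. Because this is a monotone transform (for `a > 0`), it changes the calibration of the PDs but not their rank-ordering, which is why, as Bora notes, the Gini is unaffected.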
