Date: Fri, 19 Jan 2007 17:43:48 +0200
Reply-To: BoraYavuz@HSBC.COM.TR
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Bora Yavuz <BoraYavuz@HSBC.COM.TR>
Subject: Calibration tests
In-Reply-To: <200701191447.l0JBmIYV013863@mailgw.cc.uga.edu>
Content-type: text/plain; charset=US-ASCII
Alok,
The intuitive way of showing that the model does not perform well on some
(or all!) of the score brackets is to plot the actual log(odds) and the
expected log(odds) for each score bracket. Lines that do not overlap (i.e.,
that differ in slope and / or y-axis intercept) indicate poor model
performance.
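As a rough illustration of the comparison described above, here is a minimal Python sketch that computes actual and expected log(odds) per score bracket. The function name, the bracket width, the use of log(odds) of default (bad/good), and the 0.5 continuity correction for empty cells are all assumptions made for the example, not something taken from the original post.

```python
import math
from collections import defaultdict

def bracket_log_odds(scores, defaults, predicted_pd, width=50):
    """Return {bracket_start: (actual_log_odds, expected_log_odds, n)}.

    scores       -- one score per account
    defaults     -- observed 0/1 outcome per account (1 = default)
    predicted_pd -- model probability of default per account
    width        -- bracket width in score points (hypothetical choice)
    """
    groups = defaultdict(list)
    for s, y, p in zip(scores, defaults, predicted_pd):
        groups[(s // width) * width].append((y, p))

    result = {}
    for start in sorted(groups):
        rows = groups[start]
        n = len(rows)
        bads = sum(y for y, _ in rows)
        # Assumed correction: add 0.5 to each cell so that a bracket with
        # no defaults (or only defaults) does not produce log(0).
        actual = math.log((bads + 0.5) / (n - bads + 0.5))
        mean_pd = sum(p for _, p in rows) / n
        expected = math.log(mean_pd / (1.0 - mean_pd))
        result[start] = (actual, expected, n)
    return result
```

Plotting the first two values of each tuple against the bracket start gives the two lines to compare. The count n is returned as well because thinly populated brackets at either end of the score range will give noisy actual log(odds).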
An approach to adjust the expected probabilities so that they better
reflect the actual ones is to regress the actual (observed) 0/1 vector on
the score the model produces. You can use logistic regression again.
Using the output of this regression, you can figure out how to change the
slope of the expected log(odds) curve to better mimic the actual log(odds)
curve. If you deem this is still not enough to attain a good match of the
two lines, you can further adjust the intercept too.
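The regression step above could be sketched as follows in Python, assuming the regressor x is the model's expected log(odds) of default, so that a fitted slope of 1 and intercept of 0 would mean no adjustment is needed. The function names and the plain Newton-Raphson fit are illustrative assumptions. This sketch estimates slope and intercept together rather than one after the other, and it will not converge if the data are perfectly separable.

```python
import math

def fit_logistic_1d(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson.

    x -- model log(odds) of default per account (assumed regressor)
    y -- observed 0/1 outcome per account
    Returns the intercept a and the slope b.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve the 2x2 system explicitly.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def recalibrated_pd(model_log_odds, a, b):
    """Apply the fitted slope and intercept to a model log(odds) value."""
    return 1.0 / (1.0 + math.exp(-(a + b * model_log_odds)))
```

A slope b away from 1 suggests the expected log(odds) line is too steep or too flat relative to the actual one; an intercept a away from 0 suggests an overall shift in level.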
Note that this "calibration" does not affect the scorecard characteristics
-- it just "fudges" the output score produced by them. So, I doubt this
will improve the predictive performance (i.e., discriminatory power, Gini)
of the model -- it will rather make the predictions somewhat more
realistic. If you think your model's predictive performance degraded "a
lot", you should consider redevelopment.
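To illustrate why a slope and intercept adjustment leaves discriminatory power unchanged, here is a small Python sketch. It computes Gini as 2*AUC - 1 from pairwise comparisons and shows that a linear transformation with a positive slope, which preserves the rank order of accounts, gives the same value. The data and the coefficients are made up for the example.

```python
def gini(risk, defaults):
    """Gini = 2*AUC - 1.

    AUC is the share of (bad, good) pairs in which the bad account has
    the higher predicted risk; ties count as half.
    """
    bads = [r for r, y in zip(risk, defaults) if y == 1]
    goods = [r for r, y in zip(risk, defaults) if y == 0]
    wins = 0.0
    for rb in bads:
        for rg in goods:
            if rb > rg:
                wins += 1.0
            elif rb == rg:
                wins += 0.5
    auc = wins / (len(bads) * len(goods))
    return 2.0 * auc - 1.0

# Made-up model log(odds) of default and observed outcomes.
model_log_odds = [-2.0, -1.0, 0.0, 1.0, 0.5]
observed = [0, 0, 1, 1, 0]

# Hypothetical calibration coefficients (the slope must be positive).
a, b = -0.7, 0.7
calibrated = [a + b * v for v in model_log_odds]

gini_before = gini(model_log_odds, observed)
gini_after = gini(calibrated, observed)
```

Both values come out the same because only the ordering of accounts enters the calculation, and the ordering is what the scorecard characteristics determine.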
HTH,
Bora Y.
------------------------------
Date: Thu, 18 Jan 2007 21:38:47 -0800
From: Alok <alok.rustagi@GMAIL.COM>
Subject: Calibration tests
Hi all
I am working on a probability of default calibration assignment. Let me
explain the entire problem:
We calculate probability of default using existing risk scores; that is,
we generate a probability of default corresponding to each score. The
higher the risk score, the lower the probability of default should be.
We use single-variable logistic regression to establish a relationship
between existing scores and probability of default.
The problem is that the model does not perform very well at either low
or high probabilities. The reason is that the population at these
end-point probabilities is very small compared to the population in the
middle score ranges.
I need to calibrate the model as well as test the model stability. Can
any of you help me in suggesting some techniques/approaches which might
help me in decisively saying that the model is not performing as
expected at lower and higher values? I would also appreciate some help
in calibrating the existing model (once it is shown to be
underperforming) so that its performance improves.
Thanks a lot in advance.
Alok
------------------------------