|Date: ||Sat, 20 Dec 2008 07:23:08 -0800|
|Reply-To: ||paul wilson <paulwilsn@YAHOO.COM>|
|Sender: ||"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>|
|From: ||paul wilson <paulwilsn@YAHOO.COM>|
|Subject: ||Proc Logistic - Individual Case Propensity Scoring and Predicted
|Content-Type: ||text/plain; charset=us-ascii|
My first goal is to fit a model with PROC LOGISTIC and then look at the CTABLE output to see the percent-correct values for every cutoff value of p.
Once I have found the best possible cutoff value (i.e. the one that classifies event=1 best), I'd like to create a dataset that contains the probabilities, odds ratios and predicted group membership for each case (i.e. each case will have a predicted value of either 1, meaning predicted death penalty, or 0, meaning predicted no death penalty), based on the best cutoff point found in CTABLE.
I'm guessing my code should be something like this:
proc logistic data = penalty;
model death (event='1') = blackd whitvic serious / rsq influence ctable lackfit;
output out=out1 predicted=p;
run;
The dataset "out1" is created, and "predicted=p" adds a column p to it that looks like the log-odds of death = 1.
Could someone please confirm these are indeed log odds?
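One way I could probably check this myself: as far as I know, the OUTPUT statement also accepts XBETA=, which saves the linear predictor (the log odds), while PREDICTED= saves the predicted probability of the event. A minimal sketch, assuming my penalty dataset and the same predictors (the variable name logit is just one I made up for illustration):

```sas
/* Sketch: request both the predicted probability and the linear predictor */
proc logistic data=penalty;
   model death(event='1') = blackd whitvic serious;
   output out=out1 predicted=p xbeta=logit;  /* logit is a hypothetical name */
run;

/* Compare the two columns side by side for the first few cases */
proc print data=out1(obs=10);
   var death p logit;
run;
```

If p always lies between 0 and 1 while logit takes negative and positive values, that would suggest p is a probability rather than log odds.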
Now my second step (and this is where I get confused) needs to produce a dataset that has the log odds, odds ratios and group membership appended to each case.
I thought PPROB is what I should use to classify cases into group 1 or 0 in the dataset, but it looks like that option only isolates one line of CTABLE in the output and nothing else (of course I may be wrong).
Anyway, I gave it a shot and was wondering what you think of this code for the second step:
group = p GT 0.6;
Would this do the trick?
p GT 0.6 is just a placeholder. Let's assume that after running PROC LOGISTIC I looked at my CTABLE and determined that I would classify the most cases correctly at a cutoff of 0.6.
I'm assuming that p in this dataset corresponds to p in CTABLE, but again I'm not sure it does, so I'd greatly appreciate someone confirming that.
I'm also not sure how to include odds ratios for every case in this dataset.
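To make the second step concrete, here is a minimal sketch of the DATA step I have in mind. It assumes that p in out1 is the predicted probability of death = 1 (which is the part I'd like confirmed), and the names out2, cutoff and odds are ones I made up. It computes per-case odds, p / (1 - p), rather than odds ratios, since as I understand it an odds ratio compares two cases or groups rather than describing a single case:

```sas
%let cutoff = 0.6;   /* placeholder; replace with the cutoff chosen from CTABLE */

data out2;           /* out2 is a hypothetical dataset name */
   set out1;
   /* In SAS a missing value compares as smaller than any number, */
   /* so guard against missing p being silently classified as 0.  */
   if missing(p) then group = .;
   else group = (p GT &cutoff);   /* 1 = predicted death penalty, 0 = predicted no death penalty */
   /* Per-case odds, assuming p is a probability strictly between 0 and 1 */
   if 0 < p < 1 then odds = p / (1 - p);
run;

/* Cross-tabulate observed outcome against predicted group */
proc freq data=out2;
   tables death*group;
run;
```

The PROC FREQ table at the end is only there so I can compare the resulting classification against the corresponding row of CTABLE.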
Thanks for your help in advance!!