Date: Thu, 23 Jun 2005 15:16:28 -0700
Reply-To: cassell.david@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: Concordant Discordant pair--What is it
In-Reply-To: <200506232119.j5NLJAl15812@cal1-1.us4.outblaze.com>
"Nick ." <ni14@MAIL.COM> wrote:
> I am now getting confused with the codes. David just listed a code
> which agrees with PROC LOGISTIC results. Another poster, Dan, caught
> the mistake!!! Smart of him to run my data through PROC LOGISTIC and
> catch this mistake. Something I should have done myself!!! Oh well.
>
> . . . . . .
>
> >>>First, I missed a typo in Ian's code. The '<' in the CASE
> >>>statement needs to be a '>'.
>
> >>>Second, the 'p' from the OUTPUT statement isn't the right
> >>>predicted value to be using. Is that what Nick is doing?
> >>>Oh dear.
>
> >>>Instead, use the SCORE statement and get the output from that.
> >>>You'll see when you print the output from the score statement
> >>>that you get a 'p' variable, a 'P_0' variable, and a 'P_1'
> >>>variable. P_x is the posterior probability for the normalized
> >>>response variable when it is equal to 'x'. So we want P_1.
>
> >>>Now run Ian's code like this:
>
>
> proc sql ;
>   create table q as
>   select
>     one.x as x1 , one.y as y1 , one.p_1 as pred1 ,
>     z.x as x0 , z.y as y0 , z.p_1 as pred0 ,
>     case
>       when one.p_1 > z.p_1 then 1
>       when one.p_1 = z.p_1 then 0
>       else -1
>     end as concordant
>   from out2 as one , out2 as z
>   where one.y = 1 and z.y = 0
>   ;
> quit;
>
> proc freq data = q ;
>   table concordant ;
> run ;
>
>
> Now you get a proc freq table with
>
> concordant    Frequency    Percent
>         -1           24      60.00
>          1           16      40.00
>
>
> This agrees with the SAS output.
>
>
>
>
> David,
> In your code above, where does p_1 come from? Do you mean
> "pred" (from my example data)?
> Where does "from out2" come from? If I call my example data "out2"
> then that's the "out2", right?
>
>
> I modified your code as follows (I deleted X since it is not used
> anywhere; only Y and PRED from my example data are used):
>
> proc sql ;
>   create table q as
>   select
>     one.y as y1 , one.pred as pred1 ,
>     z.y as y0 , z.pred as pred0 ,
>     case
>       when one.pred > z.pred then 1
>       when one.pred = z.pred then 0
>       else -1
>     end as concordant
>   from out2 as one , out2 as z
>   where one.y = 1 and z.y = 0
>   ;
> quit;
>
> proc freq data = q ;
>   table concordant ;
> run ;
>
>
> I don't get the correct results you are getting above, i.e.
>
> concordant    Frequency    Percent
>         -1           24      60.00
>          1           16      40.00
>
>
> I get the same wrong results as before.

[1] Don't use the OUTPUT statement. Use the SCORE
statement. I didn't notice the first time I looked
at your code, but your data set is apparently from an
OUTPUT statement with no extra variables requested.
That isn't what you want.

[2] I used the SCORE statement instead. You need to get
the posterior probability for level 1, which is automatically
labeled P_1 in the output from the SCORE statement.

Here's my code:

/* Refit the model and score the input data; out2 picks up P_0 and P_1 */
proc logistic data=w(drop=pred);
  model y = x ;
  score out=out2;
run;

proc print data=out2 noobs; title 'score '; run;

/* Pair every y=1 row with every y=0 row and compare their P_1 values */
proc sql ;
  create table q as
  select
    one.x as x1 , one.y as y1 , one.p_1 as pred1 ,
    z.x as x0 , z.y as y0 , z.p_1 as pred0 ,
    case
      when one.p_1 > z.p_1 then 1
      when one.p_1 = z.p_1 then 0
      else -1
    end as concordant
  from out2 as one , out2 as z
  where one.y = 1 and z.y = 0
  ;
quit;

/* Tabulate concordant (1), tied (0), and discordant (-1) pairs */
proc freq data = q ;
  table concordant ;
run ;

I think that now you can see how I got those posterior
probabilities, and why their names looked different in the
PROC SQL step.
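
If you would rather have the percentages and the c statistic in one
row instead of reading them off the PROC FREQ table, here is a rough,
untested sketch. It assumes the table q built by the PROC SQL step
above, with concordant coded 1 / 0 / -1. The column names
(pct_concordant, pct_discordant, pct_tied, c) are just names I made up
for illustration, and c is computed as (concordant + half the ties)
divided by the number of pairs.

```sas
/* Sketch only: summarize the pair table q into percentages and c.   */
/* In PROC SQL a comparison evaluates to 1 or 0, so summing it       */
/* counts the rows where the comparison is true.                     */
proc sql ;
  select
    100 * sum(concordant =  1) / count(*) as pct_concordant ,
    100 * sum(concordant = -1) / count(*) as pct_discordant ,
    100 * sum(concordant =  0) / count(*) as pct_tied ,
    ( sum(concordant = 1) + 0.5 * sum(concordant = 0) ) / count(*) as c
  from q
  ;
quit ;
```

With the 16 concordant and 24 discordant pairs shown earlier, that
works out to 40% concordant, 60% discordant, no ties, and c = 0.40.
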
HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician