Date:  Wed, 10 Sep 2003 10:20:58 -0700
Reply-To:  Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:  "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:  Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:  Re: PROC FREQ and KAPPA 

In-Reply-To:  <B2895F594E95D511A52100508BA36E6E0D9E44@EXCHANGESERVER>
Content-Type:  text/plain; charset=us-ascii

Rodney,

You can obtain a square matrix (that is, force the missing
response levels in each dimension to appear in your contingency
table) through a little trick: add a weight variable with
value 1 for all observed responses, then supplement the
observed data with manufactured observations constructed so
that every level appears in both dimensions of the table.
Give these manufactured observations a very small weight,
say 1E-12. Now run your PROC FREQ as before, but include a
WEIGHT statement. With suitably small weights on the
manufactured data, the kappa statistic is not materially
affected. I demonstrate using the data you had problems with.

data one;
   input x;
   cards;
1
2
3
.
4
5
;
data two;
   input y;
   cards;
1
2
3
.
5
6
;
data both;
   merge one two;
run;
/* Add weights and manufactured data */
data amended;
   set both end=lastrec;
   weight = 1;                  /* observed responses get weight 1 */
   output;
   if lastrec then do;          /* This is the manufactured data portion */
      do x=1 to 6;
         y=x;                   /* one diagonal cell per level 1-6 */
         weight = 1E-12;        /* negligible weight */
         output;
      end;
   end;
run;
proc freq data=amended;
   weight weight;
   tables x * y / missing agree ;
run;
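
A possible variation on the same idea, offered only as a sketch:
the WEIGHT statement in PROC FREQ has a ZEROS option that keeps
observations with zero weight in the table. Assuming your SAS
release supports it (check the PROC FREQ documentation for your
version), the manufactured rows can carry a weight of exactly 0,
so no small-weight approximation is involved. The data set name
AMENDED0 is an illustrative choice and is not part of the code
above.

```sas
/* Sketch: same manufactured rows as before, but with weight 0.     */
/* AMENDED0 is a hypothetical data set name.                        */
data amended0;
   set both end=lastrec;
   weight = 1;                  /* observed responses */
   output;
   if lastrec then do x = 1 to 6;
      y = x;
      weight = 0;               /* contributes nothing to the counts */
      output;
   end;
run;

proc freq data=amended0;
   weight weight / zeros;       /* keep zero-weight levels in the table */
   tables x * y / missing agree;
run;
```

Without the ZEROS option PROC FREQ drops zero-weight observations,
and the table would revert to the original levels.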

Note that one can perform a sensitivity analysis by modifying
the weight given to the manufactured data. The value of kappa
remains the same (as do the standard errors and asymptotic
confidence limits) for manufactured-data weights as large as
1E-4. When that weight is increased to 1E-3, kappa increases
(albeit by a very small amount). The sensitivity analysis thus
gives assurance that including the manufactured data did not
affect the kappa computation, except that it allowed the
contingency table to be properly constructed with a row for
X=6 and a column for Y=4. As a result, the combinations
X=4, Y=5 and X=5, Y=6 are moved off the main diagonal, so
kappa is no longer 1.
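
One way the sensitivity analysis could be run in a single pass is
sketched below: stack one copy of the data per candidate weight and
let BY-group processing produce one kappa per weight. The data set
name SENS, the variable W, and the particular list of weights are
illustrative choices, not part of the code above.

```sas
/* Sketch: one copy of BOTH plus manufactured rows per weight W.    */
/* SENS and W are hypothetical names; the weight list is arbitrary. */
data sens;
   do w = 1E-12, 1E-8, 1E-4, 1E-3;       /* ascending, so BY W works */
      do i = 1 to nobs;
         set both point=i nobs=nobs;     /* reread the observed data */
         weight = 1;
         output;
      end;
      do x = 1 to 6;                     /* manufactured diagonal rows */
         y = x;
         weight = w;
         output;
      end;
   end;
   stop;                                 /* required with POINT= */
run;

proc freq data=sens;
   by w;
   weight weight;
   tables x * y / missing agree;
run;
```

As a rough check on what to expect: if the missing level counts as
a category (as the MISSING option requests) and the tiny manufactured
weights are neglected, 4 of the 6 observed pairs lie on the diagonal,
so observed agreement is 4/6, and chance agreement from the margins
is 5/36. That gives kappa = (4/6 - 5/36) / (1 - 5/36) = 19/31, or
about 0.61, which the smallest weights should reproduce closely.
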
Dale
rpresley <rpresley@GMCF.ORG> wrote:
> SAS-Lers,
>
> I am confused by the results of PROC FREQ and the KAPPA option.
>
> data one;
> input x;
> cards;
> 1
> 2
> 3
> .
> 5
> 5
> ;
> %runn;
> data two;
> input y;
> cards;
> 1
> 2
> 3
> .
> 5
> 5
> ;
> %runn;
> data both;
> merge one two;
> %runn;
> proc freq data=both;
> tables x * y / missing agree ;
> %runn;
>
>
> As expected, the Kappa value in this instance is 1.0. But the Kappa
> value is also 1.0 for the following instance.
>
>
> data one;
> input x;
> cards;
> 1
> 2
> 3
> .
> 4
> 5
> ;
> %runn;
> data two;
> input y;
> cards;
> 1
> 2
> 3
> .
> 5
> 6
> ;
> %runn;
> data both;
> merge one two;
> %runn;
> proc freq data=both;
> tables x * y / missing agree ;
> %runn;
>
>
> It is true that the cross-tabulation table produced by both instances
> is square. But clearly there is not perfect agreement between x and y
> in the second instance. This problem might be resolved if there were a
> way to force SAS to include ALL the values of X and Y as levels in the
> column and row dimensions.
>
> Any suggestions would be appreciated.
>
> Rodney
>
> Rodney J. Presley, PhD
> Director of Data Analysis
> Georgia Medical Care Foundation
> 1455 Lincoln Parkway
> suite 800
> Atlanta, GA 30346
>
> 678-527-3474
> 678-527-3574 fax
>
> rpresley@gmcf.org
=====

Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977

