Date: Wed, 10 Sep 2003 10:20:58 -0700
Reply-To: Dale McLerran
Sender: "SAS(r) Discussion"
From: Dale McLerran
Subject: Re: PROC FREQ and KAPPA
To: rpresley
Content-Type: text/plain; charset=us-ascii

Rodney,

You can obtain a square matrix (force the missing response levels in each dimension to appear in your contingency table) through a little trick. The trick is this: add a weight variable with value 1 for all observed responses. Supplement the observed data with some data which you have constructed such that the entire table gets completed in both dimensions. For this supplemental data, give the observations a very small weight, say 1E-12. Now, run your PROC FREQ as before, but include a weight statement. For suitably small weights for the amended (manufactured) data, you will not affect the kappa statistic. I demonstrate using the data you have problems with.

data one;
  input x;
cards;
1
2
3
.
4
5
;
data two;
  input y;
cards;
1
2
3
.
5
6
;
data both;
  merge one two;
run;

/* Add weights and manufactured data */
data amended;
  set both end=lastrec;
  weight = 1;
  output;
  if lastrec then do;
    /* This is the manufactured data portion */
    do x=1 to 6;
      y=x;
      weight = 1E-12;
      output;
    end;
  end;
run;

proc freq data=amended;
  weight weight;
  tables x * y / missing agree;
run;
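As a hand check on what this run should report, simple kappa can be worked out directly from the six observed pairs. This is a sketch under two assumptions: that the MISSING option makes the missing value an ordinary level in the statistics, and that the 1E-12 weights contribute nothing visible. Four of the six pairs agree (missing, 1, 2, 3), and five levels (missing, 1, 2, 3, 5) have a marginal proportion of 1/6 in both X and Y, while levels 4 and 6 have a zero marginal on one side.

```latex
\[
P_o = \frac{4}{6}, \qquad
P_e = \sum_i p_{i\cdot}\, p_{\cdot i}
    = 5 \cdot \frac{1}{6} \cdot \frac{1}{6} = \frac{5}{36},
\]
\[
\hat\kappa = \frac{P_o - P_e}{1 - P_e}
           = \frac{24/36 - 5/36}{31/36}
           = \frac{19}{31} \approx 0.613 .
\]
```

If the printed kappa differs noticeably from this value, one of the two assumptions above is probably not holding for your release of SAS.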

Note that one can perform a sensitivity analysis by modifying the value of the weight variable for the manufactured data. The value of kappa remains the same (as do the standard errors and asymptotic confidence limits) if the weight value for the manufactured data is as large as 1E-4. When the weight variable increases to 1E-3 in the manufactured data, then kappa increases (albeit by a very small amount). From the sensitivity analysis, one may be assured that the inclusion of the manufactured data did not affect the kappa computation, except in that it allowed the contingency table to be properly constructed with a row having value X=6 and a column with value Y=4. The combinations X=4, Y=5 and X=5, Y=6 were moved off the main diagonal so that kappa is no longer 1.
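The sensitivity analysis can be run in one pass instead of editing the weight by hand each time. The following is only a sketch: the data set names SENS and KAPPA_BY_WEIGHT and the variable MFG_WEIGHT are made up for illustration, the list of weights is arbitrary, and it assumes the OUTPUT statement with the KAPPA option writes the estimate, its standard error, and the confidence limits to the output data set.

```sas
/* Build one stacked data set: for each candidate weight, a full  */
/* copy of BOTH (weight 1) plus the manufactured diagonal rows.   */
/* SENS and MFG_WEIGHT are hypothetical names.                    */
data sens;
  do mfg_weight = 1E-12, 1E-8, 1E-4, 1E-3;   /* ascending, so BY works */
    do i = 1 to nobs;
      set both point=i nobs=nobs;            /* reread every observed pair */
      weight = 1;
      output;
    end;
    do x = 1 to 6;                           /* manufactured data portion */
      y = x;
      weight = mfg_weight;
      output;
    end;
  end;
  stop;                                      /* required with POINT= */
run;

/* One kappa per manufactured-data weight */
proc freq data=sens noprint;
  by mfg_weight;
  weight weight;
  tables x * y / missing agree;
  output out=kappa_by_weight kappa;
run;

proc print data=kappa_by_weight;
run;
```

Reading down the printed rows shows at which weight the kappa estimate first starts to move.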

Dale

--- rpresley <rpresley@GMCF.ORG> wrote:
> SAS-Lers,
>
> I am confused by the results of PROC FREQ and the KAPPA option.
>
> data one;
> input x;
> cards;
> 1
> 2
> 3
> .
> 5
> 5
> ;
> %runn;
> data two;
> input y;
> cards;
> 1
> 2
> 3
> .
> 5
> 5
> ;
> %runn;
> data both;
> merge one two;
> %runn;
> proc freq data=both;
> tables x * y / missing agree ;
> %runn;
>
>
> As expected the Kappa value in this instance is 1.0. But the Kappa
> value is also 1.0 for the following instance.
>
>
> data one;
> input x;
> cards;
> 1
> 2
> 3
> .
> 4
> 5
> ;
> %runn;
> data two;
> input y;
> cards;
> 1
> 2
> 3
> .
> 5
> 6
> ;
> %runn;
> data both;
> merge one two;
> %runn;
> proc freq data=both;
> tables x * y / missing agree ;
> %runn;
>
>
> It is true that the cross tabulation table produced by both instances
> is square. But clearly there is not perfect agreement between x and y
> in the second instance. This problem might be resolved if there were a
> way to force SAS to include ALL the values of X and Y as levels in the
> column and row dimensions.
>
> Any suggestions would be appreciated.
>
> Rodney
>
> Rodney J. Presley, PhD
> Director of Data Analysis
> Georgia Medical Care Foundation
> 1455 Lincoln Parkway
> suite 800
> Atlanta, GA 30346
>
> 678-527-3474
> 678-527-3574 fax
>
> rpresley@gmcf.org

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
