LISTSERV at the University of Georgia
Date:         Sat, 27 Dec 1997 18:52:41 GMT
Reply-To:     Richard F Ulrich <wpilib+@PITT.EDU>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From:         Richard F Ulrich <wpilib+@PITT.EDU>
Organization: University of Pittsburgh
Subject:      Re: Test-retest Reliability: intraclass coefficient

Randolph Stephenson wrote:

: Regarding the test-retest reliability, my problem is similar to Mary
: Tullu's posting. However, I learned that I could do an intraclass
: correlation to evaluate the test-retest reliability. I read that the rho
: coefficient is an intraclass reliability. I know that David Nichols put
: an intraclass reliability macro available on the SPSS site; I got it. I
: have SPSS 7.5.1 for PC. I have a hard time implementing his macro.

: My data is very similar to the following:

: Participants  Time  Q1  Q2  Q3  ...  Q17
: 001            1     4   5   4        4
: 002            1     3   4   3        3
: 001            2     4   4   5        3
: etc.

: A repeated measures ANOVA seems in order, but how to implement it? I
: think that the rho to use is the mixed-model one, since Q1 to Q17 are
: fixed and the Participants are random. << snip, the rest >>

- Convention says that we use slightly different terminology for the two cases that you are dealing with here, even though the math overlaps greatly.

"Internal reliability" of a scale is often measured by Cronbach's coefficient alpha. It is relevant when you will compute a total score and want to know its reliability, based on no other rating. The "reliability" is *estimated* from the average inter-item correlation and from the number of items, since a longer scale will (presumably) be more reliable. Whether the items have the same means is not usually important.
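For illustration (in Python rather than SPSS syntax), here is a minimal sketch of coefficient alpha from the item-variance form of the formula; the score matrix is hypothetical, not from the poster's data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_subjects, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical scores: 4 subjects on 3 items
scores = [[4, 5, 4],
          [3, 4, 3],
          [4, 4, 5],
          [2, 3, 2]]
alpha = cronbach_alpha(scores)
```

Note how the number of items k enters directly: holding the average correlation fixed, a longer scale pushes alpha up, which is the point made above.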

For "inter-rater" reliability, by contrast, one distinction is that the importance lies with the reliability of the single rating.

For examining your own data, I think you cannot do better than looking at the paired t-test and the Pearson correlation between each pair of raters - the t-test tells you whether the means differ, while the correlation tells you whether the judgments are otherwise consistent.
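A minimal sketch of that pair of checks, again in Python with hypothetical time-1/time-2 ratings (not the poster's data):

```python
import numpy as np

# Hypothetical scores for the same six participants at time 1 and time 2.
time1 = np.array([4.0, 3.0, 5.0, 2.0, 4.0, 3.0])
time2 = np.array([4.0, 4.0, 5.0, 3.0, 4.0, 2.0])

# Paired t statistic: do the means differ between the two occasions?
d = time1 - time2
n = d.size
t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# Pearson correlation: are the ratings otherwise consistent?
r = np.corrcoef(time1, time2)[0, 1]
```

(scipy.stats.ttest_rel and scipy.stats.pearsonr give the same statistics along with p-values.)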

Unlike the Pearson, the "intra-class" correlation assumes that the raters have the same mean. It is not bad as an overall summary, and it is precisely what some editors want to see presented for reliability across raters. It is both a plus and a minus that there are *several* different formulas for intraclass correlation, depending on whose reliability is being estimated.
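As one instance of those several formulas, here is a sketch of the one-way random-effects single-rating version, ICC(1,1) in Shrout and Fleiss's notation, built from the between- and within-target mean squares (this is an illustration in Python, not David Nichols's SPSS macro):

```python
import numpy as np

def icc_oneway(ratings):
    """ICC(1,1): one-way random-effects, single rating.
    ratings: (n_targets, k_raters) matrix of scores."""
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    target_means = ratings.mean(axis=1)
    # Between-targets mean square
    msb = k * ((target_means - grand) ** 2).sum() / (n - 1)
    # Within-targets mean square
    msw = ((ratings - target_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

Because rater mean differences land in the within-target mean square, a constant offset between raters pulls this ICC down - exactly the "assumes the raters have the same mean" behavior described above, and exactly where the several formulas diverge.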

For purposes such as planning the power for a proposed study, it does matter whether the raters to be used will be exactly the same individuals.

Hope this helps.

Rich Ulrich, biostatistician, Univ. of Pittsburgh
