LISTSERV at the University of Georgia
Date:   Thu, 5 Aug 1999 16:46:14 +0100
Reply-To:   tra <tra@proteus.co.uk>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   tra <tra@PROTEUS.CO.UK>
Organization:   Proteus Molecular Design Ltd
Subject:   Re: A Challenging Problem [2]
Comments:   To: SAS-L@LISTSERV.VT.EDU
Content-Type:   text/plain; charset=us-ascii

John,

what an interesting problem.

I have been worrying about my previous solution. It will give the wrong answers for some data, because the same period of observation for rater 1 may match more than one period for rater 2 (after allowing the 1-second leeway).
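To make the double counting concrete, here is a small sketch in Python (used purely for illustration, not SAS). The earlier solution is not reproduced in this message, so the sketch assumes it paired same-key periods whenever they overlap after widening by one second; the function name `overlap_pairs` is hypothetical. The periods are the key 'X' records for id 2003 in the test data.

```python
# Key 'X' periods for id 2003 as (start, end) in seconds.
rater1 = [(17, 25)]
rater2 = [(20, 21), (21, 22)]


def overlap_pairs(a_periods, b_periods, leeway=1):
    """Pair each period in a_periods with every b period it overlaps,
    after widening the b period by `leeway` seconds on both sides.
    (Assumed form of an interval join, not the original SQL.)"""
    pairs = []
    for a_start, a_end in a_periods:
        for b_start, b_end in b_periods:
            if a_start < b_end + leeway and b_start - leeway < a_end:
                pairs.append(((a_start, a_end), (b_start, b_end)))
    return pairs


pairs = overlap_pairs(rater1, rater2)

# The single rater 1 period appears once per matching rater 2 period,
# so summing its duration over the joined rows counts it more than once.
joined_duration = sum(a_end - a_start for (a_start, a_end), _ in pairs)
true_duration = sum(end - start for start, end in rater1)
print(len(pairs), joined_duration, true_duration)
```

Any statistic summed over the joined rows would inflate rater 1's observed time in this case, which is the kind of error described above.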

I cannot see an easy and complete way to fix this up in SQL using start-end intervals.

If you can assume that all the start-end times are integers, then a simpler approach is possible (but at the cost of more computation).
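As a rough illustration of what working with integer times buys, the sketch below (Python, for illustration only) expands one start-end record into the whole seconds it covers. The function name `key_seconds` and its parameters are hypothetical; the convention that a duration record covers start to end-1 while an event record covers its single second is an assumption taken from the way the sample data is laid out.

```python
def key_seconds(start, end, is_event=False, leeway=0):
    """List the whole seconds covered by one record.

    A duration record covers start .. end-1; an event record
    (start == end) covers its single second. `leeway` widens the
    range by that many seconds on each side.
    """
    last = end - 1 + (1 if is_event else 0)
    return list(range(start - leeway, last + leeway + 1))


print(key_seconds(4, 10))             # duration key 'V' from 4 to 10
print(key_seconds(17, 17, True))      # event key 'P' at 17
print(key_seconds(4, 10, leeway=1))   # the same 'V' record with leeway
```

The extra computation comes from the record count: a period of n seconds becomes n rows (n + 2 with the leeway), so long observations expand considerably.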

Here is my second attempt at a solution.

data test;
  length id code 8 key $ 1 start end 8;
  input id code key start end;
  datalines;
2001 1 V 04 10
2001 1 B 10 15
2001 1 V 15 16
2001 1 N 16 30
2001 1 P 17 17
2001 1 V 30 35
2001 1 \ 35 35
2002 1 V 10 15
2002 1 B 15 20
2002 1 P 17 17
2002 1 V 20 35
2002 1 \ 35 35
2001 2 V 05 10
2001 2 B 10 16
2001 2 V 16 17
2001 2 N 17 30
2001 2 P 18 18
2001 2 V 30 36
2001 2 \ 36 36
2002 2 V 9 15
2002 2 B 15 21
2002 2 P 17 17
2002 2 V 21 37
2002 2 \ 37 37
2003 1 X 17 25
2003 1 \ 25 25
2003 2 Y 18 20
2003 2 X 20 21
2003 2 X 21 22
2003 2 \ 22 22
2004 2 X 17 25
2004 2 \ 25 25
2004 1 Z 18 20
2004 1 X 20 21
2004 1 X 21 22
2004 1 \ 22 22
;
%print;

/* discretised solution */
/* this is only sensible if all start and end times are integers */
/* again assume that 'P' is an 'event' */

/* expand data to a record for each key-second */
data tdisc;
  set test;
  drop start end;
  if key ne '\';
  do time = start to end-1+(key in ('P'));
    output;
  end;
run;

/* same as above, but with 1-second leeway */
data tdisc1;
  set test;
  drop start end;
  if key ne '\';
  do time = start-1 to end+(key in ('P'));
    output;
  end;
run;

proc sql;

  /* matchdsc - key-seconds in tdisc for which there is a match in tdisc1 */
  create table matchdsc as
  select a.id, a.key, a.time
  from tdisc a, tdisc1 b
  where a.code = 1 and b.code = 2
    and a.id = b.id
    and a.key = b.key
    and a.time = b.time
  group by a.id, a.key, a.time
  having count(*) > 0;

  /* commndsc - key-seconds in tdisc for which there is a near match in tdisc1
     near matches do NOT have to have the same key */
  create table commndsc as
  select a.id, a.key, a.time
  from tdisc a, tdisc1 b
  where a.code = 1 and b.code = 2
    and a.id = b.id
    and a.time = b.time
  group by a.id, a.key, a.time
  having count(*) > 0;

  /* mis-matches */
  create table mismatch as
  select * from commndsc
  except
  select * from matchdsc;

  /* collect statistics */
  create table kappa as
  select a.key,
         a.count as duration,
         max(b.count,0) as durmatch,
         calculated durmatch/calculated duration as kappa
  from ( select key, count(time) as count
         from commndsc
         group by key ) a
  left join
       ( select key, count(time) as count
         from matchdsc
         group by key ) b
  on a.key = b.key
  ;

  select * from kappa;

quit;
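For readers who want to check the logic by hand, here is a minimal Python sketch of the same counting, restricted to the id 2003 records from the test data. It is an illustration under assumptions, not a translation validated against SAS output: the names `expand`, `common`, `matched`, `duration` and `durmatch` are chosen to echo the tables above, and sets stand in for the `group by` de-duplication.

```python
from collections import Counter

# (id, code, key, start, end); code 1 = primary rater, 2 = rely rater.
records = [
    (2003, 1, "X", 17, 25),
    (2003, 1, "\\", 25, 25),
    (2003, 2, "Y", 18, 20),
    (2003, 2, "X", 20, 21),
    (2003, 2, "X", 21, 22),
    (2003, 2, "\\", 22, 22),
]
EVENT_KEYS = {"P"}  # keys that are 'on' for a single second


def expand(rows, code, leeway):
    """Return the set of (id, key, second) covered by one rater's records,
    widened by `leeway` seconds on each side; end-of-file rows are skipped."""
    out = set()
    for rid, rcode, key, start, end in rows:
        if rcode != code or key == "\\":
            continue
        last = end - 1 + (1 if key in EVENT_KEYS else 0) + leeway
        for second in range(start - leeway, last + 1):
            out.add((rid, key, second))
    return out


primary = expand(records, code=1, leeway=0)   # plays the role of tdisc
rely = expand(records, code=2, leeway=1)      # plays the role of tdisc1
rely_times = {(rid, second) for rid, _, second in rely}

# Primary key-seconds where rater 2 recorded anything (any key).
common = {r for r in primary if (r[0], r[2]) in rely_times}
# Primary key-seconds where rater 2 recorded the same key.
matched = primary & rely

duration = Counter(key for _, key, _ in common)
durmatch = Counter(key for _, key, _ in matched)
for key in sorted(duration):
    print(key, duration[key], durmatch[key], durmatch[key] / duration[key])
```

For these records, rater 1's 'X' seconds 17 and 18 fall only within rater 2's widened 'Y' period, so they count towards the duration but not towards the matches.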

JGerstle@SW.UA.EDU wrote:

> Greetings and Salutations All
> I posted a question last week about sequential analyses via
> SAS and received some responses that I could use SAS/ETS.
> Thank you for the info. Unfortunately, we do not have this module.
>
> As my subject line indicates, I have somewhat a challenge to
> any of you that have the time to come up with some sort of plan to
> address the problem I'm going to relate. First, I should tell you that
> I am using SAS 6.12 for Windows 95. What I'm looking for are
> ideas and clues that will, hopefully, lead me to solve my problem.
> OK, onto the actual query.
>
> I have a dataset (shown at the bottom) that was put together
> from several flat files created via a BASICA program that we use to
> do our observational data collection on residents in nursing homes.
> The program, in a nutshell, asks for certain header info (name, id,
> date, etc..) and then starts recording, using the internal clock of
> the laptop, the start and stop times, in seconds, of several different
> behaviors which are represented by various keys on the keyboard
> (i.e. 'V' for disruptive behavior, 'B' for talking to self, 'N' for talking to
> another resident, etc.).
> I wrote a SAS program (I will send a copy for any that are
> interested personally) that will read in the hundreds of flat files
> containing this info and separate the header and data portions of
> the files into separate datasets. Then I can simply do the analyses
> I need to do (like keypercents). The problem I need to address is a
> way to calculate reliability kappas for a pair of primary and rely flat
> files. The procedure we have now uses a couple of Pascal
> programs, but the composer of these programs does not work with
> us anymore and we have the need to modify how we calculate our
> kappas.
> Now some of our keys are event keys, only 'on' for a second,
> while the rest are duration keys, 'on' for several seconds. We want
> to give a one second window on either side of both types of keys
> so if one of the raters is off by a second with the onset of a key,
> the kappa program will take this into account and not discount the
> lost second.
>
> Here's a sample dataset with variable names ID, Primary/Rely
> Code (1 for primary, 2 for rely), KEY, START time, END time:
> (Keys V, B, & N are duration and key P is event, / is used as
> end of file). The length of the file (the total number of seconds) and
> the number of lines of data are the last two lines of the header,
> which can be merged with the data and used.
>
> 2001 1 V 04 10
> 2001 1 B 10 15
> 2001 1 V 15 16
> 2001 1 N 16 30
> 2001 1 P 17 17
> 2001 1 V 30 35
> 2001 1 \ 35 35
> 2002 1 V 10 15
> 2002 1 B 15 20
> 2002 1 P 17 17
> 2002 1 V 20 35
> 2002 1 \ 35 35
> 2001 2 V 05 10
> 2001 2 B 10 16
> 2001 2 V 16 17
> 2001 2 N 17 30
> 2001 2 P 18 18
> 2001 2 V 30 36
> 2001 2 \ 36 36
> 2002 2 V 9 15
> 2002 2 B 15 21
> 2002 2 P 17 17
> 2002 2 V 21 37
> 2002 2 \ 37 37
>
> etc......
>
> Thank you much for any ideas and leads that you may come up
> with...
>
> John Gerstle
> Program Analyst, Sr.
> Applied Gerontology Program
> University of Alabama

--
T R Auton PhD MSc C.Math
Head of Biomedical Statistics
Proteus Molecular Design Ltd
Beechfield House
Lyme Green Business Park
Macclesfield
Cheshire SK11 0JL
UK
email: tra@proteus.co.uk

