Date: Thu, 5 Aug 1999 16:46:14 +0100 tra "SAS(r) Discussion" tra Proteus Molecular Design Ltd Re: A Challenging Problem [2] To: SAS-L@LISTSERV.VT.EDU text/plain; charset=us-ascii

John,

what an interesting problem.

I have been worrying about my previous solution. It will give the wrong answers for some data because the same period of observation for rate 1 may match more than one periods for rater 2 (after allowing the 1 second leeway).

I cannot see an easy and complete way to fix this up in sql, using start-end intervals.

If you can assume that all the start-end times are integers, then a simpler approach is possible (but at the cost of more computation).

Here is my second attept at a solution.

data test; length id code 8 key \$ 1 start end 8; input id code key start end; datalines; 2001 1 V 04 10 2001 1 B 10 15 2001 1 V 15 16 2001 1 N 16 30 2001 1 P 17 17 2001 1 V 30 35 2001 1 \ 35 35 2002 1 V 10 15 2002 1 B 15 20 2002 1 P 17 17 2002 1 V 20 35 2002 1 \ 35 35 2001 2 V 05 10 2001 2 B 10 16 2001 2 V 16 17 2001 2 N 17 30 2001 2 P 18 18 2001 2 V 30 36 2001 2 \ 36 36 2002 2 V 9 15 2002 2 B 15 21 2002 2 P 17 17 2002 2 V 21 37 2002 2 \ 37 37 2003 1 X 17 25 2003 1 \ 25 25 2003 2 Y 18 20 2003 2 X 20 21 2003 2 X 21 22 2003 2 \ 22 22 2004 2 X 17 25 2004 2 \ 25 25 2004 1 Z 18 20 2004 1 X 20 21 2004 1 X 21 22 2004 1 \ 22 22 ; %print; /* discretised solution */ /* this is only sensible if all start and end times are integers */ /* again assume that 'P' is an 'event' */

/* expand data to a record for each key-second */ data tdisc; set test; drop start end; if key ne '\'; do time = start to end-1+(key in ('P')); output; end; run; /* same as above, but with 1-second leeway */ data tdisc1; set test; drop start end; if key ne '\'; do time = start-1 to end+(key in ('P')); output; end; run; proc sql; /* matchdsc - key-seconds in tdisc for which there is a match in tdisc1 */ create table matchdsc as select a.id, a.key, a.time from tdisc a, tdisc1 b where a.code = 1 and b.code = 2 and a.id = b.id and a.key = b.key and a.time = b.time group by a.id, a.key, a.time having count(*) > 0; /* commndsc - key-seconds in tdisc for which ther is a near match in tdisc1 near matches do NOT have to have the same key */ create table commndsc as select a.id, a.key, a.time from tdisc a, tdisc1 b where a.code = 1 and b.code = 2 and a.id = b.id and a.time = b.time group by a.id, a.key, a.time having count(*) > 0; /* mis-matches */ create table mismatch as select * from commndsc except select * from matchdsc; /* collect statistics */ create table kappa as select a.key, a.count as duration, max(b.count,0) as durmatch, calculated durmatch/calculated duration as kappa from ( select key, count(time) as count from commndsc group by key ) a left join ( select key, count(time) as count from matchdsc group by key ) b on a.key = b.key ; select * from kappa; quit;

JGerstle@SW.UA.EDU wrote: