Date: Wed, 29 May 2002 17:00:42 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: merge multiple observations into a single observation w/o ove
rinflating matches
Content-Type: text/plain; charset="iso-8859-1"
I have heard that even the SAS Masters have to struggle with problems of
coalescing data records! Even so, if you can declare a solution as a logical
statement, the SAS SQL genie can implement it.
Bet you would find it easier to declare a solution if you could match up
unique ACTCODE=68 records to unique ACTCODE=64 records. Let SQL grant that
wish:
Create view ACT68vw as select distinct ID,partresp,evtype,actcode,finalact
from A
where actcode=68
;
Create view ACT64vw as select distinct ID,parttrig,evtype,actcode from A
where actcode=64
;
Now you can link up the two virtual tables:
Data A;
input
ID PARTTRIG: $char2. PARTRESP: $char2. EVTYPE: $char4. ACTCODE
FINALACT
;
cards;
304 T1 XX DLSM 64 .
304 JD XX FPRO 64 .
304 YY T1 DLSM 68 1
304 YY JD FPRO 68 2
1832 D1 XX FPRO 64 .
1832 D1 XX FPRO 64 .
1832 YY D1 FPRO 68 .
1832 YY D1 FPRO 68 2
1832 YY P2 SMJD 68 1
;
run;
proc sql;
Create vie ACT68vw as select distinct ID,partresp,evtype,actcode,finalact
from A
where actcode=68
;
Create view ACT64vw as select distinct ID,parttrig,evtype,actcode from A
where actcode=64
;
Create table Answer as
select t1.*,t2.parttrig,t2.evtype as evtype64,t2.actcode as actcode64
from ACT68vw as t1 left join ACT64vw as t2
on t1.ID=t2.ID and t1.partresp=t2.parttrig and t1.evtype=t2.evtype
;
quit;
The 'left join' form of 'merge' matches up the ACT64vw records that do in
fact match ACT68vw records and fills in data elements from both. It includes
as well the ACT68vw records that don't match any record in ACT64vw on
ID,part*, and evtype, and leaves the ACT64vw data elements missing (in SAS
or NULL in other systems). The rest of the final query amounts to a simple
restatement of what you asked for in SQL syntax.
Sig
-----Original Message-----
From: Shannon [mailto:swheatma@FJC.GOV]
Sent: Wednesday, May 29, 2002 3:37 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: merge multiple observations into a single observation w/o
overinflating matches
Greetings- I am wondering if anyone has any thoughts about how to
merge multiple observations into a single observation without
overinflating the true number of matches. Below is an example of a
dataset in its current form (DATASET A) and the desired result
(DATASET B).
DATASET A:
ID PARTTRIG PARTRESP EVTYPE ACTCODE
FINALACT
304 T1 DLSM 64 .
304 JD FPRO 64 .
304 T1 DLSM 68 1
304 JD FPRO 68 2
1832 D1 FPRO 64 .
1832 D1 FPRO 64 .
1832 D1 FPRO 68 .
1832 D1 FPRO 68 2
1832 P2 SMJD 68 1
The goal is to match the records with ACTCODE=68 to the corresponding
records with ACTCODE=64 by unique ID group, EVTYPE and where
PARTTRIG=PARTRESP. The thing to avoid is multiple matches that result
when records with similar matching criteria are merged using a MERGE
statement. I think the key is to use SQL and perhaps set up a flag
that prevents a record that is matched once from being matched a
second time with each iteration until all possible match combinations
are attained. An example of the desired result would be:
DATASET B:
ID PARTTRIG PARTRESP EVTYPE ACTCODE FINALACT
304 T1 T1 DLSM 68 1
304 JD JD FPRO 68 2
1832 D1 D1 FPRO 68 .
1832 D1 D1 FPRO 68 2
1832 P2 SMJD 68 1