Date: Wed, 6 Sep 2006 14:24:04 -0700
Reply-To: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Subject: Re: Need help generating a count matrix
In-Reply-To: <1157575891.472788.105960@i3g2000cwc.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"
Clifford -
1) Make a recursive Cartesian join on ID=ID, renaming the second Contact
(to Contact2 or something)
2) Unduplicate (if you want to count unique contact pairs for each ID)
by ID, Contact, and Contact2 using a sort or in your SQL join.
3) Use Proc freq or summary (with nway) to generate the cells (probably
is possible also with SQL - I'm not an SQL head).
You can also do this in Excel if your data isn't too large.
hth
Paul Choate
DDS Data Extraction
(916) 654-2160
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
clifford
Sent: Wednesday, September 06, 2006 1:52 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Need help generating a count matrix
Hello all,
I would like to ask if anyone knows a smart way to create a matrix
which counts the number of unique observations that take on the various
values of a single categorical variable.
A simplified example: I have a dataset containing the records of
contacts to a store and the method of contact. Each instant of contact
is an observation in the dataset and is linked to an ID #. Every
contact is assigned a value (1, 2, or 3) designated the method of
contact, under a field called Contact. Therefore each ID can have
multiple records if the person initiated contact more than once.
I need to create a matrix that counts the number of unique IDs having
each value of the variable "Contact", such as this:
Contact 1 2 3
1 x a a
2 b x b
3 c c x
Where:
Contact is a categorical variable taking on values 1, 2, 3 which
indicates Method of Contact
x is the total number of unique ID's associated with Contact=1 (2 /
3)
a is the number of unique IDs with Contact=1 that also have Contact=2
/ number of IDs with Contact=1 that also have Contact=3
b is the number of unique IDs with Contact=2 that also have Contact=1
/ number of IDs with Contact=2 that also have Contact=3
etc.
Does anyone know of a slick way to program this into SAS? I would
greatly appreciate any input.
Thank you!