Date: Thu, 23 Jun 2005 15:24:16 -0700
Reply-To: cassell.david@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: Concordant Discordant pair--What is it
In-Reply-To: <200506232043.j5NKhEU12096@cal1-1.us4.outblaze.com>
Content-type: text/plain; charset=US-ASCII
"Nick ." <ni14@MAIL.COM> wrote back:
> I was looking at Ian's and Chang's code more closely and both work
> fine except for one big problem, which was perhaps my fault. Or it
> may not be a problem at all, but here goes:
>
> In the example below I only made up a few pairs, i.e. I only put
> down like 13 observations. In my real data set, I have close to 1
> million observations, 970K of which have a y = 0 (RESPONSE = 0 No,
> non-event) and about 23K have a y = 1 (RESPONSE = 1 Yes event).
>
> I think Ian's code performs a cartesian product, so I used Chang's
> code, which uses only data steps. When I multiply 970K * 23K I get
> about 22 billion total pairs!!! My computer (Solaris box) is running
> out of space. I hope these codes (Ian's and Chang's) don't write out
> 22 billion records anywhere!!!
Well, there's no getting around the fact that you need to do
22 billion pairwise comparisons, and the join may need to look at
significantly more records than that in order to find all those pairs.

One thing you could do is re-write the PROC SQL statement so that
the 22 billion records are only totaled (as the number of -1, 0, and 1
occurrences), instead of written to the data set q. The drive space
and RAM will still be taxed as all 22 billion records run through your
machine.
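
A rough, untested sketch of that rewrite, assuming the same table w and
the columns y and pred used in the quoted code at the bottom of this
message. The output table name pair_counts and its column names are made
up for illustration:

```sas
proc sql ;
   /* Write one summary row instead of one row per pair.        */
   /* In PROC SQL a comparison evaluates to 1 (true) or 0       */
   /* (false), so SUM() of a comparison counts the true cases.  */
   create table pair_counts as
   select
      sum(one.pred > z.pred) as n_concordant ,  /* the "1" pairs  */
      sum(one.pred = z.pred) as n_tied ,        /* the "0" pairs  */
      sum(one.pred < z.pred) as n_discordant ,  /* the "-1" pairs */
      count(*)               as n_pairs
   from w as one , w as z
   where one.y = 1 and z.y = 0 ;
quit ;
```

Every event/non-event pair still has to be evaluated, so the run time
does not go away. Only the 22-billion-row table q does.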
HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician
> > proc sql ;
> > create table q as
> > select
> > one.x as x1 , one.y as y1 , one.pred as pred1 ,
> > z.x as x0 , z.y as y0 , z.pred as pred0 ,
> > case
> > when one.pred > z.pred then 1
> > when one.pred = z.pred then 0
> > else -1
> > end as concordant
> > from w as one , w as z
> > where one.y = 1 and z.y = 0