Date: Wed, 23 Oct 2002 15:34:46 -0400
Reply-To: "Huang, Ya" <ya.huang@PFIZER.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Huang, Ya" <ya.huang@PFIZER.COM>
Subject: Re: Querying Data without Replacement - Proc SQL?
I might have overlooked something:
input x y;
group by x
From: Brian Preslopsky [mailto:m1bkp00@FRB.GOV]
Sent: Tuesday, October 22, 2002 2:45 PM
Subject: Querying Data without Replacement - Proc SQL?
Let me give an example of my problem. Imagine a dataset:
For each x I want to pick y s.t. x-y = min(abs(x-y)). The simple
resolution is that all x's pick y=1. The wrinkle is that y must be
unique. So essentially I want to go through each x and pick a y, then
exlude that y from the possible matches the subsequent x's can choose
This example may seem simple, but I need to do this efficiently. I
actually have a data set of about half a million observations. It was
generated using proc sql, and I was looking for a sql solution. So far
I have only come up with something using nested subqueries, but these
use an impractical amount of computing time; on a smaller test dataset
of only 4000 observations, I am already up to 10 minutes.
I think I can come up with a macro to do this for me, it just seemed to
me to be something I should be able to do with sql. I am not a big sql
expert, so I might be missing something obvious.
If anyone has done any kind of querying without replacement such as what
I have described, I would be interested in hearing from you.