Date: Thu, 6 Jan 2011 17:49:21 -0600
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: help with matching without replacement
How do you prevent id_match 3 from being in the sample twice? Does it
matter which id a contested id_match ends up matched to?
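I would probably suggest sorting BY ID YEAR SIZE_DIF and then running a
simple data step BY ID YEAR that keeps, for each id-year, the first row
whose id_match has not been used yet (FIRST.YEAR alone is not enough once
its id_match may already be taken). Use a hash table (or an array, if the
number of unique id_match values is small) to mark each id_match as used.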
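
Something along these lines might work as a starting point. It is an
untested sketch, not a definitive solution: the dataset names HAVE and
MATCHED, the hash name USED, and the flag FOUND are made up for
illustration, and it assumes an id_match may be used only once across
all years (add YEAR to the hash key if it only has to be unique within
a year). It is also greedy: ids are served in sort order, so an earlier
id can take a match that a later id would have fit better.

```sas
/* Sample data copied from the original question. */
data have;
  input id $ year id_match size_dif;
  datalines;
a 2000 1 12
a 2000 2 22
a 2000 3 1
a 2000 4 23
a 2001 2 24
a 2001 5 2
a 2001 6 4
b 2000 3 5
b 2000 4 7
b 2000 7 32
;
run;

/* Closest candidate first within each id-year. */
proc sort data=have;
  by id year size_dif;
run;

data matched;
  set have;
  by id year;
  retain found;                /* 1 once this id-year has its match */
  if _n_ = 1 then do;
    declare hash used();       /* id_match values already handed out */
    used.defineKey('id_match');
    used.defineData('id_match');
    used.defineDone();
  end;
  if first.year then found = 0;
  /* CHECK() returns 0 when the key is already in the hash. */
  if not found and used.check() ne 0 then do;
    used.add();                /* mark this id_match as used */
    found = 1;
    output;
  end;
  drop found;
run;
```

On the sample data this should keep id_match 3 for a/2000, id_match 5
for a/2001, and id_match 4 for b/2000 (since 3 is already taken by a).
An id-year whose candidates are all used up gets no row at all.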
-Joe
On Thu, Jan 6, 2011 at 5:32 PM, Tracy Li <lisiqi77@yahoo.com> wrote:
> Hi,
>
> I have a question regarding how to select a matched sample without
> replacement.
>
> Suppose my data looks like the following:
>
> id year id_match size_dif
> a 2000 1 12
> a 2000 2 22
> a 2000 3 1
> a 2000 4 23
> a 2001 2 24
> a 2001 5 2
> a 2001 6 4
> b 2000 3 5
> b 2000 4 7
> b 2000 7 32
>
> My goal is to find one and only one id_match for each id-year: the one
> with the smallest possible size difference (size_dif) from id. And there
> should be no duplicates in the matched sample (that is, I don't want
> id_match=3 in 2000 to appear twice in the matched sample).
>
> Any ideas would be greatly appreciated!
>
> Tracy
>