Date: Mon, 13 Mar 2000 19:56:35 GMT
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject: Re: Efficient way to randomly grab one record from a group of records
Content-Type: text/plain; format=flowed
I have tested several different methods; this one seems to execute most
rapidly (first step just concocts a test case):
data a;
  do id = 1 to 1000;
    /* random step size per ID, so group sizes vary */
    do x = 1 by ceil(ranuni(1)*10) to 3000;
      output;
    end;
  end;
run;
NOTE: The data set WORK.A has 873015 observations and 2 variables.
NOTE: The DATA statement used 0.69 CPU seconds and 3610K.
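data r;
  set a;
  by id;
  if first.id then p = _n_;           /* observation number where the group starts */
  if last.id;                         /* continue only at the end of the group */
  /* + 1 so that the last record of the group can also be drawn */
  p ++ int(ranuni(0)*(_n_ - p + 1));
  set a point=p;                      /* direct read of the chosen observation */
run;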
NOTE: The data set WORK.R has 1000 observations and 2 variables.
NOTE: The DATA statement used 1.94 CPU seconds and 3956K.
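For anyone who wants to trace the idea by hand, here is a minimal sketch of the same technique on a six-record test case, with the group start kept in a separate variable. The names SMALL, PICK and START are made up for illustration and are not part of the step above. The + 1 in the range is an assumption about the intent, namely that every record in the group, including the last one, should be eligible.

```sas
/* Tiny test case: three ID groups of sizes 1, 2 and 3 */
data small;
  input id x;
  datalines;
1 10
2 20
2 21
3 30
3 31
3 32
;
run;

data pick;
  set small;
  by id;
  retain start;                      /* keep the group start across iterations */
  if first.id then start = _n_;      /* observation number of the first record in the group */
  if last.id;                        /* from here on, _n_ is the last record of the group */
  /* ranuni returns a value strictly between 0 and 1, so int() yields 0 .. (group size - 1) */
  p = start + int(ranuni(0) * (_n_ - start + 1));
  set small point=p;                 /* overwrite ID and X with the chosen observation */
  drop start;                        /* the POINT= variable P is not written to the output */
run;
```

PICK should end up with one observation per ID. Because ranuni(0) takes its seed from the clock, the chosen X values will differ from run to run; a positive seed makes the draw repeatable.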
Paul M. Dorfman
>From: Jim Linck <linck@SSB.ROCHESTER.EDU>
>Reply-To: Jim Linck <linck@SSB.ROCHESTER.EDU>
>Subject: Efficient way to randomly grab one record from a group of records
>Date: Mon, 13 Mar 2000 13:04:27 -0500
>Here's something I've been struggling with in terms of finding an efficient
>way to do this (my methods are always too slow and time-consuming).
>I have the following list:
>where I will refer to the first column as ID and the second as ID2. What I
>want is the simplest way to 'randomly' grab one record for each ID.
>Efficiency is important as I'll be doing this on a large dataset and will
>repeat it 1000 times. My old method was to generate a random number for
>each record, sort by that random number, then pick the first record in
>another data step. This, however, turns out to be too time consuming.