I've been struggling to find an efficient way to do the following (my
methods are always too slow). Suppose I have the following list:
1 2
1 3
1 1
1 4
1 6
2 3
2 4
2 1
2 3
2 6
where I will refer to the first column as ID and the second as ID2. What I
want is the simplest way to randomly grab one record for each ID.
Efficiency is important, as I'll be doing this on a large dataset and will
repeat it 1,000 times. My old method was to generate a random number for
each record, sort by that random number, then pick the first record for
each ID in another data step. This, however, turns out to be too
time-consuming.
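To make the goal concrete: the sort could in principle be avoided with a
single pass that keeps one running pick per ID (reservoir sampling of size
one). Below is a rough sketch in Python rather than SAS, purely as an
illustration. The function name pick_one_per_id is hypothetical, and it
assumes the records arrive as (ID, ID2) pairs in any order.

```python
import random

def pick_one_per_id(records, rng=random):
    """Return {ID: ID2} with one uniformly chosen record per ID, in one pass."""
    counts = {}   # number of records seen so far for each ID
    chosen = {}   # current pick for each ID
    for rec_id, id2 in records:
        n = counts.get(rec_id, 0) + 1
        counts[rec_id] = n
        # Replace the current pick with probability 1/n, so every record
        # in the group ends up selected with equal probability.
        if rng.random() < 1.0 / n:
            chosen[rec_id] = id2
    return chosen

# The example list from above, as (ID, ID2) pairs.
records = [(1, 2), (1, 3), (1, 1), (1, 4), (1, 6),
           (2, 3), (2, 4), (2, 1), (2, 3), (2, 6)]
sample = pick_one_per_id(records, random.Random(0))
```

Each repetition would be one more pass over the data with no sorting, so
1,000 repetitions cost roughly 1,000 reads of the list. Whether this
translates into a faster data step is an assumption I have not tested.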
Any ideas?
Thanks,
Jim