I've been struggling to find an efficient way to do the following (my own
methods are always too slow). Suppose I have a two-column list, where I
will refer to the first column as ID and the second as ID2. What I want is
the simplest way to randomly grab one record for each ID.
Efficiency is important, as I'll be doing this on a large dataset and will
repeat it 1,000 times. My old method was to generate a random number for
each record, sort by that random number, then pick the first record for
each ID in another data step. This, however, turns out to be too
time-consuming.
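
To make the old method concrete, here is a rough sketch of its logic,
written in Python rather than as the actual data steps. The function name
pick_one_per_id, the (ID, ID2) tuple layout, and the sample records are
all made up for illustration; the sort here is by ID and then by the
random number, which I assume is what is needed for "first record per ID"
to work.

```python
import random
from itertools import groupby


def pick_one_per_id(records, seed=None):
    """Return one randomly chosen (ID, ID2) record for each ID.

    `records` is an iterable of (ID, ID2) tuples (hypothetical layout).
    """
    rng = random.Random(seed)

    # Step 1: attach a random number to every record.
    keyed = [(rec_id, rng.random(), id2) for rec_id, id2 in records]

    # Step 2: sort by ID, then by the random number within each ID.
    keyed.sort(key=lambda row: (row[0], row[1]))

    # Step 3: keep only the first record of each ID group.
    return [
        (rec_id, next(group)[2])
        for rec_id, group in groupby(keyed, key=lambda row: row[0])
    ]


# Made-up sample data, not the real list.
sample = [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e"), (3, "f")]
picked = pick_one_per_id(sample)
```

Every call sorts the whole dataset again, so repeating it 1,000 times
means 1,000 full sorts, which is probably where most of the time goes.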