Date:         Wed, 9 Apr 2003 14:25:05 -0400
Reply-To:     sashole@bellsouth.net
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Paul M. Dorfman" <sashole@BELLSOUTH.NET>
Organization: Sashole of Florida
Subject:      Re: Sampling Question
Comments: To: diskin.dennis@kendle.com
In-Reply-To:  <OF46C1FD04.C1102C55-ON85256D03.005F2ECB@kendle.com>
Content-Type: text/plain; charset="us-ascii"

> -----Original Message-----
> From: diskin.dennis@kendle.com [mailto:diskin.dennis@kendle.com]
>
> Paul,
>
> I'm afraid your algorithm is flawed. It allows duplicates to
> be selected.

Dennis,

The algorithm is perfect; the implementation is plagued with absent-mindedness: I forgot to wrap _n_ in ptr ( ) in

ptr (_p_) = _n_ ;

which should be

ptr (_p_) = ptr (_n_) ;
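
For clarity, here is the sampling step from the quoted message below with that one-line change applied. It assumes the POP data set and the macro variables n_pop and n_smpl defined there. With the uncorrected line, the slot just used receives the index _n_ itself rather than the pointer currently stored in slot _n_, so a pointer that had earlier been moved into a slot can be drawn a second time. The closing PROC SQL step is only an illustrative check added here, not part of the original post; it relies on the variable A being unique in POP, as it is in the test data.

```sas
data sample (drop = _:) ;
   array ptr (&n_pop) _temporary_ ;

   * start with slot i pointing at record i ;
   do _p_ = 1 to hbound (ptr) ;
      ptr (_p_) = _p_ ;
   end ;

   * _n_ counts the slots still eligible for selection ;
   do _n_ = &n_pop to &n_pop - &n_smpl + 1 by -1 ;
      _p_ = ceil (ranuni(1) * _n_) ;
      point = ptr (_p_) ;
      set pop point = point ;
      output ;
      * corrected line: move the pointer held in the last live slot ;
      * into the slot just used, so no record can be drawn twice ;
      ptr (_p_) = ptr (_n_) ;
   end ;

   stop ;
run ;

* illustrative check: both counts should equal the sample size ;
proc sql ;
   select count (*)          as n_rows
        , count (distinct a) as n_distinct
   from   sample ;
quit ;
```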

But wait: didn't I just write the same thing in response to Dale? I guess I did. There goes the memory, again... :-(

Kind regards,
===================
Paul M. Dorfman
Jacksonville, FL
===================

> From: Paul Dorfman <paul_dorfman@HOTMAIL.COM>@LISTSERV.UGA.EDU> on
> 04/09/2003 12:36 PM
>
> >From: Action Man <wollo_desse@HOTMAIL.COM>
> >
> >I have 7,000 records in my SAS file. Out of these records I want to
> >pick 500 of them randomly,
> >How do I do that using SAS.
>
> Wollo,
>
> You will no doubt get (or have already gotten) plenty of
> advice how to do it using the "standard" K/N method, where K
> is the sample size and N is the population size. It is based
> on reading all N records from the population file. It is
> plenty sufficient and fast in your case, where N=7000 only
> and K=500 is not a tiny fraction of N. Below is an
> alternative approach allowing to obtain the sample by reading
> K records only, which may be preferable for large Ns (easily
> up to, say, 5E+6 with modern memory sizes) and/or K/N << 1:
>
> %let n_pop  = 7000 ;
> %let n_smpl = 500 ;
>
> data pop ;
>    array vars (*) a b v03-v11 ;
>    do a = 1 to &n_pop ;
>       do b = 3 to 11 ;
>          vars (b) = ceil (ranuni(1) * 1e11) ;
>       end ;
>       output ;
>    end ;
> run ;
>
> data sample (drop = _:) ;
>    array ptr (&n_pop) _temporary_ ;
>
>    do _p_ = 1 to hbound (ptr) ;
>       ptr (_p_) = _p_ ;
>    end ;
>
>    do _n_ = &n_pop to &n_pop - &n_smpl + 1 by -1 ;
>       _p_ = ceil (ranuni(1) * _n_) ;
>       point = ptr (_p_) ;
>       set pop point = point ;
>       output ;
>       ptr (_p_) = _n_ ;
>    end ;
>
>    stop ;
> run ;
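
For comparison, the "standard" K/N method mentioned in the quoted message is commonly written along the following lines. This is only a sketch and does not come from the thread: the data set name SAMPLE_KN and the counters _k_ and _left_ are made up for illustration, while POP, n_pop and n_smpl are as defined in the quoted code. Every record is read once and kept with probability (records still needed) / (records still unread), which yields exactly n_smpl records.

```sas
data sample_kn (drop = _:) ;
   * _k_ = records still needed, _left_ = records not yet read ;
   retain _k_ &n_smpl _left_ &n_pop ;
   set pop ;
   if ranuni (1) < _k_ / _left_ then do ;
      output ;
      _k_ = _k_ - 1 ;
   end ;
   _left_ = _left_ - 1 ;
run ;
```

Unlike the pointer-array approach, this reads all N records, but it needs no array in memory and returns the sample in the original file order.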

