LISTSERV at the University of Georgia
Date:         Wed, 9 Apr 2003 10:32:21 -0700
Reply-To:     Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:      Re: Sampling Question
Comments: To: sashole@bellsouth.net
In-Reply-To:  <BAY2-F8ccYTjr8cU6xf00002224@hotmail.com>
Content-Type: text/plain; charset=us-ascii

Paul,

The algorithm you are proposing performs sampling with replacement: a record may be picked more than once. There are occasions when we do want sampling with replacement, but my guess is that this is not one of them. To guarantee sampling without replacement (each record can be selected at most once), one could use a hash, right?
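
A rough sketch of one way to get that guarantee without a hash follows. It rests on an assumption about where the repeats come from: in the quoted DATA step, the final assignment ptr (_p_) = _n_ stores the loop counter rather than the pointer currently held in slot _n_, so a record number that was already drawn can be written back into the live range. Copying ptr (_n_) instead keeps slots 1 through _n_ - 1 holding exactly the record numbers not yet drawn (a partial Fisher-Yates style selection). The POP data set and its ID variable below are hypothetical stand-ins for the real file.

```sas
%let n_pop  = 7000 ;
%let n_smpl = 500 ;

/* Hypothetical stand-in population: one row per record, keyed by ID */
data pop ;
   do id = 1 to &n_pop ;
      output ;
   end ;
run ;

data sample (drop = _:) ;
   /* ptr holds the record numbers that are still available */
   array ptr (&n_pop) _temporary_ ;

   do _p_ = 1 to hbound (ptr) ;
      ptr (_p_) = _p_ ;
   end ;

   do _n_ = &n_pop to &n_pop - &n_smpl + 1 by -1 ;
      _p_   = ceil (ranuni(1) * _n_) ;   /* random slot among the first _n_ */
      point = ptr (_p_) ;                /* record number stored in that slot */
      set pop point = point ;
      output ;
      ptr (_p_) = ptr (_n_) ;            /* move the last live pointer into the used slot */
   end ;

   stop ;
run ;

/* Check: both counts should equal &n_smpl if no record was drawn twice */
proc sql ;
   select count(*) as n_rows, count(distinct id) as n_distinct
   from sample ;
quit ;
```

Like the quoted version, this reads only K records through POINT=; the only change to the sampling step is the last assignment inside the loop.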

Dale

--- Paul Dorfman <paul_dorfman@HOTMAIL.COM> wrote:
> >From: Action Man <wollo_desse@HOTMAIL.COM>
> >
> >I have 7,000 records in my SAS file. Out of these records I want to pick
> >500 of them randomly,
> >How do I do that using SAS.
>
> Wollo,
>
> You will no doubt get (or have already gotten) plenty of advice how to do it
> using the "standard" K/N method, where K is the sample size and N is the
> population size. It is based on reading all N records from the population
> file. It is plenty sufficient and fast in your case, where N=7000 only and
> K=500 is not a tiny fraction of N. Below is an alternative approach allowing
> to obtain the sample by reading K records only, which may be preferable for
> large Ns (easily up to, say, 5E+6 with modern memory sizes) and/or K/N << 1:
>
> %let n_pop = 7000 ;
> %let n_smpl = 500 ;
>
> data pop ;
>    array vars (*) a b v03-v11 ;
>    do a = 1 to &n_pop ;
>       do b = 3 to 11 ;
>          vars (b) = ceil (ranuni(1) * 1e11) ;
>       end ;
>       output ;
>    end ;
> run ;
>
> data sample (drop = _:) ;
>    array ptr (&n_pop) _temporary_ ;
>
>    do _p_ = 1 to hbound (ptr) ;
>       ptr (_p_) = _p_ ;
>    end ;
>
>    do _n_ = &n_pop to &n_pop - &n_smpl + 1 by -1 ;
>       _p_ = ceil (ranuni(1) * _n_) ;
>       point = ptr (_p_) ;
>       set pop point = point ;
>       output ;
>       ptr (_p_) = _n_ ;
>    end ;
>
>    stop ;
> run ;
>
> Kind regards,
> ---------------------
> Paul M. Dorfman
> Jacksonville, FL
> ---------------------

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
