LISTSERV at the University of Georgia
Date:         Thu, 25 May 2006 22:35:53 -0700
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: replicated sampling
In-Reply-To:  <200605251910.k4PFlewU007115@mailgw.cc.uga.edu>
Content-Type: text/plain; format=flowed

datamatter@GMAIL.COM wrote:
>Does anyone has experience using replicated sampling under PROC
>SURVEYSELECT? I'd like to know its efficiency if I'm drawing say 500
>samples from a gigantic data set (say tens of millions of records).
>How many times would it loop through the data set?
>
>Thanks
>DM

How many loops through the data set? Umm, maybe 500.

If your data are too large to fit in RAM, then you can't use the SASFILE statement to speed things up.

So let me ask you a question. What are you doing here? A bootstrap? A simulation?

The alternative is probably nearly as bad: one large process which keeps track of the state of the sampling for each of your 500 replicates. As each record is read, you try it against all 500 replicates and spit out a copy for every replicate which 'hits' that record. That way you only make one pass through the data. Then, afterward, you sort the output by replicate. This is manageable as long as you stick with simple random sampling (without replacement) or simple random sampling with replacement, even if you have to write the code by hand.
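To make the one-pass idea concrete, here is a rough sketch in Python rather than SAS, for the without-replacement case only. It assumes the total record count is known before the pass starts, and it uses sequential selection sampling (each record is taken with probability "still needed / still unread") as one possible way to track each replicate. That choice, and the names one_pass_replicates, records, total, n and reps, are illustrative assumptions and not anything PROC SURVEYSELECT is known to do internally.

```python
import random


def one_pass_replicates(records, total, n, reps, seed=None):
    """Draw `reps` simple random samples of size `n` (without replacement)
    in a single pass over `records`, which must yield exactly `total` items.

    Returns a list of (replicate, record) pairs sorted by replicate.
    """
    if n > total:
        raise ValueError("sample size n cannot exceed the record count")
    rng = random.Random(seed)
    remaining = [n] * reps      # records each replicate still needs
    hits = []                   # one output row per (replicate, record) hit
    for seen, record in enumerate(records):
        left = total - seen     # unread records, including this one
        for r in range(reps):
            # Select with probability remaining[r] / left.
            if remaining[r] and rng.randrange(left) < remaining[r]:
                hits.append((r, record))
                remaining[r] -= 1
    # The afterward step: sort the output by replicate (stable sort,
    # so records keep their original order within a replicate).
    hits.sort(key=lambda pair: pair[0])
    return hits


if __name__ == "__main__":
    sample = one_pass_replicates(range(1000), total=1000, n=10, reps=5, seed=1)
    print(len(sample))
```

The data are read once, but every record is still tested against every replicate, so the inner loop runs total * reps times. The saving is in I/O, not in the number of random draws.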

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
