LISTSERV at the University of Georgia
Date:         Fri, 13 Jan 2006 14:53:08 -0800
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: Creating a random sample from a flat file
In-Reply-To:  <200601132212.k0DKjK37013783@mailgw.cc.uga.edu>
Content-Type: text/plain; format=flowed

k_monal_99@YAHOO.COM wrote:
>Does someone know how to create a random sample from a flat file,
>like reading the file and simultaneously outputting a random sample.
>I have never done it, but I have to do it now because
>the files from which I want to create the random
>samples are in millions and I don't want to load them
>to a dataset and then do the process again.

Well, what do you need to do with the random sample? Does it even need to be random? Are you looking for test data? If you really need a random sample that meets some particular probability requirements, then it might not be easy to do it in the data step. (Or maybe I'm just trying to find a way to sneak in PROC SURVEYSELECT when you're not looking. :-)

If all you need is a bunch of records and the exact count doesn't matter, then just add a line like this toward the bottom of your data step:

if ranuni(37474) < 0.02 then output;

and about 2% of your records will get spit out. But this won't give you an exact count, just 'about' the proportion you list in the inequality.
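
For context, here is a minimal sketch of what the whole data step might look like with that line in place. The file name, the record layout, and the variable names are made up for illustration; substitute your own INFILE and INPUT statements.

```sas
/* Sketch only: 'bigfile.txt', the column positions, and the */
/* variable names ID and AMOUNT are hypothetical.            */
data sample;
   infile 'bigfile.txt' truncover;        /* read the flat file directly     */
   input id 1-8 amount 9-16;              /* hypothetical record layout      */
   if ranuni(37474) < 0.02 then output;   /* keep roughly 2% of the records  */
run;
```

Because the data step contains an explicit OUTPUT statement, only the records that pass the test are written to SAMPLE, so the full file never lands in a data set. Since the seed is a fixed positive number, rerunning the step against the same file should pick the same records.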

If you are concerned about getting test data to ensure that your code handles all boundary cases, then you may need some very different approaches to handling this. As I said, it depends on what you need the sample for.

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
