Date: Fri, 6 Jun 2003 16:34:46 -0700
Reply-To: cassell.david@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: Again SURVEYSELECT for stratified sampling
Content-type: text/plain; charset=us-ascii
SASKID <s4a3s2k1id@WEB.DE> wrote:
> Original message was: "It's getting closer. Which SURVEYSELECT method
> would you recommend if you wanted a stratified sample from a finite
> population in such a way that the sample keeps all the strata
> proportions of the population? Maybe PPS? For example, if the population
> has N=1000, and male/female are 50%/50%, and big/small 80%/20%, which
> sampling method could you recommend to reach a sample of, say,
> N=100, and male/female are 50%/50%, and big/small 80%/20%, too?
Perhaps I do not grasp all of what you are trying to accomplish,
but if you want the proportions in the sample to mimic those in the
population, just take a simple random sample. That would be
METHOD=SRS.
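A minimal sketch of that, assuming your frame is a data set called
YourFrame and you want 100 records (the seed is arbitrary, so pick
your own):
/* Simple random sample of 100 records from the whole frame. */
proc surveyselect data=YourFrame out=YourSample method=srs
                  sampsize=100 seed=193756345;
run;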
Of course, you will not get *exactly* the numbers you want when you
do simple random sampling. If you need exact numbers (for some
unspecified reason), then you could make your four categories
(male or female, big or small) into four strata. Then sample each of
your four strata at 10%, to get much closer to the exact proportions
in your population. Still, unless the counts in these strata are
multiples of ten, you are not going to get *exactly* one-tenth.
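If you would rather not work out the stratum sizes by hand, SAMPRATE=
can apply the 10% rate within each stratum. This is only a sketch: it
assumes the four categories are already coded in a single stratum
variable, here called STRAT1, and you should check in the output how
SURVEYSELECT rounds the fractional stratum sizes.
/* Sort by the stratum variable, then sample 10% within each stratum. */
proc sort data=YourFrame;
   by strat1;
run;
proc surveyselect data=YourFrame out=YourSample method=srs
                  samprate=0.10 seed=193756345;
   strata strat1;
run;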
Assume you performed a PROC FREQ and found that the four categories
look like this:
stratum   male/female   big/small   count
   1           M            B         131
   2           M            S         582
   3           F            B         198
   4           F            S          89
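A PROC FREQ along these lines would give you counts like those. The
variable names SEX and SIZE are my placeholders, so substitute whatever
your frame actually uses:
/* Cross-tabulate the two classification variables, one row per cell. */
proc freq data=YourFrame;
   tables sex*size / list;
run;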
Now you need to create a stratum variable that takes exactly the
stratum numbers above, and name it something, perhaps STRAT1.
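One way to build it is a short DATA step. Again, SEX and SIZE and their
values 'M'/'F' and 'B'/'S' are assumptions about how your frame is
coded, so adjust to match your data:
/* Assign stratum numbers 1-4 to match the table above. */
data YourFrame;
   set YourFrame;
   if      sex='M' and size='B' then strat1=1;
   else if sex='M' and size='S' then strat1=2;
   else if sex='F' and size='B' then strat1=3;
   else if sex='F' and size='S' then strat1=4;
run;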
Then you could specify your stratum sizes for your sample using
something like this:
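/* The frame must be sorted by the stratum variable before sampling. */
proc sort data=YourFrame;
   by strat1;
run;
/* SAMPSIZE= gives one sample size per stratum, in sorted stratum order. */
proc surveyselect data=YourFrame out=YourSample sampsize=(13,58,20,9)
                  method=srs seed=193756345;
   id <your list of ID vars to be towed along>;
   strata strat1;
run;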
Note that the stratum sizes will be chosen *in* *order*, so sort the
frame data by the stratum, and make sure the stratum numbers match up
with the strata. Choose your own random seed, too.
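If you want to confirm the allocation afterward, a quick PROC FREQ on
the output data set should show 13, 58, 20, and 9 records in strata 1
through 4, assuming the frame counts in the table above:
/* Count the sampled records in each stratum. */
proc freq data=YourSample;
   tables strat1;
run;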
HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician