Date: Tue, 24 Feb 2004 11:55:04 -0800
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: PROC SURVEYSELECT and SAMPSIZE=
In-Reply-To: <5.0.2.1.2.20040224130421.032327f0@mail.utexas.edu>
Content-Type: text/plain; charset=us-ascii
Laura,
I don't know if SURVEYSELECT can generate the second stage sample
that you need. However, there is a simple workaround that you
can employ using two data steps and a PROC SORT. In the first
data step, attach a random number to each record. Sort the data
by PSUID and the random number. Finally, read the sorted data by
PSUID and keep a counter of the number of observations seen in that
PSUID. If the observation number within PSUID is <=10, then output
the record. A PSU with fewer than 10 records simply contributes all
of them.
%let seed=....;
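/* Attach a uniform random number x to every record */
data samp_stg1;
   retain seed &seed;
   set samp_stg1;
   call ranuni(seed,x);
   drop seed;
run;

/* Put the records in random order within each PSU */
proc sort data=samp_stg1;
   by psuid x;
run;

/* Keep the first 10 records (or fewer, if that is all there are) per PSU */
data samp_stg2;
   set samp_stg1;
   by psuid;
   if first.psuid then nrec_psu=0;
   nrec_psu+1;
   if nrec_psu<=10 then output;
   drop x nrec_psu;
run;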
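
Since you want about 10 units from each UNIT_TYPE within a PSU, the
same idea should extend to your strata. Here is an untested sketch.
It assumes the stratum variable is named UNIT_TYPE, as in your STRATA
statement, and that samp_stg1 still carries the random number x from
the first data step. The counter name nrec_str is just one I made up.

```sas
/* Random order within each PSUID x UNIT_TYPE stratum */
proc sort data=samp_stg1;
   by psuid unit_type x;
run;

/* Keep at most 10 records per stratum */
data samp_stg2;
   set samp_stg1;
   by psuid unit_type;
   if first.unit_type then nrec_str=0;   /* reset at each new stratum */
   nrec_str+1;
   if nrec_str<=10 then output;          /* small strata are kept whole */
   drop x nrec_str;
run;
```

A stratum with only 9 units would then return all 9 rather than
stopping with an error.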
Dale
--- Laura Stapleton <laura.stapleton@MAIL.UTEXAS.EDU> wrote:
> Colleagues-
>
> I have a possibly basic issue that I cannot solve on my own and would
> be
> glad for any suggestions.
>
> I would like to pull a 2-stage sample with PPS of PSUs and then
> disproportionate sampling across two strata within each PSU,
> resulting in
> about 20 units within each PSU.
>
> My plan was to undertake a PROC SURVEYSELECT using PPS on the PSU
> IDs, then
> merge back with the full dataset on PSU ID (only retaining IDs in the
> selected PSUs).
>
> Then, undertake a second PROC SURVEYSELECT using SRS with
> STRATA = PSUID UNIT_TYPE;
> and specifying SAMPSIZE = 10. (I would like about half of the units
> to be
> from each of the two strata -- in the population they are split
> 70%/30%).
>
> All was working fine until some of my selected PSUs contained fewer
> than 10
> units of UNIT_TYPE=2;
> When this occurred, the procedure stopped and indicated:
>
> ERROR: The sample size, 10, is larger than the number of sampling
> units, 9.
>
> Is there some way to inform the procedure to select all (no matter
> the
> size) if the number of sampling units is less than the sample size?
>
> If not, does anyone have any ideas to work around this?
>
> Thank you!
>
> Laura
>
>
> Laura M. Stapleton
> Assistant Professor, Quantitative Methods
> Educational Psychology
> University of Texas at Austin
> 504 SZB
> Austin, TX 78712-1296
> phone: 512/471-0858
> fax: 512/471-1288
=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------