LISTSERV at the University of Georgia
Date:         Fri, 21 Oct 2005 21:42:17 -0700
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: Surveylogistic
In-Reply-To:  <200510201659.j9KGdm4H022544@malibu.cc.uga.edu>
Content-Type: text/plain; format=flowed

arlando20002000@YAHOO.COM wrote:
>Could someone please give me some insight in how this survey analysis
>dataset and code should be set up in SAS. I have a multistage
>stratification design.
>
>*There are 30,000 surveys total.
>*There are three regions (regions 1 2 and 3).
>*Region 1 has a total of 10,000 surveys, Region 2 has 12,000, and region 3
>has 8,000.
>*Region 1 has a selection weight of .35, Region 2 weight is .40, and
>region 3 weight is .25.
>*From the random selection in the above step we now have
>region 1 = 3500
>region 2 = 4800
>region 3 = 2000
>* The regions are further stratified into states within that region.
>region 1 : Oregon, Washington
>region 2: CA, Nevada
>region 3: Utah, Arizona
>*Within region 1 Ore has 2200 surveys and Wash has 1300
>rgion 2: CA has 3000 surveys while Nev has 1800
>region 3: Utah has 1200 surveys while Ariz has 800
>*Within region 1 Ore has a selection weight of .6 and Wash has .4
>region 2: CA has weight .65 while Nev has .35
>Region 3: Utah has weight .5 while Ariz has .5
>* Using random sampling we now have
>OR = 1320 Wash = 520
>CA = 1950 Nev = 630
>UT = 600 Ariz = 400
>*The states are further stratified into zip codes in the following manner:
>OR - Zip1 (400) Zip2(500) Zip3(420)
>Wash - Zip1(300) Zip2(220)
>CA = zip1(800) zip2(500) zip3(500) zip4(150)
>Nev - zip1(430) zip2(200)
>UT - zip1(125) zip2(250) zip3(225)
>Ariz - zip2(200) zip2(200)
>* Weights are as follows:
>OR - .3, .4, .3 respectively
>Wash - .5, .5
>CA - .30, .20, ,25, .25
>Nev - .6, .4
>Ut - .33, .33, .33
>Ariz - .65, .35
>
>* Finally we have
>OR - Zip1 = 120 surveys, Zip2 = 200 surveys, zip3 = 126 surveys
>Wash - zip1 = 150 surveys, zip2 = 110 surveys
>CA - 240 surveys, 100, 125, 38
>Nev - zip1 = 258, Zip2 = 80
>Ut - zip1 = 42, zip2 = 83, zip3 = 75
>Ariz - zip1 = 130, zip2 = 70
>
>*******************************************************
>I am not fully sure if that last step (Zip code stratification), but I
>assume that the code wouldn't need to be tweeked a lot to remove if it is
>not the case.
>**********************************************************
>
>So the question is how can this be done in SAS using the Data step and
>Surveylogistic. I am also having problems setting up the dataset to read
>in this information correctly.

Well, first of all, you don't really have a three-stage sample. What you really have is a simple stratified sample where the strata are the zipcodes within the states. That's all.

And you've already come up with the sample sizes for the zipcodes. Okay, I'm assuming your 'zip' is really the first three digits of the zipcode. So let's build those zipcode groups, sort by state and zipcode group, and tell PROC SURVEYSELECT how many records to pick in each zipcode group. (The following code is untested. Sorry. And I'm guessing your zipcodes are numeric instead of character.)

data revised / view=revised;
   set YourTargetPopulation;
   zipgroup = /* however you want to define your zipcode groups */ ;
run;

/* now you need a data set which has STATE, your zipcode group ZIPGROUP, and the number of surveys, which has to be named _NSIZE_ and note the two underscores - I'll call this data set ZIPSIZES because I'm really original. */
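Here is a minimal sketch of what ZIPSIZES could look like, using the final per-zipcode counts from the original post. The two-letter state codes and the numbering of the zipcode groups as 1, 2, 3, 4 are assumptions on my part for illustration; the values of STATE and ZIPGROUP have to match, in type and in value, whatever is actually in your population data set. Like the rest of the code here, this is untested.

```sas
/* Sample sizes per stratum, taken from the "Finally we have" list. */
/* STATE codes and ZIPGROUP numbering are illustrative assumptions. */
data zipsizes;
   length state $2;
   input state $ zipgroup _NSIZE_;
   datalines;
OR 1 120
OR 2 200
OR 3 126
WA 1 150
WA 2 110
CA 1 240
CA 2 100
CA 3 125
CA 4 38
NV 1 258
NV 2 80
UT 1 42
UT 2 83
UT 3 75
AZ 1 130
AZ 2 70
;
run;
```

The rows do not need to be typed in sorted order, since the data set gets sorted by STATE and ZIPGROUP before it is used.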

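/* REVISED is a view, so write the sorted rows to a new data set */
/* (REVISED_SORTED is just a name I made up)                      */
proc sort data=revised out=revised_sorted;
   by state zipgroup;
run;

proc sort data=zipsizes;
   by state zipgroup;
run;

proc surveyselect data=revised_sorted
                  out=YourSample
                  method=srs        /* I'm guessing here */
                  seed=5968473      /* pick a random seed */
                  sampsize=zipsizes;
   strata state zipgroup;
run;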

And voila! You have a sample.
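If you want to convince yourself the selection did what you asked, a quick check along these lines may help. It is only a sketch and, again, untested. It assumes the output data set carries a SamplingWeight variable, which I would expect PROC SURVEYSELECT to add for a stratified design like this one, but check your own output before relying on it.

```sas
/* Count the selected records in each stratum; the counts should */
/* match the _NSIZE_ values you supplied in ZIPSIZES.            */
proc freq data=YourSample;
   tables state*zipgroup / list;
run;

/* Look at the sampling weight in each stratum (assumes the      */
/* variable SamplingWeight is present in YourSample).            */
proc means data=YourSample n min max;
   class state zipgroup;
   var SamplingWeight;
run;
```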

But you'll need to monitor the fieldwork closely and adjust the initial sampling weights accordingly. And you'll need to address missing values in records. And you'll need to analyze the data properly, using the survey analysis procs.
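Since the original question mentioned PROC SURVEYLOGISTIC, here is a rough sketch of how the analysis step might be set up once the survey data are in. The variable names RESPONSE, AGE and INCOME are hypothetical placeholders, and I am using the initial SamplingWeight only for illustration; in practice you would substitute whatever adjusted weight you end up with after the fieldwork. Untested, like everything else in this post.

```sas
proc surveylogistic data=YourSample;
   strata state zipgroup;            /* same strata used for selection  */
   weight SamplingWeight;            /* replace with your adjusted weight */
   model response(event='1') = age income;   /* hypothetical variables  */
run;
```

The point is that the STRATA and WEIGHT statements carry the design information into the analysis, so the standard errors reflect the stratified sample rather than assuming simple random sampling.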

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
