Date: Wed, 30 Jul 2008 11:36:30 -0400
Reply-To: Gene Maguin <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Gene Maguin <firstname.lastname@example.org>
Subject: Re: How to draw a random sample for 30 cities, 50 retailers?
Content-Type: text/plain; charset="us-ascii"
>>I have a file with 50,000 customer IDs. The customers belong to 30 different
cities and buy products from 50 different retailers. We want to draw a
sample of only 5,000 customers (10%) for a survey, but the sample should
reflect the same proportions of cities and retailers as the population.
For example, if city A has a 5% share of the population (50,000), then the
random sample (5,000) should also have 5% of its customers from city A. Also,
within city A, the customers of all retailers should be included in the same
proportions as in the population. It is possible that some retailers do not
operate in some cities.

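I'll assume that cities are numbered 1-30 and retailers are numbered 1-50.
* Stratum id and random sort key (city and retailer are assumed variable names).
Compute citystore=100*city+retailer.
Compute ranvar=rv.uniform(0,1).
Sort cases by citystore.
Aggregate outfile=* mode=addvariables/break=citystore/count=nu.
Sort cases by citystore ranvar.
Compute seq=1.
If (citystore eq lag(citystore)) seq=lag(seq)+1.
* Execute first so that lag(seq) is not affected by resetting seq to 0 below.
Execute.
If (seq gt .10*count) seq=0.
Select if (seq ne 0).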
* this should give you a sample of nearly exactly 5,000 with nearly exactly
a 10% sample of every city and store combination. See if this is what you
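
For anyone who wants to check the logic outside SPSS, here is a minimal Python sketch of the same idea: group customers by city-retailer pair, put each group in random order, and keep the first 10% of each group. The function name stratified_sample, the tuple layout, and the small made-up data set are all assumptions for illustration, not part of the original file.

```python
import random
from collections import defaultdict

def stratified_sample(customers, fraction=0.10, seed=0):
    """Keep roughly `fraction` of every (city, retailer) group.

    customers: list of (customer_id, city, retailer) tuples (assumed layout).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for cust in customers:
        _, city, retailer = cust
        strata[(city, retailer)].append(cust)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)                 # random order within the stratum
        keep = int(fraction * len(members))  # rounds down, like seq <= .10*count
        sample.extend(members[:keep])
    return sample

# Made-up population: retailer 2 does not operate in city 2.
sizes = {(1, 1): 100, (1, 2): 50, (2, 1): 200, (3, 1): 30, (3, 2): 120}
customers = []
next_id = 1
for (city, retailer), n in sizes.items():
    for _ in range(n):
        customers.append((next_id, city, retailer))
        next_id += 1

sample = stratified_sample(customers)
print(len(sample))  # 50 of the 500 made-up customers
```

Because the per-group count is rounded down, a city-retailer pair with fewer than 10 customers contributes no one to the sample in this sketch, which is one reason the total can come out slightly under 10%.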