Date: Wed, 30 Jul 2008 11:36:30 -0400
Reply-To: Gene Maguin
Sender: "SPSSX(r) Discussion"
From: Gene Maguin
Subject: Re: How to draw a random sample for 30 cities, 50 retailers?
In-Reply-To:
Content-Type: text/plain; charset="us-ascii"

Sanjay,

>>I have a file that has 50,000 customer IDs. The customers belong to 30 different cities and buy products from 50 different retailers. We want to draw a sample of only 5,000 customers (10%) for a survey, but this sample should reflect the same proportions of cities and retailers as the population. Suppose city A has a 5% market share in the population (50,000); then the random sample (5,000) should also have 5% of its customers from city A. Also, within city A the customers of all retailers should be included in the same proportions as in the population. It is possible that some retailers do not operate in some cities.

I'll assume that cities are numbered 1-30 and retailers are numbered 1-50.

* One key per city-retailer combination, so city 12 and retailer 7 gives 1207.
Compute citystore=city*100+retailer.
Sort cases by citystore.
* Add the number of customers in each combination to every case as count.
Aggregate outfile=* mode=addvariables/break=citystore/count=nu.
* Put the cases in random order within each combination.
Compute ranvar=uniform(1).
Sort cases by citystore ranvar.
* Number the cases 1, 2, 3, ... within each combination.
Compute seq=1.
If (citystore eq lag(citystore)) seq=lag(seq)+1.
* Keep only the cases numbered up to 10% of the combination's count.
If (seq gt .10*count) seq=0.
Select if (seq ne 0).
Execute.

* This should give you a sample of nearly exactly 5,000, with nearly exactly a 10% sample of every city and store combination.

See if this is what you need.

Gene Maguin
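
One way to see how close "nearly exactly" comes is to compare what was kept in each combination with what was there to begin with. Because the test is seq gt .10*count, each combination keeps only the whole-number part of 10% of its count: a combination with 57 customers keeps 5, and one with fewer than 10 customers keeps none. A minimal sketch of such a check follows, assuming it is run right after the syntax above, while count is still on the file. The names sampled and rate are made up here for illustration.

```
* Hypothetical check: sampled and rate are illustrative names, not from the original post.
* Count the cases that survived the Select if in each combination.
Aggregate outfile=* mode=addvariables/break=citystore/sampled=nu.
* Realized sampling fraction for the combination; this should be at or just under .10.
Compute rate=sampled/count.
Descriptives variables=rate/statistics=mean min max.
```

Combinations that lost all of their cases no longer appear in the file, so they do not show up in this summary. If small combinations matter for the survey, compare the number of distinct citystore values before and after the selection.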

