```
Date: Thu, 29 Sep 2005 10:10:42 -0300
Reply-To: Hector Maletta
Sender: "SPSSX(r) Discussion"
From: Hector Maletta
Subject: Re: complex samples
Comments: To: Art@DrKendall.org
In-Reply-To: <433BDFD8.9040904@DrKendall.org>
Content-Type: text/plain; charset="us-ascii"

Looking at the sample design described in the original post, I surmise it is
stratified without clustering, and furthermore that the final units (schools)
are selected randomly with probability proportional to size (i.e., I suppose,
proportional to the number of students per school). Schools are therefore
stratified by factors (i) area and (j) size bracket. According to this:

1. If no clustering is present, the Complex Samples module is not necessary;
only weights based on the reciprocal of the sampling ratios are needed.

2. The variables observed are at the school level (funding, etc.), not at the
student level. Sampling proportional to size means that bigger schools have a
higher probability of being selected than smaller schools, so sampling ratios
differ across size brackets, as they probably do across zones.

3. Weights in this case are basically defined as the number of schools in size
bracket A existing in a certain stratum, divided by the number of schools
selected in that stratum. These weights correct for the different sampling
ratios across areas and sizes (proportionality effect) and also inflate totals
to population size (expansion effect).

4. Schools that were selected but failed to provide data may or may not be
counted in the denominator of those weights, depending on what is known or
assumed about them. If the failures appear unrelated to other variables (other
than location and size), you may assume they were random and define the weights
as the total schools existing in the bracket and area, divided by the schools
ACTUALLY RESPONDING in the same bracket and area. Results will then expand to
the total schools in the stratum.
Otherwise, you may divide by the number of schools SELECTED, but the results
will expand only to the schools actually responding.

5. The weights described above are inflationary. SPSS computes significance
tests on the WEIGHTED number of cases, which in this situation would equal the
total number of schools existing in each area and bracket. This overstates the
sample size and overestimates significance. The weights should therefore be
corrected so that they account only for proportionality and leave the scale of
the sample untouched. This is done by multiplying the above weights by n/N,
where n = total sample size and N = total number of schools (below, i indexes
area and j indexes school size bracket). The same result is obtained if the
weights are defined from proportions rather than totals. For this, define:

Pij = Nij/N = schools existing in stratum ij, divided by the total number of
schools in the area covered by the study
pij = nij/n = schools selected (or actually responding) in stratum ij, divided
by the total sample size

The new factors are defined as Pij/pij. These factors correct for
proportionality but not for scale, and give correct figures for significance
tests.

6. If this type of sample is drawn with a constant selection probability in all
areas and size brackets, all the weights are the same (except perhaps for
differing numbers of non-responding schools). If so, no weighting is necessary;
unweighted significance tests reflect the significance of the findings without
bias.

Hope this helps.
Hector

> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> On Behalf Of Art Kendall
> Sent: Thursday, September 29, 2005 9:37 AM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: Re: complex samples
>
> The wider estimates I was referring to were for POP totals,
> means, percentages, etc. As long as you use the correct
> weights, the point estimates will be right. The interval
> estimates will be off.
> Of course, if the proportion of non-response is not the same
> in all cells, you need to further adjust the weights.
>
> One big danger with things like crosstabs is talking about
> observed percentages as if they were different when the
> difference is readily attributed to the vagaries of sampling.
> It is crucial that some test find the difference to be
> inconsistent with sampling error before reporting the
> percentages as different. For any interpretation of results in
> terms of policy, practice, etc., comparisons of point estimates
> that do not come up as different should be treated as if they
> were the same. For example, if 25% of the schools in the South
> use red pencils on official forms, and 10% of the schools in
> the West do, the difference in those percentages may very well
> be consistent with what one might observe simply due to
> sampling error. If you test the point estimate differences
> using the error terms as if you had a simple random sample
> (i.e., ignoring the degree to which the stratification
> variables account for some of the variance), you will find
> fewer differences significant. You will be losing power, but
> the differences you do find will have plausibility. Of course,
> report that the error estimates are inflated.
>
> If this is a one-time study, you might be able to find another
> NGO, university, etc., that has the Complex Samples module. Do
> all of your runs in base SPSS using the weights. Once
> everything is ready, send the system file to the other agency.
>
> Check the SPSS site. You might also be able to get a trial
> version, which you could use within its 30 days if everything
> is ready to go. Also check whether the educational price really
> is that high and whether the agency qualifies as educational.
>
> Art
> Art@DrKendall.org
> Social Research Consultants
> University Park, MD USA  Inside the Washington, DC beltway.
> (301) 864-5570
>
>
> russell wrote:
>
> >Art,
> >
> >Thank you for the response. I believe that their researchers can
> >probably live with wider estimates, even though the results of the
> >study are not yet available. Furthermore, most of the reported stats
> >will be in the form of descriptive tables (crosstabs mostly). They
> >are only now beginning to tackle inferential research, etc.
> >
> >Russell
> >
> >-----Original Message-----
> >From: Art Kendall [mailto:Art@DrKendall.org]
> >Sent: 28 September 2005 04:31
> >To: russell
> >Cc: SPSSX-L@LISTSERV.UGA.EDU
> >Subject: Re: complex samples
> >
> >If you do not use the Complex Samples module, given that all of the
> >design factors are stratifications (fixed effects), the error
> >estimates (confidence intervals) for the whole pop will be wider than
> >necessary. If the intervals obtained using base SPSS are sufficiently
> >narrow that you can live with them, then you might forgo Complex
> >Samples. If you intend to compare and contrast sets of cells, then
> >you would be better off using the smaller error terms from Complex
> >Samples.
> >
> >"The size of the sample is partly based on cost considerations,
> >logistics etc."
> >
> >How are you gathering your data? By interview or by paper-and-pencil
> >reports from the schools?
> >
> >Keep in mind that the total cost of a survey is NOT a simple direct
> >function of the sample size, especially in phone or mail surveys. A
> >great deal of the cost is in instrument development, results
> >reporting, etc.
> >
> >"The NGO claims that dividing the schools into strata means that the
> >sample is more likely to be representative as you can ensure that
> >each of the strata is represented proportionally within the sample."
> >
> >Proportional representation is important for ease in calculating the
> >precision of pop estimates. It also helps the plausibility of the
> >design.
> >For comparing and contrasting strata etc., equal cell sizes yield
> >smaller error estimates.
> >
> >Art
> >Art@DrKendall.org
> >Social Research Consultants
> >University Park, MD USA  Inside the Washington, DC beltway.
> >(301) 864-5570
> >
> >
> >russell wrote:
> >
> >>Hi there,
> >>
> >>I am asking this question on behalf of an NGO that surveys primary
> >>schools. They look at service delivery issues, leakage of funding,
> >>corruption, etc. From approximately 5000 primary schools, the
> >>sampling frame is ten per cent (or 500 schools). The size of the
> >>sample is partly based on cost considerations, logistics, etc. The
> >>500 schools were selected through a random, two-stage probability
> >>proportionate to size selection process based on the number of
> >>schools. The schools were divided according to region: Northern,
> >>Central and Southern. Then the schools in each region were further
> >>divided into rural and urban schools. Finally, the schools were
> >>proportionally selected through random sampling. The NGO claims that
> >>dividing the schools into strata means that the sample is more
> >>likely to be representative, as you can ensure that each of the
> >>strata is represented proportionally within the sample.
> >>
> >>Does analysis of this data require the Complex Samples module in
> >>SPSS?
> >>
> >>Any thoughts are appreciated,
> >>Russell
```
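Hector's weight arithmetic (points 3 and 5 in the reply above) can be sketched in a few lines of Python. The stratum counts below are invented for illustration; only the formulas come from the post: expansion weight Nij/nij, and normalized weight (Nij/nij)*(n/N), which equals Pij/pij.

```python
# Sketch of the weighting scheme from the reply above (invented stratum counts).
# For each stratum ij:
#   expansion weight  W_ij = N_ij / n_ij          (point 3, inflates to population)
#   normalized weight w_ij = W_ij * n / N         (point 5, keeps sample scale)
# where n = total sample size and N = total schools in the frame.

# Hypothetical strata: (area, size bracket) -> (schools existing, schools sampled)
strata = {
    ("North", "small"): (1200, 100),
    ("North", "large"): (300, 60),
    ("South", "small"): (2500, 250),
    ("South", "large"): (1000, 90),
}

N = sum(N_ij for N_ij, _ in strata.values())  # total schools in the frame (5000)
n = sum(n_ij for _, n_ij in strata.values())  # total sample size (500)

weights = {}
pop_total = samp_total = 0.0
for key, (N_ij, n_ij) in strata.items():
    W_ij = N_ij / n_ij        # expansion weight
    w_ij = W_ij * n / N       # normalized weight
    # Equivalent form via proportions: P_ij / p_ij = (N_ij / N) / (n_ij / n)
    assert abs(w_ij - (N_ij / N) / (n_ij / n)) < 1e-9
    weights[key] = (W_ij, w_ij)
    pop_total += W_ij * n_ij   # weighted count of the sampled schools
    samp_total += w_ij * n_ij

# Expansion weights sum (over sampled schools) to the population total;
# normalized weights sum to the sample size.
print(round(pop_total, 6), round(samp_total, 6))  # 5000.0 500.0
```

Running crosstabs under the normalized weights leaves every percentage unchanged relative to the expansion weights; only the apparent case base differs, which is exactly what point 5 is about.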
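Point 5's warning about inflationary weights can be seen in miniature: a Pearson chi-square statistic scales linearly with total N, so running a test on expansion-weighted counts multiplies the statistic without changing any percentage. The 2x2 counts below are invented; the formula is the standard Pearson chi-square for a 2x2 table.

```python
# Why tests on expansion-weighted counts overstate significance (point 5).
# Scaling every cell of a 2x2 table by k multiplies the Pearson chi-square
# statistic by exactly k, even though all the percentages stay the same.

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    total = a + b + c + d
    return total * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )

observed = (30, 70, 45, 55)                 # counts from a sample of 200 schools
inflated = tuple(10 * x for x in observed)  # same table under 10x expansion weights

chi_obs = chi_square_2x2(*observed)
chi_inf = chi_square_2x2(*inflated)
print(round(chi_obs, 3), round(chi_inf, 3))  # 4.8 48.0
```

The inflated statistic crosses conventional significance thresholds that the honestly scaled one does not, which is why Hector rescales the weights by n/N before testing.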
