Date: Thu, 29 Sep 2005 10:10:42 -0300
Reply-To: Hector Maletta <hmaletta@fibertel.com.ar>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Hector Maletta <hmaletta@fibertel.com.ar>
Subject: Re: complex samples
In-Reply-To: <433BDFD8.9040904@DrKendall.org>
Content-Type: text/plain; charset="us-ascii"
Looking at the sample design described in the original post, I surmise it is
stratified without clustering, and furthermore, the selection of the final
units (schools) is done randomly and proportional to size (i.e., I suppose,
proportional to number of students per school). Therefore schools are
stratified by factors (i) area and (j) size bracket. According to this:
1. If no clustering is present, the complex samples module is not
necessary; only weights based on the reciprocal of the sampling ratios are
needed.
2. Variables observed are at school level (such as funding, etc.) and not at
student level. Sampling proportional to size means that bigger schools have
a higher probability of being selected than smaller schools. Therefore sampling
ratios are different for different size brackets, as they probably are
different for different zones.
3. Weights in this case are basically defined as the number of schools in
size bracket A existing in a certain stratum, divided by the number of
schools selected in the same stratum. These weights correct for different
sampling ratios in different areas and sizes (proportionality effect) and
also yield totals inflated to population size (expansion effect).
4. Schools selected but failing to provide data may or may not be
considered in the denominator of those weights, depending on some
information or assumption regarding their characteristics. If failures
appear to be unrelated to other variables (other than location and size),
you may assume they were random and therefore define weights as total
schools existing in the bracket and area, divided by schools ACTUALLY
RESPONDING in the same bracket and area. Results will expand to total
schools in the stratum. Otherwise you may divide by the number of schools
SELECTED, but the results will expand only to schools actually responding.
5. The weights described above are inflationary. SPSS would compute
significance tests based on the WEIGHTED number of cases, which in this
situation would be equivalent to the total number of schools existing in
the area and bracket. This overstates the sample size and overestimates
significance. Therefore, the weights should be corrected so that they
account only for proportionality but leave the scale of the sample
untouched. This is done by multiplying the above weights by n/N, where
n = total sample size and N = total number of schools (the weights
themselves remain indexed by i = area and j = school size bracket). The
same result obtains if the weights are defined on the basis of
proportions, not total numbers. For this, define:
Pij = Nij/N = schools existing in stratum ij divided by the total number
of schools existing in the area covered by the study
pij = nij/n = schools selected (or actually responding) in stratum ij
divided by the total sample size
The new factors are defined as Pij/pij. These factors correct for
proportionality but not for scale, and would give correct figures for
significance tests.
6. If this type of sample is done with a constant selection probability in
all areas and size brackets, all the weights would be the same (except
perhaps for different numbers of nonresponding schools). If so, no
weighting is necessary. Unweighted significance tests would reflect the
significance of findings without bias.
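The weighting options in points 3 and 4 can be sketched with made-up
stratum counts; Python is used here purely for illustration (the actual
computation would normally be done with SPSS weighting), and all numbers
are invented:

```python
# Hypothetical stratum counts (invented for illustration): for each
# (area, size bracket) stratum, N = schools existing, n = schools
# selected, r = schools actually responding.
strata = {
    ("North", "large"): {"N": 120, "n": 12, "r": 10},
    ("North", "small"): {"N": 300, "n": 30, "r": 27},
    ("South", "large"): {"N": 80,  "n": 8,  "r": 8},
}

# If nonresponse looks random within a stratum, divide by schools
# ACTUALLY RESPONDING: results expand to all schools in the stratum.
w_responding = {s: c["N"] / c["r"] for s, c in strata.items()}

# Otherwise divide by schools SELECTED: results expand only to the
# schools actually responding.
w_selected = {s: c["N"] / c["n"] for s, c in strata.items()}

print(w_responding[("North", "large")])  # 12.0 (= 120/10)
print(w_selected[("North", "large")])    # 10.0 (= 120/12)
```

Where all selected schools respond (as in the "South large" stratum
above), the two definitions coincide.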
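To see that the Pij/pij factors of point 5 correct for proportionality
while leaving the scale of the sample untouched, here is a minimal sketch
with invented counts (again Python, purely for illustration):

```python
# Hypothetical schools existing (N_ij) and selected (n_ij) per stratum;
# all numbers are invented for illustration.
N_ij = {"rural-large": 120, "rural-small": 300, "urban-large": 80}
n_ij = {"rural-large": 20,  "rural-small": 20,  "urban-large": 10}

N = sum(N_ij.values())  # 500 schools existing
n = sum(n_ij.values())  # 50 schools sampled

# Normalized weight Pij/pij = (N_ij/N) / (n_ij/n), i.e. the expansion
# weight N_ij/n_ij multiplied by the constant factor n/N.
w_norm = {s: (N_ij[s] / N) / (n_ij[s] / n) for s in N_ij}

# The weighted case count equals the actual sample size n, so
# significance tests are based on the true scale of the sample.
weighted_n = sum(w_norm[s] * n_ij[s] for s in N_ij)
print(round(weighted_n, 6))  # 50.0

# With a constant sampling fraction n_ij/N_ij in every stratum (point 6),
# every w_norm would equal 1 and no weighting would be needed.
```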
Hope this helps.
Hector
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> On Behalf Of Art Kendall
> Sent: Thursday, September 29, 2005 9:37 AM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: Re: complex samples
>
> The wider estimates I was referring to were for POP totals,
> means, percentages, etc. As long as you use the correct
> weights, the point estimates will be right. The interval
> estimates will be off. Of course, if the proportion of
> nonresponse is not the same in all cells, you need to
> further adjust the weights.
>
> One big danger with things like crosstabs is talking about
> observed percentages as if they were different when the
> difference is readily attributed to the vagaries of sampling.
> It is crucial that some test find the difference to be
> inconsistent with sampling error before reporting them as
> different. For any interpretation of results in terms of
> policy, practice, etc., comparisons of point estimates that
> do not come up as different should be treated as if they were
> the same. For example, if 25% of the schools in the South
> use red pencils on official forms, and 10% of the schools in
> the West use red pencils on official forms, the difference in
> those percentages may very well be consistent with what one
> might observe simply due to sampling error. If you test the
> point estimate differences for statistical significance using
> error terms as if you had a simple random sample (i.e.,
> ignoring the degree to which the stratification variables
> account for some of the variance), you will find fewer
> differences to be significant. You will be losing power, but
> the differences that do come up will have plausibility. Of
> course, report that the error estimates are inflated.
>
> If this is a one time study, you might be able to find another NGO,
> university, etc., that has the complex samples module. Do all of
> your runs in your base SPSS using the weights. Once
> everything is ready, send the system file to the other agency.
>
> Check the SPSS site. You might also be able to get a trial
> version that you should be able to use for 30 days if
> everything is ready to go. Also check whether the
> educational price is really that high and whether the agency
> qualifies as educational.
>
> Art
> Art@DrKendall.org
> Social Research Consultants
> University Park, MD USA Inside the Washington, DC beltway.
> (301) 864-5570
>
>
> russell wrote:
>
> >Art,
> >
> >Thank you for the response. I believe that their researchers can
> >probably live with wider estimates, even though the results of the
> >study are not yet available. Furthermore, most of the reported stats
> >will be in the form of descriptive tables (cross tabs mostly). They are
> >now only beginning to tackle inferential research etc. Russell
> >
> >-----Original Message-----
> >From: Art Kendall [mailto:Art@DrKendall.org]
> >Sent: 28 September 2005 04:31
> >To: russell
> >Cc: SPSSX-L@LISTSERV.UGA.EDU
> >Subject: Re: complex samples
> >
> >If you do not use the complex samples module, given that all of the
> >design factors are stratifications (fixed effects), the error estimates
> >(confidence intervals) for the whole pop will be wider than necessary.
> >If the obtained intervals using the base are sufficiently narrow that
> >you can live with them, then you might forego using the complex
> >samples. If you intend to compare and contrast sets of cells, then you
> >would be better off using the smaller error terms from complex samples.
> >
> >"The size of the sample is partly
> >based on cost considerations, logistics etc."
> >
> >How are you gathering your data? By interview or by paper-and-pencil
> >reports by the schools?
> >
> >Keep in mind that the total cost of a survey is NOT a simple direct
> >effect of the sample size, especially in phone or mail surveys. A
> >great deal of the cost is in instrument development, results
> >reporting, etc.
> >
> >
> >"The NGO claims that dividing the
> >schools into strata means that the sample is more likely to be
> >representative as you can ensure that each of the strata is
> >represented proportionally within the sample."
> >
> >Proportional representation is important for ease in calculation of
> >precision of pop estimates. It also helps plausibility of
> >the design.
> >For comparing and contrasting strata etc., equal cell sizes yield
> >smaller error estimates.
> >
> >Art
> >Art@DrKendall.org
> >Social Research Consultants
> >University Park, MD USA Inside the Washington, DC beltway.
> >(301) 864-5570
> >
> >
> >
> >russell wrote:
> >
> >
> >
> >>Hi there,
> >>
> >>I am asking this question on behalf of an NGO that surveys primary
> >>schools. They look at service delivery issues, leakage of funding,
> >>corruption etc. From a sampling frame of approximately 5000 primary
> >>schools, the sample is ten per cent (or 500 schools). The size of the
> >>sample is partly based on cost considerations, logistics etc. The 500
> >>schools were
> >>selected through a random, two-stage, probability-proportionate-to-size
> >>selection process based on the number of schools. The schools were
> >>divided according to regions: Northern region, Central region and
> >>Southern region. Then the schools in each region were further divided
> >>into rural and urban schools. Finally, the schools were proportionally
> >>selected through random sampling. The NGO claims that dividing the
> >>schools into strata means that the sample is more likely to be
> >>representative as you can ensure that each of the strata is
> >>represented proportionally within the sample.
> >>
> >>Does analysis of this data require the complex samples module in SPSS?
> >>
> >>Any thoughts are appreciated,
> >>Russell
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
> >
> >
> >
>
>
>
