LISTSERV at the University of Georgia
Date:         Sat, 9 Dec 2006 15:30:17 -0500
Reply-To:     Richard Ristow <wrristow@mindspring.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Richard Ristow <wrristow@mindspring.com>
Subject:      Re: guessing mean of bounded variable with 1:30 sampling ratio
Comments: To: nicola.baldini2@unibo.it
Comments: cc: Statisticsdoc <statisticsdoc@COX.NET>
In-Reply-To:  <7.0.0.16.2.20061201115818.021fb5e0@unibo.it>
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 06:11 AM 12/8/2006, Nicola Baldini asked:

>I have a population of N=12000. I want to know the mean (and possibly
>the standard deviation) of a variable x, bounded between 1 and 7. I
>took a (let's suppose random) sample of n=400 and estimated mean =
>3.14 (standard error = .15) and standard deviation = 2.28. Can I trust
>such estimates?

To which, at 10:41 AM 12/9/2006, Stephen Brand replied:

>You have the Central Limit Theorem working for you here. Even though
>the distribution of individual cases is not normal, the distribution of
>sample means (with a sample size of 400) will approximate the normal
>distribution and should provide you with a reasonable estimate of the
>population mean and the standard error of the means of samples of 400
>cases.

To which I'll add, the Central Limit Theorem has an important ally here. Because your variable is bounded, its population mean and standard deviation are bounded too (1<=mean<=7; 0<=SD<=3, if my arithmetic's right*), so convergence should be rapid, plenty good enough with n=400. THAT's not your problem.
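
As a check on that arithmetic, here is a minimal Python sketch, not anything from the original poster's analysis. The SD of a variable confined to an interval is largest when half the population sits on each endpoint, which gives half the width of the interval. The function names max_sd and standard_error are my own, and the finite population correction is the usual one for simple random sampling without replacement.

```python
import math

def max_sd(lo, hi):
    """Largest possible population SD for a variable confined to [lo, hi].

    The maximum is reached when half the population sits on each endpoint.
    """
    return (hi - lo) / 2

def standard_error(sd, n, N=None):
    """Standard error of the mean for a simple random sample of size n.

    If the population size N is given, apply the finite population
    correction for sampling without replacement.
    """
    se = sd / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

sd_bound = max_sd(1, 7)                           # bound on the SD of x
se_plain = standard_error(2.28, 400)              # from the reported SD
se_fpc = standard_error(2.28, 400, N=12000)       # with the correction
se_worst = standard_error(sd_bound, 400)          # if the SD were at its bound

print(f"max SD on [1, 7]:        {sd_bound:.2f}")
print(f"SE from SD = 2.28:       {se_plain:.3f}")
print(f"SE with FPC, N = 12000:  {se_fpc:.3f}")
print(f"worst-case SE, n = 400:  {se_worst:.3f}")
```

Whatever the shape of the distribution, the standard error with n=400 cannot exceed the worst-case figure, which is why the sample size itself is not the worry.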

Here's your problem: "I took a (let's suppose random) sample of n=400." Nope; no supposing. The arguments using the Law of Large Numbers and Central Limit Theorem only apply if the sample is random. You need to have a decent argument that you have a random sample, or at least that the chance of a case ending up in your sample is independent of the variable x.

You wave a big red flag: "I need to state formally that, despite a ridiculous response rate, my research is not that bad." 'Response rate': your 400 are respondents to a survey? How many did you survey - all 12,000? If so, there's no practical chance that the 400 who answered, a response rate of about 3%, are a random sample of the population, not even approximately.

If you sampled a fraction, selected randomly, and had a higher response rate within that fraction, you may have a good argument. Otherwise, I'm afraid not likely.
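
To make the point concrete, here is a small Python simulation. It is only a sketch under invented assumptions: the population is made up (x uniform on the integers 1 to 7), and so is the response model, in which the chance of answering is proportional to x and averages about 3%. Nothing here is taken from the actual survey.

```python
import random

rng = random.Random(2006)  # fixed seed so the run is repeatable

# Hypothetical population of 12,000 cases, x bounded between 1 and 7.
N = 12000
population = [rng.randint(1, 7) for _ in range(N)]
pop_mean = sum(population) / N

# A genuine simple random sample of 400 cases.
srs = rng.sample(population, 400)
srs_mean = sum(srs) / len(srs)

# Self-selected respondents: everyone is surveyed, but the chance of
# answering rises with x (assumed model), averaging roughly 3%.
respondents = [x for x in population if rng.random() < 0.03 * x / 4]
resp_mean = sum(respondents) / len(respondents)

print(f"population mean:           {pop_mean:.2f}")
print(f"random sample mean:        {srs_mean:.2f}  (n = {len(srs)})")
print(f"self-selected sample mean: {resp_mean:.2f}  (n = {len(respondents)})")
```

Under these assumptions the random sample lands close to the population mean, while the self-selected group overshoots it by far more than its standard error would suggest. The number of respondents is about the same in both cases; only the selection mechanism differs.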

