LISTSERV at the University of Georgia
Date:         Sun, 29 Apr 2007 22:53:03 -0700
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: Central Limit Theorem
In-Reply-To:  <923870.50745.qm@web32210.mail.mud.yahoo.com>
Content-Type: text/plain; format=flowed

stringplayer_2@YAHOO.COM expertly replied:
>
>--- Peter Flom <peterflomconsulting@MINDSPRING.COM> wrote:
>
> > I am reviewing and editing a statistics book.
> >
> > In it, they have the following statement re the central limit theorem
> >
> > "Regardless of the shape of the population, if a sufficiently large
> > random sample of size n is taken from the population, then the sample
> > is approximately normally distributed, with mean mu sub xbar and
> > standard deviation sigma/sqrt(n)"
> >
> >
> > ????
> >
> > This seems completely wrong!
> >
> > The CLM is not about ONE sample, but about MANY samples. That is, it
> > should be
> >
> > "Regardless of the shape of the population, if a sufficiently large
> > NUMBER of samples of a particular size is taken, then the
> > distribution of the mean of the SAMPLES approaches normal as the
> > NUMBER of samples approaches infinity"
> >
> > I googled a bit, and did not find a really good clear statement of
> > this - the book is intended for HS students.
> >
> > Anyone got any suggestions? Am I all wrong? Is the author all wrong?
> > Are we BOTH wrong?
> >
> > Happy weekend!
> >
> > Peter
> >
>
>Good catch, Peter. The authors are being very loose with their
>presentation here.
>
>Of course, it is the sample mean (not the sample) which converges
>to a normal distribution under the CLT. But even for the CLT to
>hold, we must have finite mean and variance. Try applying the
>CLT to a Cauchy distribution! The sample mean ain't gonna converge
>to a normal distribution in the entire time of God's existence!
>
>From my copy of Hogg and Tanis, the CLT can be stated as:
>
>Let Xbar{n} be the mean of a random sample X{1}, X{2}, ..., X{n}
>of size n from a distribution with a finite mean mu and a finite
>positive variance sigma^2. Then the distribution of
>
>   W{n} = (Xbar{n} - mu) / (sigma/sqrt(n))
>
>        = (sum from i=1 to n of X{i} - n*mu) / (sqrt(n)*sigma)
>
>is N(0,1) in the limit as n-->infinity.
>
>
>Dale

[1] Dale is right, as always.

[2] I'm guessing that this was just a sloppy re-statement of the CLT, trying to peg the reading level of the intended target audience. I recommend a proper statement of the theorem, plus some more explanation so it makes sense to the layperson.
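For the layperson, a small simulation may make the point more concrete than the theorem statement does. The sketch below is only an illustration, not anything from the book or from Hogg and Tanis: the choice of Python, the Exponential(1) population, the sample sizes, the seed, and the helper names (standardized_means, coverage, cauchy_mean_iqr) are all assumptions made for the example. It standardizes the sample mean as W{n} = (Xbar{n} - mu) / (sigma/sqrt(n)) and checks how often W{n} lands within +/-1.96, which should be about 95% if W{n} is close to N(0,1). It then repeats Dale's Cauchy experiment: with no finite mean or variance there is nothing to standardize by, so it just measures the spread (interquartile range) of the raw sample means.

```python
import math
import random
import statistics

def standardized_means(draw, mu, sigma, n, reps, seed=0):
    """Simulate reps values of W_n = (Xbar_n - mu) / (sigma / sqrt(n))."""
    rng = random.Random(seed)
    scale = sigma / math.sqrt(n)
    return [
        (statistics.fmean(draw(rng) for _ in range(n)) - mu) / scale
        for _ in range(reps)
    ]

def coverage(ws, z=1.96):
    """Fraction of values in [-z, z]; about 0.95 for a standard normal."""
    return sum(abs(w) <= z for w in ws) / len(ws)

def cauchy_mean_iqr(n, reps, seed=0):
    """Interquartile range of the mean of n standard Cauchy draws."""
    rng = random.Random(seed)
    means = [
        # Inverse-CDF draw from a standard Cauchy: tan(pi * (U - 1/2)).
        statistics.fmean(math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n))
        for _ in range(reps)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

# Exponential(1) is strongly skewed, with mean 1 and standard deviation 1.
expo_cov = coverage(
    standardized_means(lambda rng: rng.expovariate(1.0), 1.0, 1.0, n=200, reps=2000)
)
iqr_1 = cauchy_mean_iqr(n=1, reps=2000)
iqr_200 = cauchy_mean_iqr(n=200, reps=2000)

print("exponential, n=200: share of W_n within +/-1.96:", round(expo_cov, 3))
print("Cauchy: IQR of sample means, n=1:", round(iqr_1, 2), " n=200:", round(iqr_200, 2))
```

With a finite variance, the spread of the sample mean shrinks like 1/sqrt(n). For the Cauchy it does not shrink at all: the mean of n standard Cauchy draws is again standard Cauchy, so both interquartile ranges should come out near 2.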

[3] And, of course, there's more than one CLT. Often, CLT refers to a result that asserts that if you have a realization {X_1, ...} where X_i meets certain rules, and you define S_n as the sum of X_1 up to X_n, then there is going to be a sequence of norming constants a_n and b_n such that (S_n - a_n)/b_n converges in distribution (where 'converges in distribution' really is a technical term) to the standard normal. If you care, this double array of X's (K_n X's for each n>=1 where K_n goes to infinity as n does) has to be holospoudic in the sense of Wigodsky as a necessary and sufficient condition for convergence to that normal.

So what the h3!! does that mean? Well, it means that the variables do not have to have the same mean and variance, or even the same underlying distribution in order to get this convergence! Basically, as long as there is no random variate that by itself can blow the whole sum off towards infinity, then we get this kind of convergence.
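Here is a rough sketch of that idea, under assumptions that are mine and not part of any theorem statement: the particular mix of uniform, exponential, and Bernoulli terms, n=300, the seeds, and the names normed_sum and lower_tail are all made up for illustration. The norming constants are taken as a_n = sum of the means and b_n = square root of the sum of the variances. The first run uses many small, unlike terms. The second run adds a single term, n times an Exponential(1), whose variance swamps everything else, so one variate drives the whole sum.

```python
import math
import random

def normed_sum(rng, n, dominate=False):
    """(S_n - a_n) / b_n for independent, non-identically distributed terms."""
    total = a_n = var_sum = 0.0
    for i in range(n):
        kind = i % 3
        if kind == 0:
            c = 1 + (i % 5)                      # Uniform(0, c), c cycling 1..5
            x, mean, var = rng.uniform(0, c), c / 2, c * c / 12
        elif kind == 1:                          # Exponential(1)
            x, mean, var = rng.expovariate(1.0), 1.0, 1.0
        else:                                    # Bernoulli(0.2)
            x, mean, var = float(rng.random() < 0.2), 0.2, 0.16
        total += x
        a_n += mean
        var_sum += var
    if dominate:
        # One extra term, n * Exponential(1): mean n, variance n^2.
        total += n * rng.expovariate(1.0)
        a_n += n
        var_sum += n * n
    return (total - a_n) / math.sqrt(var_sum)

def lower_tail(dominate, n=300, reps=2000, z=1.645, seed=0):
    """Fraction of normed sums below -z; about 0.05 for a standard normal."""
    rng = random.Random(seed)
    return sum(normed_sum(rng, n, dominate) < -z for _ in range(reps)) / reps

mixed_tail = lower_tail(dominate=False)
dominated_tail = lower_tail(dominate=True)

print("mixed small terms, share below -1.645:", round(mixed_tail, 3))
print("one dominant term, share below -1.645:", round(dominated_tail, 3))
```

In the first run the lower tail should sit near the normal value of 0.05 even though no two neighboring terms share a distribution. In the second, the normed sum behaves roughly like an exponential minus 1, which can never drop below about -1, so the lower tail is essentially empty.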

Of course, no one is promising how far out we have to go before we get a decent fit to normality under these weirder conditions. n=30 wouldn't do for some of them...
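To see how badly the n=30 rule of thumb can miss, here is one more hedged sketch. The two populations (Uniform(0,1) and a lognormal with log-scale standard deviation 1.5), the repetition count, the seed, and the helper name lower_tail are assumptions chosen for the example. Both populations have finite mean and variance, so the ordinary CLT applies to both. The code standardizes the mean of n=30 draws and counts how often it falls below -1.645, which would happen about 5% of the time under N(0,1).

```python
import math
import random
import statistics

def lower_tail(draw, mu, sigma, n, reps, z=1.645, seed=0):
    """Share of standardized sample means below -z (about 0.05 if normal)."""
    rng = random.Random(seed)
    scale = sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        w = (statistics.fmean(draw(rng) for _ in range(n)) - mu) / scale
        hits += w < -z
    return hits / reps

# Lognormal with log-mean 0 and log-sd S: heavily right-skewed, finite variance.
S = 1.5
ln_mu = math.exp(S * S / 2)
ln_sigma = math.sqrt((math.exp(S * S) - 1) * math.exp(S * S))

uniform_tail = lower_tail(
    lambda rng: rng.random(), 0.5, math.sqrt(1 / 12), n=30, reps=4000
)
lognormal_tail = lower_tail(
    lambda rng: rng.lognormvariate(0.0, S), ln_mu, ln_sigma, n=30, reps=4000
)

print("uniform,   n=30, share below -1.645:", round(uniform_tail, 3))
print("lognormal, n=30, share below -1.645:", round(lognormal_tail, 3))
```

The uniform case should land close to 0.05 at n=30. The lognormal case should show almost nothing in the lower tail, because the sample mean is still far too right-skewed at that sample size.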

HTCT,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
