LISTSERV at the University of Georgia
Date:         Thu, 5 Aug 1999 12:32:35 -0400
Reply-To:     Peter Flom <peter.flom@NDRI.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Flom <peter.flom@NDRI.ORG>
Subject:      Re: Real stats on real big data?
Content-Type: text/plain; charset=US-ASCII

>>> "Berryhill, Tim" <TWB2@PGE.COM> 08/05/99 11:50AM >>> wrote
>>>Let me start by saying I haven't been paid for statistical work for 15 years
>>>or more, so take this with some skepticism.

>>>3) A curious thing I have noticed with large datasets, which perhaps argues
>>>in favor of samples, is that with 20 M obs every difference is significant.
>>>I expect this is based on my incorrect application of statistics--assuming a
>>>distribution is normal when in fact there is a minimum and such. It wasn't
>>>a problem back in Oregon when we had N's of 17 or 288.

My reply

It is true that everything is significant with a really large N, but this is not because of any incorrect application of statistics; it is inherent in the process. As N gets larger, you get more precise estimates of the population parameters, so you are able to detect smaller effects.

So, if you are doing (say) a t-test, you will be able to detect very small differences between means. Since the means of the two populations are never EXACTLY equal, with large enough N you will always find a statistically significant difference between two samples. Whether that difference is meaningful for any practical purpose is another matter.
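
To put rough numbers on this, here is a small Python sketch (not SAS, and not anything from the original thread). It assumes a hypothetical true difference between the two population means of 0.01 standard deviations, takes the observed difference to be exactly that true difference, and uses the large-sample normal approximation to the two-sample t-test. The names `expected_z` and `two_sided_p` are made up for illustration. The group sizes are Tim's N's of 17 and 288, plus 10 million per group to stand in for his 20 M obs.

```python
import math

def expected_z(delta, sigma, n):
    """Test statistic for comparing two group means, n obs per group,
    assuming the observed mean difference equals the true difference delta
    and both groups share standard deviation sigma."""
    standard_error = sigma * math.sqrt(2.0 / n)
    return delta / standard_error

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution
    (a large-sample approximation to the t distribution)."""
    return math.erfc(abs(z) / math.sqrt(2.0))

delta = 0.01   # hypothetical true difference: 1% of a standard deviation
sigma = 1.0    # hypothetical common standard deviation

for n in (17, 288, 10_000, 10_000_000):
    z = expected_z(delta, sigma, n)
    print(f"n per group = {n:>10,}   z = {z:6.2f}   p = {two_sided_p(z):.3g}")
```

The difference being tested is the same tiny one on every row; only N changes. The standard error shrinks like 1/sqrt(N), so the same difference goes from nowhere near significant at N = 17 to overwhelmingly significant at 10 million per group.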

Peter Flom, Ph.D.
Principal Research Associate
NDRI
2 World Trade Center 16th floor
New York, NY 10048

(212) 845-4485 (voice)
(212) 845-4698 (fax)
Peter.Flom@ndri.org

