LISTSERV at the University of Georgia
Date:         Mon, 9 Aug 1999 17:32:54 +0100
Reply-To:     John Whittington <medisci@POWERNET.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         John Whittington <medisci@POWERNET.COM>
Subject:      Re: Real stats on real big data?
Comments: To: "Berryhill, Tim" <TWB2@PGE.COM>
In-Reply-To:  <>
Content-Type: text/plain; charset="us-ascii"

At 18:00 06/08/99 -0700, Berryhill, Tim wrote:

>In the mapping example below, it might have been sufficient to sample one
>point in each bin.  Drawing a 1% sample from California might give you only
>people in Los Angeles.  You could easily miss entire counties.

Tim, stratified samples are obviously fine if one is interested in looking at some sort of 'characteristics' of what is in each 'bin', but the approach clearly can't be used if the purpose of the exercise is to estimate the *number* of items in each 'bin' - which, as far as I can make out, was what was wanted in this example.

I think the point you make above illustrates why any sort of sampling/data-reduction method is probably inappropriate to the mapping exercise - since, unless one seeks only very 'coarse' information (i.e. very large bins), one will invariably 'chop off the bottom of the data' - and, as you say, could miss whole towns/counties.
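To put rough numbers on both points, here is a minimal Python sketch. The bin names and sizes are invented purely for illustration, and the helper names are my own, not anything from the original mapping job. It draws a 1% simple random sample, works out the chance that a given bin is missed altogether, and shows that a one-point-per-bin sample covers every bin but carries no information about bin counts.

```python
import math
import random
from collections import Counter

# Hypothetical bin sizes (assumption): one huge bin and some very small ones.
bin_sizes = {"big_city": 90_000, "mid_town": 9_900, "small_town": 90, "hamlet": 10}

# One label per item, so the population is just a long list of bin labels.
population = [label for label, n in bin_sizes.items() for _ in range(n)]


def simple_random_sample(items, fraction, rng):
    """Draw a simple random sample (without replacement) of the given fraction."""
    k = max(1, round(len(items) * fraction))
    return rng.sample(items, k)


def prob_bin_missed(total, bin_count, sample_size):
    """Chance that a sample of sample_size items contains nothing from the bin."""
    if total - bin_count < sample_size:
        return 0.0  # the sample is too large to avoid the bin entirely
    return math.prod(
        (total - bin_count - i) / (total - i) for i in range(sample_size)
    )


def one_per_bin(items):
    """Stratified sample in the sense discussed: exactly one point per bin."""
    return sorted(set(items))


rng = random.Random(1999)  # fixed seed so the run is repeatable
srs = simple_random_sample(population, 0.01, rng)
missed = set(bin_sizes) - set(srs)  # bins with no sampled item at all

# Scaling the sample counts back up gives nothing for a missed bin.
estimated_counts = {label: n * 100 for label, n in Counter(srs).items()}

stratified = one_per_bin(population)
stratified_counts = Counter(stratified)  # every bin shows a count of exactly 1

print("bins missed by the 1% sample:", sorted(missed))
print("estimated counts from the 1% sample:", estimated_counts)
print("P(hamlet missed):", prob_bin_missed(len(population), 10, len(srs)))
print("counts seen in the one-per-bin sample:", dict(stratified_counts))
```

With these made-up sizes, the chance of the 10-item bin dropping out of a 1% sample is about 0.9, whereas the one-per-bin sample never misses a bin but reports a count of 1 for all of them, large or small.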

Kind Regards,

John

----------------------------------------------------------------
Dr John Whittington,       Voice:    +44 (0) 1296 730225
Mediscience Services       Fax:      +44 (0) 1296 738893
Twyford Manor, Twyford,    E-mail:
Buckingham  MK18 4EL, UK
----------------------------------------------------------------
