```
=========================================================================
Date:         Thu, 13 Jul 2006 11:31:32 -0500
Reply-To:     Anthony Babinec
Sender:       "SPSSX(r) Discussion"
From:         Anthony Babinec
Subject:      Re: A Distinctly Non-Normal Distribution
In-Reply-To:  <9E813CDD534BFC4391280E6C046D90A86E7C85@klondike.exch.ad.byu.edu>
Content-Type: text/plain; charset="US-ASCII"

Here are a couple of general comments.

While the normal distribution might be a useful assumed distribution for
errors in regression, there is no reason to think that it is necessarily
useful for summarizing all phenomena out there in the world.

As you have described your data, they are counts. In other words, values
are 1, 2, 3, etc., and not real values in some interval. Are you looking at
consumption in some fixed unit of time - say week, month, year? Given some
assumptions, there are distributions such as the Poisson that might be
appropriate.

It also could be the case that what you are studying represents a mixture
of types, say usage types (low, medium, high), though that may or may not
be the case here.

Pete Fader (Wharton) and Bruce Hardie (London Business School) have a nice
course on probability models in marketing that is regularly given at AMA
events.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Stevan Nielsen
Sent: Thursday, July 13, 2006 10:12 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: A Distinctly Non-Normal Distribution

Dear Colleagues,

I have stumbled upon an interesting phenomenon: I have discovered that
consumption of a valuable resource conforms to a very regular, reverse
J-shaped distribution. The modal case in our large sample (N = 16,000)
consumes one unit, the next most common case consumes two units, the next
most common three units, the next most common four units -- and this is the
median case, and so on.
The average is at about 9.7 units, which falls between the 72nd and 73rd
percentile in the distribution -- clearly NOT an indicator of central
tendency.

I used SPSS Curve Estimation to examine five functional relationships
between units consumed and proportion of consumers in the sample, testing
proportion of consumers in the sample as linear, logarithmic, inverse,
quadratic, or cubic functions of number of units consumed. I found that the
reciprocal model, estimating proportion of cases as the inverse of units
consumed, was clearly the best solution, yielding a remarkable and very
reliable R^2 = .966. All five models were reliable, but the next best was
the logarithmic solution, with R^2 = .539; worst was the linear model, with
R^2 = .102.

This seems like a remarkably regular, quite predictable relationship. I've
spent my career so enamored with normal distributions that I'm not sure
what to make of this distribution. I have several questions for your
consideration:

Do any of you have experience with such functions? (I believe it would be
correct to call this a decay function.)

Where are such functions most likely to occur in nature, commerce,
epidemiology, genetics, healthcare, and so on?

What complications arise when attempting to form statistical inferences
where such population distributions are present? (We have other
measurements for subjects in this distribution, measurements which are
quite nicely normal in their distributions.)

Your curious colleague,

lars nielsen

Stevan Lars Nielsen, Ph.D.
Brigham Young University
```
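The pattern described in the original message can be illustrated with a small synthetic sketch. It assumes the proportion of consumers using k units is exactly proportional to 1/k, up to a hypothetical cutoff `MAX_UNITS` that does not come from the post. It then computes the mode, median, and mean, and compares R^2 for linear, logarithmic, and inverse fits. The helper names are illustrative only, and the numbers it produces are not the poster's data or results.

```python
import math

def reciprocal_pmf(max_units):
    """Proportions proportional to 1/k for k = 1..max_units (assumed shape)."""
    weights = [1.0 / k for k in range(1, max_units + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def r_squared(x, y):
    """R^2 of the least-squares line y = a + b*x (squared correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

def median_of(units, probs):
    """Smallest value whose cumulative proportion reaches 0.5."""
    cumulative = 0.0
    for k, p in zip(units, probs):
        cumulative += p
        if cumulative >= 0.5:
            return k
    return units[-1]

MAX_UNITS = 40  # hypothetical cutoff, chosen only for illustration
units = list(range(1, MAX_UNITS + 1))
probs = reciprocal_pmf(MAX_UNITS)

mode = units[probs.index(max(probs))]
median = median_of(units, probs)
mean = sum(k * p for k, p in zip(units, probs))
# Share of cases consuming no more than the mean number of units.
share_at_or_below_mean = sum(p for k, p in zip(units, probs) if k <= mean)

# Regress proportion on k, ln(k), and 1/k, mirroring three of the
# five Curve Estimation models named in the message.
fits = {
    "linear": r_squared(units, probs),
    "logarithmic": r_squared([math.log(k) for k in units], probs),
    "inverse": r_squared([1.0 / k for k in units], probs),
}

print(f"mode={mode} median={median} mean={mean:.2f}")
print(f"share at or below mean={share_at_or_below_mean:.3f}")
for name, value in fits.items():
    print(f"{name:12s} R^2 = {value:.3f}")
```

Because the synthetic proportions are exactly reciprocal, the inverse fit is perfect by construction. The point of the sketch is only that a long right tail pulls the mean well above the median, and that a straight line in k describes this shape poorly.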
