LISTSERV at the University of Georgia
Date:         Fri, 12 Jan 2007 18:46:00 -0500
Reply-To:     Peter Flom <flom@NDRI.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Flom <flom@NDRI.ORG>
Subject:      Re: normality of residuals: opinions?
Comments: To: kviel@EMORY.EDU
Content-Type: text/plain; charset=US-ASCII

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
http://cduhr.ndri.org
www.peterflom.com
(212) 845-4485 (voice)
(917) 438-0894 (fax)

>>> Kevin Roland Viel <kviel@EMORY.EDU> 01/12/07 4:38 PM >>> wrote <<< Too right. I should not have blind-sided the list like that. We measured the activity level of a plasma protein. The independent variable of interest is a score from an instrument. I expect that with a moderate sample size (200-500) the activity level would be suitably normally distributed. As David points out, though, it is the distribution of the residuals and not of the DV that is important (e ~ N(0, sigma)).

But your point brings up another question. What IF I know that my residuals *are* normally distributed from many other investigations, but for my current sample, this was not the case. Obviously, failure to meet the assumptions could foul the model. Besides thoroughly investigating potential violations, what might one do?

BTW, most of the IV's are quantitative (age, BMI, another protein level) so any clustering is surprising, not that I conclude that it happened. >>>

Kevin

First, although it may seem like picking nits, you cannot know that the residuals in YOUR data are normally distributed from the results of runs on OTHER data. I actually don't think it's picking nits at all. If there is something in your data that isn't present in other, similar samples, then either 1) You got unlucky. Hey, it happens. Once in a while, a random sample will include some strange data. or 2) You've discovered something really interesting. This is a 'gee... that's funny' moment, and that is the sort of moment that starts big discoveries.
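To make the point concrete, here is a minimal sketch of checking the residuals of the sample in hand rather than relying on earlier studies. It is written in Python only for illustration (the same check can be done in SAS), it assumes a single predictor instead of the several IVs in the real model, and the data and function names are made up:

```python
from statistics import fmean

def ols_residuals(x, y):
    """Residuals from a one-predictor least-squares fit y = a + b*x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def skew_and_excess_kurtosis(res):
    """Moment-based skewness and excess kurtosis; both are near 0 for normal data."""
    n = len(res)
    m = fmean(res)
    m2 = sum((r - m) ** 2 for r in res) / n
    m3 = sum((r - m) ** 3 for r in res) / n
    m4 = sum((r - m) ** 4 for r in res) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Made-up values, for illustration only (hypothetical score and activity level).
score = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
activity = [52, 55, 54, 60, 61, 63, 70, 66, 71, 95]

res = ols_residuals(score, activity)
skew, kurt = skew_and_excess_kurtosis(res)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

Values far from zero are only a prompt to look at the residual plot and at the individual observations behind it; they do not say whether you got unlucky or found something interesting.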

Second, clustering is always possible with numerical data. How did you get BMI? If you asked people to tell their weight and height, then I would bet dollars to donuts that the data ARE clustered. A lot more people report (say) 180 pounds than 179 or 181.

How were protein levels recorded? (I have no idea how this is done, but if a human has to read some instrument, I bet there's clumping).
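One rough way to look for this kind of clumping is to tabulate the final digit of the recorded values. The sketch below is in Python purely for illustration, the weights are made up, and the function names are hypothetical. With only 20 values the chi-square approximation is not trustworthy, so the statistic here is descriptive; with a few hundred observations it becomes more meaningful:

```python
from collections import Counter

def terminal_digit_counts(values):
    """Count how often each final digit (0-9) occurs in whole-number values."""
    counts = Counter(int(round(v)) % 10 for v in values)
    return [counts.get(d, 0) for d in range(10)]

def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Made-up self-reported weights in pounds, for illustration only.
reported = [180, 180, 175, 170, 180, 165, 150, 160, 180, 175,
            155, 190, 200, 185, 180, 170, 172, 168, 181, 179]

counts = terminal_digit_counts(reported)
stat = chi_square_uniform(counts)

for digit, count in enumerate(counts):
    print(digit, "#" * count)
print(f"chi-square statistic (9 df) = {stat:.1f}")
```

If the values were free of digit preference, each final digit would turn up about equally often; a pile-up on 0 and 5 is the usual signature of rounded self-reports or of a human reading a scale.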

HTH

Peter

