Date:         Fri, 12 Jan 2007 16:38:48 -0500
Reply-To:     Kevin Roland Viel <kviel@EMORY.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Kevin Roland Viel <kviel@EMORY.EDU>
Subject:      Re: normality of residuals: opinions?
In-Reply-To:  <>
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Fri, 12 Jan 2007, Robin High wrote:

> "To transform or not to transform" has many implications -- esp. not
> knowing the data or the objectives -- interpretation and how to
> back-transform, among them.
>
> A LOG seems a bit extreme here; perhaps a square root would be another
> choice. Of concern are the values of e around 70-80; perhaps they are
> outliers that ROBUSTREG could be an alternative. And also the spike for e
> around -10 -- is there a clustering of values say at a boundary point?


Too right. I should not have blind-sided the list like that. We measured the activity level of a plasma protein. The independent variable of interest is a score from an instrument. I expect that, with a moderate sample size (200-500), the activity level would be suitably normally distributed. As David points out, though, it is the distribution of the residuals, and not of the DV, that is important (e ~ N(0, sigma)).
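To illustrate the point about residuals versus the dependent variable, here is a minimal sketch in Python (not SAS) on simulated data. Everything in it is an assumption for illustration only: the exponential predictor, the coefficients, the sample size of 400, and the helper `skewness` are hypothetical and have nothing to do with the actual protein data. It shows a case where the dependent variable is clearly skewed while the residuals from the linear fit are close to symmetric, as e ~ N(0, sigma) requires.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400  # hypothetical sample size, inside the 200-500 range

# Simulated (made-up) data: a right-skewed predictor and normal errors.
x = rng.exponential(scale=2.0, size=n)
e = rng.normal(loc=0.0, scale=1.0, size=n)
y = 1.0 + 3.0 * x + e  # the dependent variable inherits the skew of x

# Ordinary least squares fit of y on an intercept and x.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta


def skewness(v):
    """Sample skewness: third central moment over variance**1.5."""
    v = np.asarray(v, dtype=float)
    d = v - v.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5


skew_y = skewness(y)
skew_resid = skewness(resid)

print(f"slope estimate:        {beta[1]:.3f}")
print(f"skewness of y:         {skew_y:.3f}")
print(f"skewness of residuals: {skew_resid:.3f}")
```

In this setup the marginal distribution of y is far from normal, yet the model assumption is met, because the assumption concerns the errors and not y itself.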

But your point brings up another question. What IF I know from many other investigations that my residuals *are* normally distributed, but in my current sample this is not the case? Obviously, failure to meet the assumptions could foul the model. Besides thoroughly investigating potential violations, what might one do?

BTW, most of the IVs are quantitative (age, BMI, another protein level), so any clustering would be surprising; not that I conclude that it happened.

Thank you,


Kevin Viel
PhD Candidate
Department of Epidemiology
Rollins School of Public Health
Emory University
Atlanta, GA 30322

