LISTSERV at the University of Georgia
Date:         Mon, 12 Sep 2005 14:59:38 -0400
Reply-To:     Derek Wilkinson <dwilkinson@laurentian.ca>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Derek Wilkinson <dwilkinson@laurentian.ca>
Subject:      Re: data transformation bibliografical sources
Comments: To: Hector Maletta <hmaletta@fibertel.com.ar>
In-Reply-To:  <S163921AbVIKSae/20050911183044Z+165597@avas-mr07.fibertel.com.ar>
Content-Type: text/plain; charset="iso-8859-1"

Jorge and Hector:

The most original and pervasive account is John Tukey's Exploratory Data Analysis. The three-volume pre-publication version had a lot more than the final version published under that title. Some of that material was published in Mosteller & Tukey, Data Analysis and Regression: A Second Course. EDA is a very quirky book but a brilliant one, and if you are experienced you will find real gems in it.

Of the general stats books, John Fox has a very good treatment of transformations, but his is pretty mathematical. Bonnie Erikson has a more introductory version.

I need to disagree with two comments from Hector.

First, for much of social science there is no a priori meaningful scale, so transformed variables (if the transformations are increasing) often have as much legitimacy as the original, or more. This is particularly true of income. How could Jorge have the same error in his calculated income as Bill Gates does in his? Errors and misestimates are obviously related to size, ergo the necessity of logging.

Second, there isn't always the possibility of finding an abstruse mathematical formula (unless it's stochastic) to create normality. I have had students (albeit without much background in math) try to transform gender (M or F) into a normally distributed and symmetric variable. Square roots and logarithms didn't work! Neither did anything else.

Cheers.

Derek

PS Samuel Leonhardt did a didactic workshop at an American Sociological Association meeting twenty-some years ago and lucidly introduced me and all others who attended to the virtues of Exploratory Data Analysis and the insights of John Tukey.
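The point about gender can be checked numerically. The sketch below is only an illustration: the 1/2 coding and the 30/70 split are made-up values, and `summarize` is a hypothetical helper. It applies several increasing transformations to a two-valued variable and reports how many distinct values remain and what the standardized skewness is.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Standardized third moment (population form)."""
    m = mean(values)
    s = pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)


def summarize(values, transform):
    """Return (number of distinct values, skewness) after transforming."""
    transformed = [transform(v) for v in values]
    return len(set(transformed)), skewness(transformed)


# Hypothetical data: 30 cases coded 1.0 and 70 cases coded 2.0.
gender = [1.0] * 30 + [2.0] * 70

# All of these are increasing on the values used here.
transforms = {
    "identity": lambda v: v,
    "square root": math.sqrt,
    "logarithm": math.log,
    "cube": lambda v: v ** 3,
}

for name, transform in transforms.items():
    distinct, skew = summarize(gender, transform)
    print(f"{name:12s} distinct values: {distinct}  skewness: {skew:.4f}")
```

Every increasing transformation leaves exactly two distinct values, and the skewness does not move, because for a two-point variable the standardized shape depends only on the proportions in each category.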

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Hector Maletta
Sent: Sunday, September 11, 2005 2:30 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: data transformation bibliografical sources

Jorge,

Normality and homogeneous variance are possible attributes of your data, and they may or may not have them. No data transformation by itself will give them what they do not have.

You can of course transform your variables into something else that is more similar to what you desire (e.g. the logarithm of a variable may have a distribution that looks more "normal" than the original variable), and there is always the possibility of finding a mathematical formula, however abstruse, able to achieve that. But in scientific terms this would be meaningless unless you have a theory whereby your variable behaves in ways related to that particular mathematical function. For instance, if people react more to the PROPORTION by which their incomes grow than to the AMOUNT of the increase, so that an additional $1000 means different things to a billionaire and to you and me, then the logarithm of income may find a place in your analysis, because a given difference in logarithms corresponds to a given proportional difference in the original variable. If you do not have theory or evidence of this kind, using logarithms makes as much sense as using, say, the cosine or the cube root or a 17th-degree polynomial of your variable.
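The property of logarithms mentioned above can be shown with a small sketch. The incomes and the 10% raise are invented for illustration, and `log_difference` is a hypothetical helper name.

```python
import math


def log_difference(before: float, after: float) -> float:
    """Difference in natural logs between two positive amounts."""
    return math.log(after) - math.log(before)


# Hypothetical incomes: a modest earner and a billionaire.
incomes = [20_000.0, 1_000_000_000.0]

for income in incomes:
    raised = income * 1.10  # the same 10% proportional increase for both
    print(
        f"income {income:>16,.0f}  "
        f"raise in dollars {raised - income:>14,.0f}  "
        f"raise in logs {log_difference(income, raised):.6f}"
    )
```

The dollar amounts of the two raises differ enormously, while the difference in logarithms is the same for both, namely log(1.10).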

Besides, remember previous caveats in this forum to the effect that it is not variables, but errors of estimation, that have to be normal, with homogeneous variances, for standard statistical models (like regression) to apply.
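A simulated example may make this caveat concrete. Everything in the sketch is assumed for illustration: the two-group predictor, the coefficients, the seed, and the use of excess kurtosis as a rough indicator of non-normality (it is about 0 for a normal distribution and strongly negative for a two-humped one). It is not a formal test of normality.

```python
import random
from statistics import mean, pstdev


def excess_kurtosis(values):
    """Standardized fourth moment minus 3 (about 0 for normal data)."""
    m = mean(values)
    s = pstdev(values)
    return mean(((v - m) / s) ** 4 for v in values) - 3.0


def ols_residuals(x, y):
    """Residuals from a simple least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical predictor: two equal groups, far apart.
x = [0.0 if i % 2 == 0 else 10.0 for i in range(2000)]
# Outcome: linear in x plus normally distributed error.
y = [2.0 + 1.0 * a + rng.gauss(0.0, 1.0) for a in x]

residuals = ols_residuals(x, y)
print(f"excess kurtosis of y:         {excess_kurtosis(y):.3f}")
print(f"excess kurtosis of residuals: {excess_kurtosis(residuals):.3f}")
```

Here the outcome variable itself is clearly not normal (it has two humps), yet the regression errors are, which is the condition the standard model actually requires.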

Hector

> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> On Behalf Of Jorge Camacho
> Sent: Sunday, September 11, 2005 2:03 PM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: data transformation bibliografical sources
>
> Dear All:
>
> I am looking for a good review or bibliographical source (in
> electronic format if possible) about data transformation in
> order to reach normality, homogeneous variances, etc. Most
> textbooks have very few pages on this. I would appreciate
> any support on this.
>
> Thanks in advance.
>
> Jorge
>
> --
> Jorge Camacho Sandoval, Ph. D.
> Biostatistics - Animal Genetic Improvement
> P. O. Box 1960 - 4050, Alajuela, Costa Rica
> Tel. (506)4410487
> Fax. (506)4400575
> e-mail: jcamacho@ice.co.cr or jorge.camacho.s@gmail.com

