LISTSERV at the University of Georgia
Date:         Wed, 28 Sep 2005 08:31:07 -0400
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Art Kendall <>
Organization: Social Research Consultants
Subject:      Re: Logarithmic transformation of not normal data
Comments: To: razan_mikwar@YAHOO.COM
In-Reply-To:  <S114778AbVIOLde/>
Content-Type: text/plain; charset=us-ascii; format=flowed

Depending on the number of cases you have and the subject matter area, a multiple correlation of .55 (R**2 = .3) could be suspiciously high. What are your variables? How are they measured? How many cases do you have? How were they selected?

Art
Social Research Consultants
University Park, MD USA
Inside the Washington, DC beltway.
(301) 864-5570

Hector Maletta wrote:

>Razan,
>see my comments below.
>Hector
>
> _____
>
>From: Razan Mikwar []
>Sent: Thursday, September 15, 2005 2:30 AM
>To: Hector Maletta
>Subject: RE: Logarithmic transformation of not normal data
>
>Hi Mr. Hector,
>
>First of all, thank you very much for your quick response.
>Secondly:
>1- I don't want a high correlation coefficient; what I need to make higher is
>the coefficient of determination (R square). As for the residuals, I've already
>tested their normality and they are normal.
>
>R2 is the squared correlation coefficient, so both are essentially the same.
>If residuals are normal, nothing is necessary to get more normal residuals,
>such as a log transformation.
>
>2- I don't know what you mean by the 2nd point, but I've tested that there
>is no correlation between the independent variables, i.e. there is no
>multicollinearity, and the scatter between the DV and each IV is not
>U-shaped.
>
>What I mean in my second point is that a low R or R2 may be due to either:
>the absence of any relationship between your DV and the set of IV, or the
>presence of a relationship that is not linear. This can be ascertained by
>plotting predicted and observed values. A formless cloud is the first case;
>a regular but not linear shape, e.g. a cloud in the shape of a U, is the
>second case. In the latter situation you may transform some of the variables
>to get a linear, instead of non-linear, relationship, or you may try
>non-linear regression or curve fitting.
>
>3 & 4- I'm trying hard not to use a model other than linear, in order not
>to have to test other assumptions; that's why I'm trying to find a way to
>solve the problem. Moreover, I don't know how to detect which model would fit.
>
>Models are based on theory. Trying blindly anything that fits is not good
>advice.
>
>5- As I mentioned before, I've tested collinearity, but there is one
>assumption that I wasn't able to test: that residuals and independent
>variables are independent of each other, because I don't have the residuals
>as a separate variable.
>
>Collinearity might have been one problem, but you evidently do not have it.
>Perhaps it is simply that your IV do not predict the DV well. That happens.
>
>Razan
>
>Razan,
>
>1. Your variables do not need to be normally distributed in order to use
>regression, and even less so in order to get a high correlation coefficient.
>You are confused by the fact that linear regression requires that residuals,
>i.e. random errors of prediction (differences between predicted and observed
>values), have a normal distribution on both sides of the regression line.
>
>2. A low or near zero linear [multiple] correlation coefficient may be due
>to (a) the absence of any systematic relationship between your IV and DV, or
>(b) the existence of a relationship which is non linear. As an example of
>(b), if your scatterplot shows a cloud of points with the shape of a U,
>there would possibly be a quadratic relationship, but the linear coefficient
>may be zero.
>
>3. The method of least squares to estimate regression functions is based on
>the assumption of a linear relationship between the variables involved. When
>the relationship is not linear there are two ways to go: (i) identify the
>non-linear function linking the variables, and transform it in some way that
>yields a linear function, then apply least squares linear regression; or (ii)
>approximate a non linear function by means of non-linear regression or
>curve-fitting, which do not use the least squares algorithm. Some non linear
>functions are amenable to linearization, some are not. For instance, a
>quadratic equation like y=a+bX+cX^2 can be linearized if you define a new
>variable Z=X^2, and use the linear equation y=a+bX+cZ; likewise the equation
>y=aX^b can be linearized by taking logarithms as log y=log a + b(log X).
>
>4. The fact that a certain mathematical function fits your data is no great
>deal. You can always find some function that does that. The trick is finding
>a function for which you have a theoretical explanation. So it is not
>advisable to go around blindly trying different mathematical functions until
>any of them "fits". In fact, you may find several, perhaps an infinite
>number of functions that reasonably fit the data, and that is arguably worse
>than not having any.
>
>5. If no reasonable function fits the shape of the data, perhaps your data
>just show little relationship at all between the variables...
>
>Hector
>
>>-----Original Message-----
>>From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Razan
>>Sent: Monday, September 12, 2005 11:04 PM
>>To: SPSSX-L@LISTSERV.UGA.EDU
>>Subject: Logarithmic transformation of not normal data
>>
>>Hi,
>>
>>I've made a multiple linear regression using SPSS with one dependent
>>variable and two independent variables, and all assumptions were satisfied,
>>but R square is very low, about 0.3, so I think that is because my
>>variables are not normally distributed. That's why I was thinking about
>>transforming my data using a logarithmic transformation to a normal
>>distribution and repeating the regression, but I don't know how to
>>transform them.
>>And do I have to test any other assumptions after applying the
>>transformation?
>>
>>Thanks
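
The two linearizations in point 3 of the quoted message (Z=X^2 for the quadratic, logarithms for y=aX^b) can be illustrated with a small sketch. This is not from the thread: it uses plain Python rather than SPSS, the data are invented and noise-free, and the helper names `solve` and `ols` are hypothetical. It only shows that a U-shaped relationship gives a near-zero linear R squared, and that ordinary least squares recovers the coefficients once the transformed variable is used.

```python
import math

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(columns, ys):
    """Least-squares fit of y = b0 + b1*col1 + ...; returns (coefficients, r_squared)."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]  # prepend the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coefs = solve(xtx, xty)  # normal equations (X'X) b = X'y
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return coefs, 1 - ss_res / ss_tot

# Invented U-shaped data: y = 2 + 0.5*X + 3*X^2 on a symmetric grid from -5 to 5.
xs = [x / 2 for x in range(-10, 11)]
ys = [2 + 0.5 * x + 3 * x ** 2 for x in xs]

# A straight line in X alone explains almost nothing.
_, r2_linear = ols([xs], ys)

# Define Z = X^2 and fit the linear equation y = a + b*X + c*Z.
zs = [x ** 2 for x in xs]
coefs_quad, r2_quad = ols([xs, zs], ys)

# Invented power-law data: y = 2 * X^1.5, linearized as log y = log a + b*log X.
xs_pos = [float(x) for x in range(1, 11)]
ys_pow = [2 * x ** 1.5 for x in xs_pos]
(log_a, b_hat), _ = ols([[math.log(x) for x in xs_pos]],
                        [math.log(y) for y in ys_pow])
a_hat = math.exp(log_a)

print("R2, line in X only:", r2_linear)
print("coefficients a, b, c with Z = X^2:", coefs_quad, "R2:", r2_quad)
print("power law estimates a, b:", a_hat, b_hat)
```

With real, noisy data the fits would of course not be exact, and, as point 4 of the quoted message stresses, a good fit alone does not justify a functional form without a theoretical reason for it.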
