Not out of my head: I just remembered the piece of information but not the
precise source. Unfortunately I am now travelling and have little chance to
look in the books I suspect have the answer. However, it is quite common
knowledge that at least 30-50 cases are needed for a normal distribution to
take shape.
Hector
-----Original Message-----
From: Whanger, J. Mr. CTR [mailto:James.Whanger@med.navy.mil]
Sent: 01 October 2009 10:53
To: Hector Maletta; SPSSX-L@LISTSERV.UGA.EDU
Subject: RE: Re: Multiple Linear Regression vs a series of simple linear
regression on the presence of multicollinearity
Hector,
Is there any chance you have a citation for the Monte Carlo experiments
you mentioned?
Thanks,
Jim
-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Hector Maletta
Sent: Wednesday, September 30, 2009 4:25 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Multiple Linear Regression vs a series of simple linear
regression on the presence of multicollinearity
In addition to Bruce's comment:
1. In multiple regression, each coefficient tells you by how much the DV
changes for a unit change in one IV, keeping the other IVs constant.
Since IVs are intercorrelated, it is no surprise that once you keep 99
of them constant, an increase in the 100th actually decreases the DV.
2. Having N=100 limits the number of IVs you can use. The old rule of
thumb is that you should never attempt anything with fewer than 10 cases
per variable. You are above that threshold (5 predictors with 100 cases
= 20 cases per predictor), but even that threshold is far too low: ten
(or 20) cases per variable leave you with large margins of error. Linear
regression assumes that errors are normally distributed, but Monte Carlo
sampling experiments suggest that errors are likely not to be normally
distributed when sample size is less than 30-50 cases (per variable).
This would imply that you cannot use more than 2-3 independent variables
with 100 cases. Of course the final result's significance would also
depend on the coefficient of variation of each variable (SD/mean), their
intercorrelations and other things, but those figures suggest you had
better get a larger sample if you are attempting such a regression exercise.
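As a rough illustration of that 30-50 rule of thumb, here is a toy Monte Carlo sketch of my own in Python (not the experiments referred to above): draw repeated samples from a skewed population and look at how skewed the distribution of the sample average still is at each n.

```python
import numpy as np

rng = np.random.default_rng(42)

def mean_skewness(n, reps=20000):
    # Draw `reps` samples of size n from a skewed (exponential) population
    # and measure the skewness of the resulting sample means.
    means = rng.exponential(size=(reps, n)).mean(axis=1)
    z = (means - means.mean()) / means.std()
    return float(np.mean(z ** 3))

for n in (5, 20, 50):
    print(f"n={n:2d}  skewness of sample means: {mean_skewness(n):.2f}")
# In theory the skewness of the mean is 2/sqrt(n): roughly 0.89, 0.45 and
# 0.28 here, i.e. clearly skewed at n=5 and approaching normality by n=50.
```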
Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Bruce Weaver
Sent: 30 September 2009 16:54
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Multiple Linear Regression vs a series of simple linear
regression on the presence of multicollinearity
eins wrote:
>
> I am conducting a multiple linear regression with 5 predictors; all
> variables are continuous and n=100. Before doing the linear regression
> analysis, I first did a simple correlation analysis and found that all
> the predictors have positive and significant correlations with the
> outcome variable. Some predictors are highly correlated with each other.
> Surprisingly, when I did the multiple linear regression, two of the
> predictors have negative B coefficients, Beta coefficients less than
> -1.0, VIF greater than 10, an eigenvalue of zero, and a condition index
> above 30. These are indications of a multicollinearity problem.
>
> Is it a reasonable alternative to do simple linear regression, one
> predictor at a time, instead of multiple regression? If this
> alternative is wrong, what makes it wrong? What information would be
> lost in doing a series of simple regressions rather than a multiple
> regression?
>
> Thank you.
> Eins
>
The negative coefficients for a couple of variables suggest that you
have one or more "suppressor variables". If you Google that term, you
should find lots of hits, including some notes by textbook author David
Howell.
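The sign flip is easy to reproduce with simulated data. Here is a minimal numpy sketch (hypothetical data of my own, not Eins's): both predictors correlate positively with the outcome, yet one multiple-regression coefficient comes out negative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)             # x2 is largely a copy of x1
y = 3.0 * x1 - 1.0 * x2 + 0.5 * rng.normal(size=n)   # true partial effect of x2 < 0

# Zero-order correlations with y are both strongly positive...
r1 = np.corrcoef(x1, y)[0, 1]
r2 = np.corrcoef(x2, y)[0, 1]

# ...but multiple regression, holding the other predictor constant,
# recovers the negative partial effect of x2.
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"corr(x1,y)={r1:.2f}  corr(x2,y)={r2:.2f}")
print(f"b1={b[1]:.2f}  b2={b[2]:.2f}")               # b2 comes out negative
```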
Regarding your second question, if you run 5 simple linear regressions,
you'll have no control for confounding. The fact that you were running
a multiple regression model in the first place suggests that this is not
what you want. If the excessive multicollinearity is due to one
variable, I would try just removing it.
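To see which variable is the culprit, the VIF and condition index that Eins quotes can be computed by hand. A simplified numpy sketch with made-up data (SPSS's REGRESSION /STATISTICS=COLLIN output scales the diagnostics somewhat differently, so treat this as an approximation):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 5))
X[:, 4] = X[:, 0] + 0.05 * rng.normal(size=n)  # predictor 5 nearly duplicates predictor 1

def vif(X, j):
    # VIF_j = 1 / (1 - R^2) from regressing predictor j on all the others.
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    resid = X[:, j] - others @ np.linalg.lstsq(others, X[:, j], rcond=None)[0]
    r2 = 1.0 - resid.var() / X[:, j].var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
print("VIFs:", np.round(vifs, 1))              # predictors 1 and 5 blow past 10

# Condition index: sqrt(max/min eigenvalue) of X'X after scaling each
# column to unit length (no intercept column here, unlike SPSS).
Xs = X / np.linalg.norm(X, axis=0)
eig = np.linalg.eigvalsh(Xs.T @ Xs)
ci = float(np.sqrt(eig.max() / eig.min()))
print(f"condition index: {ci:.1f}")            # well above the usual 30 danger line
```

Dropping the near-duplicate column and recomputing brings every VIF back near 1, which is the pattern Bruce's suggestion exploits.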


Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
NOTE: My Hotmail account is not monitored regularly.
To send me an email, please use the address shown above.

View this message in context:
http://www.nabble.com/Multiple-Linear-Regression-vs-a-series-of-simple-linear-regression-on-the-presence-of-multicollinearity-tp25678823p25687751.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L. For a list
of commands to manage subscriptions, send the command INFO REFCARD
