Date: Fri, 20 Apr 2001 16:10:23 -0700
Reply-To: mradramz@WELLSFARGO.COM
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ramzi Mrad <mradramz@WELLSFARGO.COM>
Subject: multicollinearity: sometimes it doesn't matter
Content-Type: text/plain
Sorry for the belated comment on this thread, but I think it's worth pointing
out that one does not need to be too worried about multicollinearity if the
purpose of the (logistic) regression is prediction. Multicollinearity can
make the estimates of the coefficients unstable, but it will not affect the
predictive ability of the model, as long as the data being scored show the
same pattern of correlation among the predictors as the data used to fit the
model. So, for example, if you are developing a model and using it to score
other data, multicollinearity won't screw up your scores. If, on the other
hand, you are interested in the estimates of the coefficients themselves
(epidemiologists calculating odds ratios, for example), you should be very
worried about multicollinearity. Having said that, if you have
multicollinearity (sounds like a disease) and know about it, you should of
course try to get rid of it.
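
A small simulation can illustrate the point. This is only a sketch under
stated assumptions: it uses ordinary least squares in Python/NumPy rather than
logistic regression in SAS (to keep the fit to a single call), and the sample
sizes, noise levels, and names (simulate, coef_sd, score_sd) are made up for
illustration. The model is refit on 50 fresh samples in which x2 is almost a
copy of x1, and each fit scores the same new data, which follow the same
collinear pattern.

```python
import numpy as np

def simulate(rng, n=200):
    """Draw one sample with two nearly collinear predictors (illustrative setup)."""
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)  # x2 is almost a copy of x1
    y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)
    return np.column_stack([np.ones(n), x1, x2]), y

rng = np.random.default_rng(0)

# Fixed scoring data with the same collinear pattern as the training data.
X_new, _ = simulate(rng, n=20)

coefs, scores = [], []
for _ in range(50):
    X, y = simulate(rng)                          # fresh training sample
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
    coefs.append(beta)
    scores.append(X_new @ beta)                   # score the same new rows each time

coef_sd = np.std(coefs, axis=0)           # spread of each coefficient across refits
score_sd = np.std(scores, axis=0).max()   # largest spread of any single score

print("SD of (intercept, b1, b2):", coef_sd.round(2))
print("largest SD of any score:", round(float(score_sd), 2))
```

In this setup the two slope estimates should swing widely from one refit to
the next while the scores barely move, because the fit pins down the sum of
the two slopes well even though it cannot separate them.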
Cheers,
Ramzi Mrad
> -----Original Message-----
> From: Peter Flom [SMTP:peter.flom@NDRI.ORG]
> Sent: Wednesday, April 18, 2001 6:03 AM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Collinearity vs. multicollinearity: an ignorant
> question
>
> Although some people distinguish them as Collinearity = two variables,
> Multicollinearity = more than two, I don't think this distinction is
> particularly useful (does it refer to the total number of independent
> variables? the number involved in collinearities? the number of
> collinearities? or what?), and most people (including Belsley) treat them
> as synonyms.
>
> And thanks for the compliment below.
>
> Peter
>
> >>> Philip Gallagher <PGall9898@AOL.COM> 04/18/01 08:53AM >>>
>
> What follows is not intended to be a smart-aleck
> comment; it stems from true ignorance on my part,
> ignorance that I would like to remedy.
>
> MULTICOLLINEARITY vs. COLLINEARITY
> (I dimly remember some comments about this, probably
> in the 1980s, possibly on SAS-L, possibly STAT-L. I
> didn't follow the thread at the time and am thus still
> ignorant.) When does one use the word "collinearity"
> and when "multicollinearity"? As I see it, if one has
> only two variables in mind one is obliged to speak
> only of collinearity, not multicollinearity. And, if one
> is considering more than two variables, one may speak
> of either collinearity or multicollinearity. (Which makes
> "multicollinearity" seem a bit redundant - hey, things
> are either co-linear or they aren't.)
>
> At one time I thought that "multicollinearity" was used
> primarily by the social scientists, but I now have the
> nagging feeling that I was naive about that. Naive,
> hell, wrong.
>
> BTW, I think Peter Flom's suggestion for using
> PROC REG to investigate collinearity even though
> the eventual analysis is to be logistic " ... Since
> collinearity is a relationship among the independent
> variables ..." is particularly illuminating.
>
> Anyway, is there actually a hard and fast rule that
> tells you when you must use "multicollinearity"?
>
> Phil Gallagher
> Nantucket
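
The point quoted above, that collinearity is a relationship among the
independent variables, means a diagnostic can be computed from the predictors
alone, whatever the eventual model. As a rough sketch (Python/NumPy rather
than PROC REG; the vif helper and the simulated columns a, b, c, d are
hypothetical), the variance inflation factor of a predictor is 1 / (1 - R^2),
where R^2 comes from regressing that predictor on all the others:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (X has no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all the other columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + b + 0.1 * rng.normal(size=500)  # nearly a linear combination of a and b
d = rng.normal(size=500)                # unrelated to the others

vifs = vif(np.column_stack([a, b, c, d]))
print(vifs.round(1))
```

Note that no response variable appears anywhere in the calculation. In this
example a, b, and c should all show large values even though no two of them
are close copies of each other, while d should stay near 1.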