I've seen excellent advice in the two Replies so far.
Please, do read some of the literature on "suppressor variables".
In addition -- For the data at hand, consider which variables might
be acting to "suppress" the contribution of "Complications" (in this
instance, to actually reverse its sign). This can also be considered
a form of "confounding". If you find the source of the confounding,
you should next try to re-score your predictor variables so that they
directly measure the predictive influence of the logically combined
variables.
For instance -- If you were also using something like "Length
of Hospitalization" as a predictor, it could be that the people
who have the worst QOL on followup are the ones who had a
long hospital stay and did *not* have Complications that readily
explained it. Therefore, once Length of stay is in the model,
Complications enters with the opposite of the expected sign
(positive here, since higher scores mean better QOL).
By logical analysis, you might be able to break that Length of stay
into parts: (a) Expected (minimum), (b) Extra days, due to surgical
complications, and (c) Extra, due to non-surgical complications.
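
To make that concrete, here is a small simulation sketch in Python with
NumPy. Everything in it is made up for illustration (the variable names,
the 5-day minimum stay, the effect sizes of -1 and -4 points per day);
none of it comes from the poster's data. It only shows that a predictor
which truly lowers QOL can pick up a positive coefficient once total
length of stay is in the model, and that splitting the stay into parts
(b) and (c) gives back coefficients with sensible signs.

```python
import numpy as np


def ols(y, *columns):
    """Return OLS coefficients [intercept, slope_1, ...] by least squares."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process (all numbers are assumptions).
# The toy QOL score is not clipped to the 0-100 range.
comp = rng.binomial(1, 0.3, size=n)              # surgical complication, 0/1
surg_days = comp * rng.integers(3, 8, size=n)    # (b) extra days, surgical
other_days = rng.exponential(2.0, size=n)        # (c) extra days, non-surgical
los = 5 + surg_days + other_days                 # (a) minimum stay of 5 days
qol = 80 - 1.0 * surg_days - 4.0 * other_days + rng.normal(0, 5, size=n)

b_alone = ols(qol, comp)                   # Complications as the only predictor
b_joint = ols(qol, comp, los)              # Complications plus total stay
b_parts = ols(qol, surg_days, other_days)  # stay broken into parts (b) and (c)

print("comp alone:        ", b_alone[1])
print("comp, given LOS:   ", b_joint[1], " LOS:", b_joint[2])
print("surg / other days: ", b_parts[1], b_parts[2])
```

In this toy setup Complications alone has a negative slope, but next to
Length of stay its sign flips: among patients with equally long stays,
those without a complication must have had more of the (more damaging)
non-surgical extra days.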
Also: The effect of (c) might be non-linear, such that having one,
two or three days could be increasingly bad, but having seven days
is not much worse than having three. - This sort of measurement
non-linearity is another source of apparent confounding, which should
be covered in some of the literature.
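
A minimal sketch of the non-linearity point, again in Python with NumPy
and again with invented numbers: the cap at three days simply mirrors
the example above, and the 6-points-per-day effect is an assumption.
The idea is to compare a straight-line fit on raw days with a fit on a
re-scored predictor, min(days, 3).

```python
import numpy as np


def fit(x, y):
    """Simple OLS of y on x; returns (coefficients, residual sum of squares)."""
    X = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return coef, rss


rng = np.random.default_rng(1)
n = 2000

# Hypothetical: QOL drops 6 points per extra day, but only up to 3 days.
days = rng.integers(0, 8, size=n)     # 0..7 extra days
capped = np.minimum(days, 3)          # re-scored predictor
qol = 80 - 6.0 * capped + rng.normal(0, 5, size=n)

coef_lin, rss_lin = fit(days, qol)    # straight line on raw days
coef_cap, rss_cap = fit(capped, qol)  # straight line on capped days

print("raw days:    slope", coef_lin[1], " RSS", rss_lin)
print("capped days: slope", coef_cap[1], " RSS", rss_cap)
```

The raw-days slope comes out much flatter than the true per-day effect
and the fit is worse. In a model with many correlated predictors, that
kind of misfit is one of the things other variables can absorb, which
is where odd-looking coefficients may come from.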
Date: Wed, 21 Sep 2011 14:30:53 +0200
Subject: output from linear regression
I am having some trouble understanding the output from a
multiple linear regression…
There are some 25 predictor variables, and the dependent variable
(from medical research) represents ‘quality of life’ (between 0
and 100 points; more points means better quality of life after
the operation).
I chose the backward procedure, so after n steps only some
medical predictors with significant influence remained…
Now the problem: some predictors that should obviously have a
negative influence on my dependent variable, ‘quality of life’,
for example ‘surgery complication’ (0: no, 1: yes) or ‘tumor length’,
in fact have a significant positive influence,
e.g. Beta = .253, p < .001 for ‘tumor length’.
It can’t be logical that people with big tumors, or with more
surgical complications, have significantly better
‘quality of life’ after the operation (one would expect
Beta = -.253, p < .001 instead).
On the other hand, other predictor variables have significant negative
or positive influences that can be logically well explained.
Could it be that the backward procedure in linear regression
involves some internal steps that make the signs
(blank for plus, ‘-’ for minus) of the Beta values
in the output irrelevant?
How else could this be explained?