Date: Tue, 10 Jul 2007 05:07:09 -0700
Reply-To: Paige Miller <paige.miller@KODAK.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Paige Miller <paige.miller@KODAK.COM>
Organization: http://groups.google.com
Subject: Re: Can I do this multiplr regression?
In-Reply-To: <BAY103F16165A33CAB1BFEDBF6FACB0050@phx.gbl>
Content-Type: text/plain; charset="us-ascii"

On Jul 10, 2:28 am, davidlcass...@MSN.COM (David L Cassell) wrote:
> x...@LSU.EDU wrote:
>
> >A colleague asked me if he can do this multiple regression. He first
> >calculated differences from X1 and X2 and used it as the dependent variable
> >(Y). Then he wanted to run regression Y = X0 X1 X3 X4 where Y is the
> >difference between X1 and X2, and he wanted to put X1 as one predictor. X0,
> >X3, X4 are independent from Y.
>
> >Can this multiple regression be done? Which assumption is violated?
>
> This, in itself, does not violate any regression assumptions. Of course:
>
> [1] regression assumptions may be violated that have nothing to do with
> using Y=X1-X2.
>
> and
>
> [2] there may be serious logical violations in doing this.
>
> #2 depends on the data, the meaning of the variables, the underlying
> theory involved, the scope of the data, etc. It may be entirely feasible
> to perform the regression, even though scientifically it could be a hideous
> nightmare. I think this is a subject-matter problem more than a
> statistical one.
I would add that since Y is a function of X1 (it is the difference between
X1 and X2), putting X1 on the right-hand side as a predictor is very
problematic. Specifically, Y may be highly correlated with X1 by
construction. That in itself isn't a problem. The problem is that if the
large regression doesn't improve the fit over the regression of Y on only
X1 and an intercept, then the large regression is useless: it is simply
explaining the a priori correlation between Y and X1.
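
To make the point concrete, here is a rough simulation sketch (Python,
standard library only; this is not from the original question). It assumes,
purely for illustration, that X0 through X4 are independent standard normal
variables with no real relationship to one another. Under that assumption
the correlation between Y = X1 - X2 and X1 is 1/sqrt(2), about 0.71, by
construction alone. The helper names pearson and r_squared are my own.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def r_squared(y, predictors):
    """R^2 of an OLS fit of y on an intercept plus the given predictors."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    k = len(cols)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in cols]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in cols]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [vr - f * vi for vr, vi in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b * c[t] for b, c in zip(beta, cols)) for t in range(n)]
    my = sum(y) / n
    ss_res = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    ss_tot = sum((yt - my) ** 2 for yt in y)
    return 1.0 - ss_res / ss_tot

# Assumption: five unrelated standard normal variables (illustration only).
random.seed(0)
n = 5000
x0, x1, x2, x3, x4 = ([random.gauss(0, 1) for _ in range(n)] for _ in range(5))
y = [u - v for u, v in zip(x1, x2)]  # Y is the difference X1 - X2

corr = pearson(y, x1)                      # a priori correlation of Y with X1
r2_small = r_squared(y, [x1])              # Y on intercept and X1 only
r2_large = r_squared(y, [x0, x1, x3, x4])  # the "large" regression

print("corr(Y, X1)      =", round(corr, 3))
print("R^2, X1 only     =", round(r2_small, 3))
print("R^2, X0 X1 X3 X4 =", round(r2_large, 3))
```

If the two R^2 values come out nearly identical, as they should here since
X0, X3 and X4 are pure noise in this setup, the larger model has added
nothing beyond the built-in relationship between Y and X1. With real data
the same comparison (X1-only fit versus the full fit) is the one to look at.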

Paige Miller
paige\dot\miller \at\ kodak\dot\com
