Date: Thu, 17 Jan 2008 14:29:15 -0600
Reply-To: "data _null_," <datanull@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "data _null_," <datanull@GMAIL.COM>
Subject: Re: Weird result in PROC GLM
In-Reply-To: <7407591.1200594296930.JavaMail.root@mswamui-chipeau.atl.sa.earthlink.net>
Content-Type: text/plain; charset=ISO-8859-1
Perhaps this example will help. The means depend on the coding, 0
vs -1. In PROC REG the intercepts are different. The intercept and
the mean are related, yes?
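data test;
   /* coding is the value given to the "other" group: -1 or 0 */
   do coding = -1, 0;
      do year = 1989 to 2007;
         lateYear  = ifn(year > 1997, 1, coding);
         earlyYear = ifn(year < 1998, 1, coding);
         output;
      end;
   end;
run;

proc print data=test;
run;

/* The means of the indicator variables change with the coding */
proc means data=test;
   class coding;
run;

/* Fit both models for each coding; the estimates are saved in EST */
proc reg data=test noprint outest=est;
   by coding notsorted;
   late:  model year = lateYear;
   early: model year = earlyYear;
run;

proc print data=est;
run;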
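
To see how the intercept and the mean are related, the group means of
YEAR can be put next to the intercepts in EST. This is only a sketch
that reuses the TEST data set from above; the data set name GRPMEANS and
the variable name MEAN_YEAR are made up for illustration. With one
two-level predictor the fitted line passes through both group means, so
the intercept is the predicted value where the predictor equals 0: with
0/1 coding that is the mean of the group coded 0, and with -1/1 coding
it is the midpoint of the two group means.

```sas
/* Group means of YEAR for each coding and each level of lateYear.  */
/* GRPMEANS and MEAN_YEAR are illustrative names, not from the post. */
proc means data=test nway noprint;
   class coding lateYear;
   var year;
   output out=grpmeans mean=mean_year;
run;

/* Early years 1989-1997 average 1993; late years 1998-2007 average 2002.5. */
/* Compare these with the Intercept column printed from EST:                 */
/*   coding =  0, model late : intercept = 1993 (mean of the group coded 0)  */
/*   coding = -1, model late : intercept = (1993 + 2002.5)/2 = 1997.75       */
proc print data=grpmeans;
run;
```

The same reasoning applies to the early model: with 0 coding its
intercept should be the late-year mean, 2002.5, and with -1 coding it
should again be the midpoint, 1997.75.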
On Jan 17, 2008 12:24 PM, Peter Flom <peterflomconsulting@mindspring.com> wrote:
> Hello
>
> We have a data set with 3 variables: Year, percent HIV and research estimate.
>
> At the request of our boss, we converted year into a dichotomous variable (not my choice of analysis...)
>
> My colleague did this two ways:
>
> 1) Lateyear = 1 if year > 1997, else 0
> and
> 2) earlyyear = 1 if year < 1998, else 0
>
> as a learning exercise we then ran both
>
> PROC GLM;
> MODEL RESEARCH_ESTIMATE = PERCENT_HIV LATEYEAR PERCENT_HIV*LATEYEAR;
> RUN;
>
> and
>
> PROC GLM;
> MODEL RESEARCH_ESTIMATE = PERCENT_HIV EARLYYEAR PERCENT_HIV*EARLYYEAR;
> RUN;
>
> I expected these to be identical, except for a change of sign. Indeed, most of the results are identical: R-square, Type I SS, F value, etc. But the Type III SS results for PERCENT_HIV differ by a factor of 8; the parameter estimate for the intercept is different; the parameter estimate for PERCENT_HIV is different by a small amount; and the SEs for the intercept and PERCENT_HIV are different.
>
> Any ideas how this could happen?
>
> Thanks
>
> Peter
>