Date: Wed, 25 Apr 2007 19:32:39 -0700
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: about weighted least-squares estimation
In-Reply-To: <1177512597.671577.248740@n35g2000prd.googlegroups.com>
Content-Type: text/plain; format=flowed
shiling99@YAHOO.COM wrote:
>On Apr 24, 3:33 am, harry <xuqiyuan1...@gmail.com> wrote:
> > The experiment studied the effect of speed(x1), pressure (x2), and
> > distance (x3) on a printing machine's ability to apply coloring inks
> > on package labels. The following table summarizes the experimental
> > results.
> > i x1 x2 x3 yi1 yi2 yi3 average of yi si
> > 1 1 1 1 34 10 28 24 12.5
> > 2 0 1 1 115 116 130 120.3 8.4
> > 3 1 1 1 192 186 263 213.7 42.8
> > 4 1 0 1 82 88 88 86 3.7
> > 5 0 0 1 44 178 188 136.7 80.4
> > 6 1 0 1 322 350 350 340.7 16.2
> > 7 1 1 1 141 110 86 112.3 27.6
> > 8 0 1 1 259 251 259 256.3 4.6
> > 9 1 1 0 290 280 245 271.7 23.6
> > 10 1 1 0 81 81 81 81 0
> > 11 0 1 0 90 122 93 101.7 17.7
> > 12 1 1 0 319 376 376 357 32.9
> > 13 1 0 0 180 180 154 171.3 15
> > 14 0 0 0 372 372 372 372 0
> > 15 1 0 0 541 568 396 501.7 92.5
> > 16 1 1 0 288 192 312 264 63.5
> > 17 0 1 0 432 336 513 427 88.6
> > 18 1 1 0 713 725 754 730.7 21.1
> > 19 1 1 1 364 99 199 220.7 133.8
> > 20 0 1 1 232 221 266 239.7 23.5
> > 21 1 1 1 408 415 443 422 18.5
> > 22 1 0 1 182 233 182 199 29.4
> > 23 0 0 1 507 515 434 485.3 44.6
> > 24 1 0 1 846 535 640 673.7 158.2
> > 25 1 1 1 236 126 168 176.7 55.5
> > 26 0 1 1 660 440 403 501 138.9
> > 27 1 1 1 878 991 1161 1010 142.5
> >
> > Since the data are heteroscedastic, weighted least squares
> > estimation is preferred for constructing a linear regression model.
> > But using the sample variances directly as the basis for the weights
> > may not be the best choice.
> > How can I fit a linear model to an appropriate transformation of the
> > sample variances and thus develop more appropriate weights?
>
>In a regression model such as
> y=a+bx+err for i=1,2,3,...,n
>
>heteroscedasticity, if it exists, is defined for err. If your
>regression residual=y-(ahat+bhat*x) shows heteroscedasticity, then
>GLM should be applied. It is conceptually incorrect to judge
>it from the ys in your data.
>
>It seems to me that you have repeated measures in your data as given
>
> x1 =1,x2=1,x3=1
>
>you have three measures of y ( 34 10 28 ).
>
>If you believe that these three come from the same distribution,
>while observations at different xs come from different distributions,
>then proc mixed with a repeated statement should work for you.
>
>If you really want weighted OLS, then proc model will do. You should
>look up feasible generalized least squares (FGLS) in the literature.
>
>BTW your data may have problems at i=10,14. Those groups have no
>variation, so they will give a perfect fit.
>
>Here is a simulated data set and SAS program.
>
>HTH
>
>data t1;
> do x=1 to 3;
> i=x;
> do j=1 to 50;
> y=5+2*x+x*rannor(3450);
> output;
> end;
> end;
>run;
>
>proc reg data=t1;
> model y=x;
>run;
>quit;
>
>
>data t2;
> set t1(where=(i=1) rename=(y=y1 x=x1));
> set t1(where=(i=2) rename=(y=y2 x=x2));
> set t1(where=(i=3) rename=(y=y3 x=x3));
>run;
>
>proc mixed data=t1;
> model y = x / s;
> repeated / grp=x r=13;
>run;
>
>proc model data=t2;
> y1=a+b*x1;
> y2=a+b*x2;
> y3=a+b*x3;
> fit y1 y2 y3 /fiml sur ;
> run;
> quit;

You make some good points here. But I'm interpreting the request
differently from you. (Note that I may be wrong here.)

I've seen data like this in SQC (Statistical Quality Control) before, and
in that case we're looking at 3 independent observations at each level
of a factorial design. So we don't have to worry (well, not a lot) about
the repeated measures issue. Unless the experiment was done badly.
Which happens way too often.
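
If the three values at each design point really are independent, one
way to get smoother weights than the raw sample variances is to model
log(s) as a linear function of the factors and weight by the fitted
variances. What follows is only a sketch under assumptions of mine: a
data set named INK with one row per observation (81 rows) and variables
i, x1, x2, x3 and y. None of those names come from the original post.
Runs 10 and 14 have s = 0, so log(s) is left missing for them and their
weights come from the fitted line.

```sas
/* Assumed input: data set INK, one row per observation, with
   variables i (run number), x1 x2 x3 (factor levels), y (response). */

/* Step 1: mean and standard deviation of y within each run */
proc means data=ink noprint nway;
  class i x1 x2 x3;
  var y;
  output out=runstats mean=ybar std=s;
run;

/* Step 2: log of the run standard deviation; runs with s = 0
   are left missing so they do not enter the variance model */
data runstats;
  set runstats;
  if s > 0 then log_s = log(s);
run;

/* Step 3: linear model for log(s); predicted values are also
   produced for the runs whose log_s is missing */
proc reg data=runstats noprint;
  model log_s = x1 x2 x3;
  output out=varfit p=pred_log_s;
run;
quit;

/* Step 4: weight = 1 / estimated variance, one weight per run */
data varfit;
  set varfit;
  w = 1 / exp(2 * pred_log_s);
  keep i w;
run;

/* Step 5: attach the weights to the raw data and fit by WLS */
proc sort data=ink out=ink_sorted;
  by i;
run;

data ink_w;
  merge ink_sorted varfit;
  by i;
run;

proc reg data=ink_w;
  model y = x1 x2 x3;
  weight w;
run;
quit;
```

Working on the log scale keeps the fitted variances positive, which is
the main reason for choosing that transformation in this sketch. Other
transformations of s or s**2 could be tried the same way.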
Of course, there's still a major SQC problem here, in that the need may
not be to model Y, but to model the response surface and find out
where Y is a max, or a min, or most stable, or least variable, or a
combination of some of these. Since we didn't get a decent answer on
*that* part, we may never know what the teacher was actually asking for.
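
If the real goal is the response surface rather than the regression
coefficients, PROC RSREG is one place to start. Again this is only a
sketch, and the data set INK (one row per observation, with variables
i, x1, x2, x3 and y) is my assumption, not something from the original
post. It fits a second-order surface to y, then separate surfaces to
the per-run mean and standard deviation so that location and spread can
be looked at side by side.

```sas
/* Assumed input: data set INK, one row per observation, with
   variables i (run number), x1 x2 x3 (factor levels), y (response). */

/* Second-order response surface for y; with replicated runs the
   LACKFIT option separates pure error from lack of fit */
proc rsreg data=ink;
  model y = x1 x2 x3 / lackfit;
  ridge max;   /* ridge of maximum response */
run;

/* Per-run mean and standard deviation of y */
proc means data=ink noprint nway;
  class i x1 x2 x3;
  var y;
  output out=runstats mean=ybar std=s;
run;

/* Separate surfaces for the run means and run standard deviations */
proc rsreg data=runstats;
  model ybar s = x1 x2 x3;
run;
```

Whether to look for a max, a min, or the most stable region depends on
what the assignment wanted; RIDGE MIN would replace RIDGE MAX if the
target is a minimum.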
David

David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
