**Date:** Wed, 6 Aug 1997 08:32:00 GMT
**Reply-To:** m.fahey@junk.unsw.edu.au
**Sender:** "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
**From:** Michael Fahey <solar.usnw.edu.au@ACSUSUN.ACSU.UNSW.EDU.AU>
**Organization:** University of New South Wales
**Subject:** Re: design matrix specification in proc iml
Yes, I'm replying to my own post. I wrote:

>I have data from a two way nested mixed design and would like to use proc
>iml to specify the model. In particular, it would be helpful if I could
>use SAS code to create the design matrices for the fixed and random effects
>from the raw data (or some transformation of it). This would be much easier
>than manually specifying the design matrices for a large dataset.

>For example, consider the following (simplified) data to illustrate the
>problem. To study a biomarker (Y) among people in different regions (R),
>2 or 3 subjects (S) were randomly sampled from 2 regions and 2 or 3
>measurements of the biomarker obtained from each individual. Regions is a
>fixed factor and subjects a random factor nested within regions. The data
>are unbalanced. The 13 observations have been entered into a database as
>follows:

>R S Y

>1 1 14
>1 1 14
>1 1 15
>1 2 12
>1 2 13
>2 1 17
>2 1 17
>2 2 16
>2 2 17
>2 2 17
>2 3 14
>2 3 16
>2 3 16

>Briefly, the 13x1 vector of observations is represented by the 3rd column
>in the array of data above. The 13x3 design matrix (X) for the fixed region
>effect requires a column of ones for the mean and two columns to specify the
>region effect. It would look like this:

>X:

>1 1 0
>1 1 0
>1 1 0
>1 1 0
>1 1 0
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1

>The 13x6 design matrix (Z) for the random nested effect due to subjects
>requires a row for each observation and a column for each possible
>combination of region.subject levels, including levels without data. The
>latter point is important and is illustrated by the 3rd column below, which
>contains only zeros, since there are only 2 subjects from region 1.

>Z:

>1 0 0 0 0 0
>1 0 0 0 0 0
>1 0 0 0 0 0
>0 1 0 0 0 0
>0 1 0 0 0 0
>0 0 0 1 0 0
>0 0 0 1 0 0
>0 0 0 0 1 0
>0 0 0 0 1 0
>0 0 0 0 1 0
>0 0 0 0 0 1
>0 0 0 0 0 1
>0 0 0 0 0 1

The following data step creates variables that represent the matrices
above. It is easily modified for larger designs, and the resulting data
set can be read into proc iml:

--

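/* Assumes data set REGIONS holds the variables region, subj and y */
/* (the columns R, S and Y in the listing above).                  */

data design (keep=y x1-x3 z1-z6);
   set regions;

   * design matrix for fixed effects;

   array x(3);           /* X is an N x (I+1) matrix: creates x1-x3. */

   x1=1;                 /* A column of ones for the mean. */
   do i=1 to 2;          /* One indicator column per region. */
      x(i+1)=0;
      if region=i then x(i+1)=1;
   end;

   * design matrix for random effects;

   array z(2,3);         /* Z is an N x (I*J) matrix: creates z1-z6, */
                         /* stored row by row (z(1,1)=z1, z(2,1)=z4). */

   do i=1 to 2;
      do j=1 to 3;       /* One indicator column per region.subject cell. */
         z(i,j)=0;
         if region=i and subj=j then z(i,j)=1;
      end;
   end;
run;

proc print data=design;
   var y x1-x3 z1-z6;
run;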

--
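
For completeness, here is a minimal sketch of reading the result into proc iml. It assumes the data set DESIGN created by the data step above already exists in the WORK library. The matrix names X, Z and y and the check variables (dims, rn, cn, obsPerCell) are my own choices for illustration, not anything required by IML.

```sas
proc iml;
   use design;                                 /* data set from the data step above */
   read all var {x1 x2 x3} into X;             /* fixed-effects design matrix  */
   read all var {z1 z2 z3 z4 z5 z6} into Z;    /* random-effects design matrix */
   read all var {y} into y;                    /* vector of observations       */
   close design;

   /* Quick sanity checks on what was read. */
   dims = (nrow(X) || ncol(X)) // (nrow(Z) || ncol(Z));
   rn = {"X", "Z"};
   cn = {"rows" "cols"};
   print dims[rowname=rn colname=cn];

   obsPerCell = Z[+, ];                        /* column sums of Z */
   print obsPerCell;
quit;
```

With the 13 observations listed above, the third column sum of Z should come out as zero, since there is no third subject in region 1.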

Replying to your own post is a bit like sending yourself a letter to cheer
yourself up... Am I so down? :-)

--
Michael Fahey | Tel: +61-2-9828-6000
Epidemiology Unit | Fax: +61-2-9828-6012
Liverpool Hospital | Remove "junk." to reply to this message.