Date: Wed, 6 Aug 1997 08:32:00 GMT
Reply-To: m.fahey@junk.unsw.edu.au
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Michael Fahey <solar.usnw.edu.au@ACSUSUN.ACSU.UNSW.EDU.AU>
Organization: University of New South Wales
Subject: Re: design matrix specification in proc iml
Yes, I'm replying to my own post and I wrote:
>I have data from a two-way nested mixed design and would like to use proc
>iml to specify the model. In particular, it would be helpful if I could
>use SAS code to create the design matrices for the fixed and random effects
>from the raw data (or some transformation of it). This would be much easier
>than manually specifying the design matrices for a large dataset.
>For example, consider the following (simplified) data to illustrate the
>problem. To study a biomarker (Y) among people in different regions (R),
>2 or 3 subjects (S) were randomly sampled from each of 2 regions, and 2 or 3
>measurements of the biomarker were obtained from each individual. Region is a
>fixed factor and subject is a random factor nested within region. The data
>are unbalanced. The 13 observations have been entered into a database as
>follows:
>R S Y
>1 1 14
>1 1 14
>1 1 15
>1 2 12
>1 2 13
>2 1 17
>2 1 17
>2 2 16
>2 2 17
>2 2 17
>2 3 14
>2 3 16
>2 3 16
>Briefly, the 13x1 vector of observations is represented by the 3rd column
>in the array of data above. The 13x3 design matrix (X) for the fixed region
>effect requires a column of ones for the mean and two columns to specify the
>region effect. It would look like this:
>X:
>1 1 0
>1 1 0
>1 1 0
>1 1 0
>1 1 0
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>1 0 1
>The 13x6 design matrix (Z) for the random nested effect due to subjects
>requires a row for each observation and a column for each possible
>combination of region.subject levels, including levels without data. The
>latter point is important and is illustrated by the 3rd column below, which
>contains only zeros, since there are only 2 subjects from region 1.
>Z:
>1 0 0 0 0 0
>1 0 0 0 0 0
>1 0 0 0 0 0
>0 1 0 0 0 0
>0 1 0 0 0 0
>0 0 0 1 0 0
>0 0 0 1 0 0
>0 0 0 0 1 0
>0 0 0 0 1 0
>0 0 0 0 1 0
>0 0 0 0 0 1
>0 0 0 0 0 1
>0 0 0 0 0 1
The following data step code creates variables that hold the columns of the
above matrices. It is easily modified for larger designs, and the resulting
data set can be read into proc iml:
--
data design (keep=y x1-x3 z1-z6);
   /* REGIONS must hold the variables region, subj and y */
   /* (R, S and Y in the listing above).                  */
   set regions;

   * design matrix for fixed effects;
   array x(3);          /* X is an Nx(I+1) matrix: creates x1-x3. */
   x1=1;                /* A column of ones for the mean. */
   do i=1 to 2;
      x(i+1)=0;
      if region=i then x(i+1)=1;
   end;

   * design matrix for random effects;
   array z(2,3);        /* Z is an Nx(I*J) matrix: creates z1-z6, row by row. */
   do i=1 to 2;
      do j=1 to 3;
         z(i,j)=0;
         if region=i and subj=j then z(i,j)=1;
      end;
   end;
run;

proc print data=design;
   var y x1-x3 z1-z6;
run;
--
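For completeness, here is a rough sketch of how the DESIGN data set might
then be read into proc iml and checked. It assumes the data step above has
been run, that REGIONS holds numeric variables region and subj, and that
there are 2 regions with at most 3 subjects each. The matrix names (nX, nZ,
X2, Z2, cell, same) are only illustrative. The second half rebuilds the
matrices inside IML as a cross-check; it is one possible approach, not
necessarily the best one.

```sas
proc iml;
   /* Read the variables written by the data step into matrices. */
   use design;
   read all var {x1 x2 x3} into X;            /* fixed-effects design  */
   read all var {z1 z2 z3 z4 z5 z6} into Z;   /* random-effects design */
   read all var {y} into y;                   /* response vector       */
   close design;

   /* Column sums give the number of observations behind each column. */
   nX = X[+,];
   nZ = Z[+,];
   print nX, nZ;

   /* Cross-check: rebuild both matrices from the raw variables. */
   use regions;
   read all var {region} into r;
   read all var {subj} into s;
   close regions;

   n  = nrow(r);
   X2 = j(n,1,1) || design(r);      /* ones column plus region dummies */

   /* Assumption: 2 regions x 3 subject slots = 6 cells, numbered 1-6. */
   /* DESIGN() would drop the empty cell, so compare against 1:6.      */
   cell = (r-1)*3 + s;
   Z2   = (repeat(cell,1,6) = repeat(1:6,n,1));

   same = all(X = X2) & all(Z = Z2);   /* 1 if both versions agree */
   print same;
quit;
```

With the 13 observations listed above, nX should come out as 13 5 8 and nZ
as 3 2 0 2 3 3. The zero in nZ is the empty region 1, subject 3 column.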
Replying to your own post is a bit like sending yourself a letter to cheer
yourself up... Am I so down? :-)
--
Michael Fahey | Tel: +61-2-9828-6000
Epidemiology Unit | Fax: +61-2-9828-6012
Liverpool Hospital | Remove "junk." to reply to this message.