Date: Tue, 11 Mar 2003 16:29:35 -0800
Reply-To: cassell.david@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: Power calculations for longitudinal models via simulation
Content-type: text/plain; charset=us-ascii
Simcha Pollack <spollack@WINTHROP.ORG> wrote:
> We often need to do a power analysis for a proposed clinical
> experiment that will generate longitudinal data. The available power
> formulas and programs do not allow us to include realistic models, or
> they require specification of parameters that are difficult to
> estimate.
>
> Therefore, we are trying to write a simulation that will be able to
> generate longitudinal data which corresponds to an arbitrary model.
>
> For example, one active population is observed for 5 periods of
> follow-up where the correlation between time points is .3 and the mean
> increases on the average by 2% from one observation period to the
> next. The population on placebo is similar except that the mean
> increases on the average by 1% from observation to the next.
>
> After creating this data, say, 5000 times, we plan to run each
> instance through Proc Mixed, look how often certain parameters are
> significant and thus obtain a simulation-based estimate of power.
>
> Are you aware of anyone who has done this? Are there books that help
> with these power simulations?
I haven't seen anyone who has done this in a general fashion suitable
for distribution. There are just too many possibilities to cram into
one little (probably freeware) program. So I suggest you consider
writing (or having someone else write) a macro to address your problem.
I recommend that you consider using the concepts of my
randomization-test macros for your framework.
( http://www.wuss.org/conference/papers/DA07.pdf )
The basic idea would be as follows:
[1] allow enough macro parameters to control all your varying aspects:
    sample sizes, number of periods, temporal correlation, average
    increases for the differing subpopulations, other error
    characteristics, etc.
[2] use the above macro variables to fill in a data step which would
    generate all N replicates, by replicate (so you won't even have to
    sort afterward)
[3] feed this data set into PROC MIXED, allowing more macro variables to
    change the form of the model and the tests, doing the MIXED
    processing BY REPLICATE
[4] use ODS to snag the relevant output data set(s), and a WHERE clause
    to trim the data set down to the appropriate test statistic(s)
[5] feed the resultant data set(s) into a null data step which does
    nothing but compute the proportion of significant results, giving
    you the simulation estimate of power
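For readers who want to see the shape of the five steps before writing
the macro, here is a minimal sketch in Python (standard library only),
not SAS. Several things in it are assumptions made purely for
illustration and do not come from the post: the function and parameter
names, the baseline mean of 100, the standard deviation of 10, the 50
subjects per arm, and an exchangeable (compound-symmetry) structure for
the .3 correlation. PROC MIXED is not available here, so step [3] is
replaced by a much cruder stand-in: a least-squares slope per subject,
followed by a two-sample z-test on the slopes. Treat the result as a
rough illustration of the workflow rather than the power PROC MIXED
would report.

```python
import math
import random
import statistics

def simulate_subject(rng, n_periods, baseline, growth, rho, sd):
    """One subject's series; a shared effect gives exchangeable correlation rho."""
    shared = rng.gauss(0.0, 1.0)  # subject-level effect common to all periods
    series = []
    for t in range(n_periods):
        mean_t = baseline * (1.0 + growth) ** t  # mean grows by `growth` per period
        noise = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        series.append(mean_t + sd * noise)
    return series

def ols_slope(series):
    """Least-squares slope of the series against period index 0..T-1."""
    n = len(series)
    t_bar = (n - 1) / 2.0
    y_bar = sum(series) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in enumerate(series))
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    return sxy / sxx

def one_replicate_p_value(rng, n_per_arm, n_periods, baseline,
                          growth_active, growth_placebo, rho, sd):
    """Steps [2]-[4]: generate one replicate and return a two-sided p-value.

    Stand-in test (an assumption, not PROC MIXED): z-test comparing the
    mean per-subject slope between the two arms.
    """
    slopes = {}
    for arm, growth in (("active", growth_active), ("placebo", growth_placebo)):
        slopes[arm] = [
            ols_slope(simulate_subject(rng, n_periods, baseline, growth, rho, sd))
            for _ in range(n_per_arm)
        ]
    diff = statistics.mean(slopes["active"]) - statistics.mean(slopes["placebo"])
    se = math.sqrt(statistics.variance(slopes["active"]) / n_per_arm
                   + statistics.variance(slopes["placebo"]) / n_per_arm)
    z = diff / se
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(z)))

def estimate_power(n_reps=1000, alpha=0.05, seed=20030311, **design):
    """Step [5]: proportion of replicates whose test is significant at alpha."""
    rng = random.Random(seed)
    hits = sum(one_replicate_p_value(rng, **design) < alpha for _ in range(n_reps))
    return hits / n_reps

# Step [1]: the design parameters (hypothetical values, except the 5 periods,
# the .3 correlation, and the 2% / 1% increases taken from the question).
design = dict(n_per_arm=50, n_periods=5, baseline=100.0,
              growth_active=0.02, growth_placebo=0.01, rho=0.3, sd=10.0)

print("estimated power:", estimate_power(n_reps=1000, **design))
```

Setting growth_active equal to growth_placebo turns the same run into a
check of the type I error rate, which is worth doing before trusting
any power figure from a simulation like this.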
HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician