**Date:** Tue, 4 Oct 2005 13:50:00 -0700
**Reply-To:** David L Cassell <davidlcassell@MSN.COM>
**Sender:** "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
**From:** David L Cassell <davidlcassell@MSN.COM>
**Subject:** Re: Inquiry about appropriate statistics methodology
**In-Reply-To:** <200510041537.j94F8whB028639@malibu.cc.uga.edu>
**Content-Type:** text/plain; format=flowed
nancy_li66@YAHOO.COM wrote:
>I know there are many statistics experts in this forum. I would like to
>pick your brain for a few minutes. At present, I’m doing a project about
>the analysis of healthcare utilization over 16-quarter period starting from
>the last quarter of 2002 to the third quarter of 2006. The utilization
>measures will include ED visits, Hospital admissions, Outpatient visits,
>and total cost. The utilization measurement in each quarter will be the
>numerator, and the population identification in the corresponding quarter
>will be the denominator. For the final analysis there will be a time series
>of fourteen points (the first seven quarters are the base, the last seven
>quarters are the evaluation, and the middle two quarters will be skipped)
>for each utilization measure for each population. I would like to apply an
>appropriate statistical methodology to each of these series to determine
>whether there is a distinguishable break or change in the slope of the
>utilization trend line between the base periods and the evaluation periods.
> My question is what the statistical techniques are appropriate for this
>analysis. Could you please give me some suggestions? Any input would be
>highly appreciated.

I can make a few suggestions, but you don't have enough information yet to
assess the statistical methods.

[1] What does the literature say about the time series behavior of your
data, or similar data? You don't have enough data to assess a real time
series model. Sixteen quarters of data is just too short a stream to
evaluate properly any issues like autoregressive character or moving
average behaviors. Sixteen points is so short that a single outlier or
measurement error could mess up your time series estimates. So you've got
a problem. If it is reasonable to treat the data as AR(1), then you could
treat the problem as a repeated measures problem.
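
To make the outlier point concrete, here is a minimal sketch in Python (standard library only). The sixteen quarterly values are made up for illustration, and `lag1_autocorr` is a hypothetical helper, not a routine from any package. It computes the usual sample lag-1 autocorrelation for a smooth series and again after one quarter is replaced with a bad value.

```python
def lag1_autocorr(series):
    """Sample lag-1 autocorrelation: sum of lagged cross-products of the
    mean-centered values, divided by the sum of squared deviations."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return numer / denom

# Made-up quarterly rates: sixteen points on a smooth upward line.
rates = [10.0 + 0.5 * t for t in range(16)]
r_clean = lag1_autocorr(rates)

# The same series with one bad value (say, a data-entry error) in one quarter.
spiked = list(rates)
spiked[8] = 30.0
r_spiked = lag1_autocorr(spiked)

print(round(r_clean, 3), round(r_spiked, 3))
```

With only sixteen points, the one altered quarter changes the estimate substantially, which is the kind of fragility described above.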

[2] But whether this is reasonable is dependent on the data. If you are
looking at a fraction for your dependent variable, don't expect normality
for your errors. You may also need to transform the data in order to
control some of the potential time series structure, as we often see in
ARIMA modeling.
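
As one illustration of a transformation for a fraction, the sketch below applies a logit transform to quarterly rates. This is an assumption about what might suit the data, not a recommendation from the post: it only makes sense if each rate is a true proportion strictly between 0 and 1 (it would not apply to total cost, or to visits per person if that can exceed 1). The counts are invented.

```python
import math

def logit(p):
    """Map a proportion in (0, 1) onto the whole real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit needs a proportion strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Map a logit-scale value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-z))

# Made-up quarterly data: (utilization count, identified population).
quarters = [(120, 4000), (135, 4100), (128, 3950), (150, 4200)]
rates = [count / population for count, population in quarters]
transformed = [logit(r) for r in rates]
```

Any model would then be fit to `transformed`, with `inverse_logit` used to report results back on the original scale.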

[3] Don't drop those middle two quarters. If you do have any time series
structure, you'll lose important features of the data when you do this,
and you'll mess up the ability to estimate the time series characteristics
properly.

[4] There are a host of unanswered questions which could affect the way in
which you analyze your data:
How many data points are you dealing with?
Do you have panel data (a whole set of sites or areas, all measured at the
same times) to work with?
How many independent variables are you working with?
What sort of model are you hypothesizing?
What sort of change do you hypothesize would occur at your 'break'?
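
One way to pin down the last two questions is to write the hypothesized change as a segmented (piecewise linear) regression with a level shift and a slope change at the break. The sketch below is only an illustration under loud assumptions: the break position `break_at=8`, the function names, and the noise-free data are all hypothetical, all sixteen quarters are kept, and plain ordinary least squares ignores exactly the serial correlation discussed in [1], so its standard errors would not be trustworthy for real data.

```python
def design_row(t, break_at):
    """Regressors for quarter t: intercept, time, level shift, slope change."""
    after = 1.0 if t >= break_at else 0.0
    return [1.0, float(t), after, after * (t - break_at)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_segmented(y, break_at):
    """Ordinary least squares via the normal equations X'X beta = X'y."""
    rows = [design_row(t, break_at) for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up, noise-free series: slope 0.5 up to the break, then a level drop
# of 1.0 and a slope change of -0.3 from quarter 8 onward.
y = [10.0 + 0.5 * t + (-1.0 - 0.3 * (t - 8) if t >= 8 else 0.0) for t in range(16)]
intercept, slope, level_shift, slope_change = fit_segmented(y, break_at=8)
```

Here `slope_change` is the quantity the original question asks about. Whether a level shift, a slope change, or both belong in the model is one of the choices the questions above are meant to settle.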

I would suggest that you sit down with a statistician and work through a
number of these issues. He or she may see even more problems/features in
the data. I don't think you can get a good answer for data this
complicated just by tossing out a paragraph of info and seeing what SAS-L
throws back.

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
