Date: Tue, 15 Mar 2011 10:30:27 -0700
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: OT, help to define S/N ratio (this is for statistician)
You might use robust regression to define the noise. It is not
clear to me whether you are using overlapping moving windows;
I assume that you are. But counts from overlapping windows are not
independent, which is probably a significant violation of the assumptions
of a robust regression model. For purposes of defining noise,
you may want to use data from nonoverlapping windows when fitting
the robust regression.
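
A minimal sketch of counting events in nonoverlapping windows, written in Python rather than SAS purely for illustration. The function name `nonoverlapping_counts`, the event times, and the 30-day window width are all assumptions made up for the example, not anything taken from the original data.

```python
def nonoverlapping_counts(event_times, start, end, width):
    """Count events in consecutive windows [start + i*width, start + (i+1)*width).

    Each event falls into exactly one window, so the counts do not share
    observations the way overlapping moving-window counts do.
    """
    n_windows = int((end - start) // width)
    counts = [0] * n_windows
    for t in event_times:
        i = int((t - start) // width)
        if 0 <= i < n_windows:  # ignore events outside the study period
            counts[i] += 1
    return counts


# Hypothetical event times in days, 30-day windows over a 120-day period.
events = [1, 2, 3, 35, 61, 62, 63, 64, 65, 100]
counts = nonoverlapping_counts(events, start=0, end=120, width=30)
```

The resulting list of counts, one per window, is the kind of series one would pass to the robust fit.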

I can't tell from here whether the robust regression model would
consist of just an intercept term or of intercept and time effects.
Things to consider when deciding whether to include time are
1) whether the background noise level is linearly related to time, and
2) whether the variance of the background noise process is constant over
time despite an increasing/decreasing noise level.

If the background noise process does not change over time, then
you would include just an intercept in the robust regression model.
Noise would then be any quantity less than b0 + k*SD, where b0 is
the intercept estimate, SD is the robust regression residual SD
estimate, and k is a multiplier of your choosing.
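
A rough sketch of the intercept-only case, in Python for illustration. It does not run a full robust regression: as a simple stand-in, it assumes the median as the robust intercept estimate and the scaled median absolute deviation (1.4826 * MAD) as the robust SD estimate. The window counts and the choice k = 3 are hypothetical.

```python
import statistics


def robust_intercept_sd(values):
    """Median as the intercept estimate, 1.4826 * MAD as the SD estimate.

    This is a stand-in for an intercept-only robust regression, not the
    same estimator a robust regression procedure would use.
    """
    b0 = statistics.median(values)
    mad = statistics.median([abs(v - b0) for v in values])
    return b0, 1.4826 * mad


def flag_signal(values, k=3.0):
    """Return the noise threshold b0 + k*SD and a flag per value above it."""
    b0, sd = robust_intercept_sd(values)
    threshold = b0 + k * sd
    return threshold, [v > threshold for v in values]


# Hypothetical window counts with one burst.
counts = [3, 4, 2, 3, 5, 4, 3, 20, 4, 3]
threshold, flags = flag_signal(counts, k=3.0)
```

With these made-up counts, only the window with 20 events exceeds the threshold; everything at or below it would be treated as noise.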

If the background noise is increasing/decreasing over time but the
variance of the background noise process is constant, then you
would want to include time in your robust regression model. In
that case, noise would be any quantity less than b0 + b1*time + k*SD,
where b1 is the estimated slope on time.
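
A sketch of the time-trend case, again in Python and under stated assumptions. In place of a robust regression procedure it uses the Theil-Sen estimator (median of pairwise slopes) for b0 and b1, and 1.4826 * MAD of the residuals for SD. The data, the helper names, and k = 3 are hypothetical.

```python
import statistics
from itertools import combinations


def theil_sen(times, values):
    """Theil-Sen line fit: the slope is the median of all pairwise slopes."""
    points = list(zip(times, values))
    slopes = [(y2 - y1) / (t2 - t1)
              for (t1, y1), (t2, y2) in combinations(points, 2)
              if t2 != t1]
    b1 = statistics.median(slopes)
    b0 = statistics.median([y - b1 * t for t, y in points])
    return b0, b1


def robust_sd(residuals):
    """Scaled median absolute deviation of the residuals."""
    center = statistics.median(residuals)
    return 1.4826 * statistics.median([abs(r - center) for r in residuals])


# Hypothetical series: background level rising by about 1 per window,
# with a burst in window 6.
times = list(range(10))
values = [2.1, 2.9, 4.0, 5.1, 5.9, 7.0, 30.0, 9.1, 9.9, 11.0]
k = 3.0

b0, b1 = theil_sen(times, values)
sd = robust_sd([y - (b0 + b1 * t) for t, y in zip(times, values)])

# A value is flagged as signal when it exceeds b0 + b1*time + k*SD.
flags = [y > b0 + b1 * t + k * sd for t, y in zip(times, values)]
```

Because the fit is based on medians, the single burst has little influence on the estimated trend, so it stands well above the time-dependent threshold.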

If the background noise process is increasing/decreasing over time
and the variance increases/decreases with the absolute noise level,
then you might first estimate the median value over time and divide
each observed value by that median. Hopefully, this yields noise
that has constant variance and is independent of time. If so,
then you can use robust regression with just an intercept term
as described above.
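
One possible reading of "estimate the median value over time" is a running median over neighboring windows. The Python sketch below assumes that reading; the half-width of 2 windows and the example series are hypothetical choices, not part of the original suggestion.

```python
import statistics


def running_median(values, half_width):
    """Median of the values within half_width positions of each index."""
    n = len(values)
    return [statistics.median(values[max(0, i - half_width):
                                     min(n, i + half_width + 1)])
            for i in range(n)]


def normalize_by_median(values, half_width=2):
    """Divide each observed value by its local running median."""
    medians = running_median(values, half_width)
    return [v / m if m > 0 else float("nan")
            for v, m in zip(values, medians)]


# Hypothetical series whose level and spread both grow over time,
# with a burst at index 9.
values = [10, 11, 9, 20, 22, 18, 40, 44, 36, 200, 80, 88, 72]
ratios = normalize_by_median(values, half_width=2)
```

If the ratios look roughly constant in level and spread, they can be fed to an intercept-only robust fit in place of the raw counts.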
Fred Hutchinson Cancer Research Center
Ph: (206) 667-2926
Fax: (206) 667-5977
--- On Tue, 3/15/11, Ya Huang <ya.huang@AMYLIN.COM> wrote:
> From: Ya Huang <ya.huang@AMYLIN.COM>
> Subject: OT, help to define S/N ratio (this is for statistician)
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Tuesday, March 15, 2011, 9:04 AM
> Hi there,
> Suppose I have an event that could happen at any time during a fixed
> period, say 10 years. Sometimes it happens very frequently during
> a short period; sometimes it may not happen for a long period. If I use a
> moving window (a window of one month, for example) and count how many
> events happen in that window, I will get a smoothed distribution curve
> over time. What I'm interested in is defining an S/N ratio, so that I can
> find whether this event shows any abnormality, or a spike/burst that
> should be of concern. I can easily draw the curve via proc gplot. The
> problem is that I have many events of this kind, and when they are all
> drawn on the plot, it is too busy to see the 'Signal'. If I can define
> an S/N and filter out the low-S/N events, I will be able to focus on
> those events with potential 'signal'.
> I'm thinking the max of the curve can be the 'S', what would be the 'N'?
> Should I use the total number of events for the 10 years? Or something