Date: Wed, 25 Mar 2009 11:45:24 -0400
Reply-To: Kevin Viel <citam.sasl@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Kevin Viel <citam.sasl@GMAIL.COM>
Subject: Re: How can I detect any real deviation from a uniform monthly
distribution?
On Tue, 24 Mar 2009 08:04:12 -0700, Irin later <irinfigvam@YAHOO.COM>
wrote:
>I have a file of unique patients who had diagnosis "Depression" during
>the calendar year.
>For each of the patients I have Month of Birth value (1-12).
>
>I expect seasonal variations in the diagnosis of depression (depending on
>what was the month of the birth value).
>How can I validate or disprove this hypothesis? How can I detect any real
>deviation from a uniform monthly distribution?
>How to implement it in SAS code?
>
>Could you, please, give me a hand?

Both Mary and Peter have suggested that you might need controls, as a way
to estimate the number of births in a given month among your *study*
population. This is the minimum, as numerous and important confounders
likely exist. However, if you are willing to (tenuously) assume that
births are constant across months, an assumption that is perhaps more
defensible in a large population, then you would expect the cases to be
spread evenly across the 12 months of birth. I have simulated this below
and show one way to test it.

Note that the seasons below are very artificial.

/* Simulate 144,000 patients (P), each assigned one birth month (M)
   drawn at random from 1-12. */
proc plan seed = 1 ;
factors P = 144000 ordered M = 1 of 12 / NoPrint ;
output out = depmon ;
run ;
/*
proc freq data = depmon ;
tables M ;
run ;
*/
data depmonseason ;
set depmon ;
select ( M ) ;
when ( 1 , 2 , 3 ) Season = "Winter" ;
when ( 4 , 5 , 6 ) Season = "Spring" ;
when ( 7 , 8 , 9 ) Season = "Summer" ;
when ( 10 , 11 , 12 ) Season = "Fall" ;
otherwise put M= ;
end ;
run ;
proc freq data = depmonseason ;
tables M Season / chisq ;
run ;
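
One way to relax the equal-births assumption slightly, sketched here and
not part of the analysis above, is to let the expected counts follow the
number of days in each month rather than N / 12. The data set names obsmon
and gof and the day weights (February taken as 28.25) are assumptions made
for illustration; the statistic is the usual Pearson chi-square, computed
by hand from the PROC FREQ counts.

```sas
/* Observed counts per birth month from the simulated data. */
proc freq data = depmonseason noprint ;
  tables M / out = obsmon ( keep = M Count ) ;
run ;

data gof ;
  /* Assumed weights: days per month, February averaged over leap years. */
  array days [ 12 ] _temporary_
    ( 31 28.25 31 30 31 30 31 31 30 31 30 31 ) ;
  /* First pass: total sample size N. */
  do until ( last1 ) ;
    set obsmon end = last1 ;
    N + Count ;
  end ;
  /* Second pass: accumulate the Pearson chi-square statistic. */
  do until ( last2 ) ;
    set obsmon end = last2 ;
    Expected = N * days [ M ] / 365.25 ;
    ChiSq + ( Count - Expected ) ** 2 / Expected ;
  end ;
  /* DF = 11 assumes that all 12 months appear in the data. */
  DF = 11 ;
  P = 1 - probchi ( ChiSq , DF ) ;
  keep N ChiSq DF P ;
  output ;
  stop ;
run ;

proc print data = gof ;
run ;
```

Because PROC PLAN draws the months uniformly, this test and the equal-
proportions test above are answering slightly different questions; with
real data you would pick the null that matches what you believe about
births in your source population.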
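
Importantly, the following shows that the proportions are compared to each
other per se, and not to the count expected from N / 12, where N = the
sample size: when a month has no observations, PROC FREQ drops that level,
so the expected count becomes N / 10 here. An empty month is unlikely to
occur in practice, as you would expect at least one birth in each month.
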
/* Drop months 1 and 2 to mimic a sample in which two months are empty. */
proc freq data = depmonseason ( where = ( M not in ( 1 , 2 ))) ;
tables M / chisq ;
run ;
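
If you do want the comparison against N / 12 even when some months are
empty, one possible sketch is to hand PROC FREQ a table of counts that
contains all 12 months, with zero counts where needed, and to use the ZEROS
option of the WEIGHT statement so that zero-count levels stay in the
goodness-of-fit test. The data set names obs, allmonths, and obs12 are
hypothetical.

```sas
/* Count the observed months (months 1 and 2 excluded, as above). */
proc freq data = depmonseason ( where = ( M not in ( 1 , 2 ))) noprint ;
  tables M / out = obs ( keep = M Count ) ;
run ;

/* Skeleton with all 12 months so that empty months still get a row. */
data allmonths ;
  do M = 1 to 12 ;
    output ;
  end ;
run ;

/* Months missing from the counts receive a count of zero. */
data obs12 ;
  merge allmonths obs ;
  by M ;
  if missing ( Count ) then Count = 0 ;
run ;

/* ZEROS keeps the zero-count months as levels in the chi-square test. */
proc freq data = obs12 ;
  weight Count / zeros ;
  tables M / chisq ;
run ;
```

With the simulated data this should test 12 levels (11 degrees of freedom)
instead of 10, so the two empty months now count against the uniform
hypothesis.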

The next step would be a multiple regression, the type of which would
depend on your hypothesis (for example, more births in the winter versus
other months, which could be logistic). You could then control for some
covariates of interest. Again, you might be assuming that births occur
equally in each month.

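
As a rough illustration of the regression step, here is a minimal sketch of
a logistic model of diagnosis on season of birth with two covariates. It
assumes a data set that also contains controls, which the original file
does not have; the data set cases_controls, the variables Depressed, Age,
and Sex, and the simulated values are all hypothetical and serve only to
make the code runnable.

```sas
/* Hypothetical simulated cases and controls; no true season effect. */
data cases_controls ;
  call streaminit ( 1 ) ;
  length Season $ 6 Sex $ 1 ;
  do P = 1 to 5000 ;
    M = ceil ( 12 * rand ( "uniform" ) ) ;
    select ( M ) ;
      when ( 1 , 2 , 3 ) Season = "Winter" ;
      when ( 4 , 5 , 6 ) Season = "Spring" ;
      when ( 7 , 8 , 9 ) Season = "Summer" ;
      otherwise Season = "Fall" ;
    end ;
    Age = 18 + floor ( 60 * rand ( "uniform" ) ) ;
    Sex = ifc ( rand ( "bernoulli" , 0.5 ) , "F" , "M" ) ;
    Depressed = rand ( "bernoulli" , 0.1 ) ;
    output ;
  end ;
run ;

/* Odds of diagnosis by season of birth, adjusted for Age and Sex. */
proc logistic data = cases_controls ;
  class Season ( ref = "Summer" ) Sex / param = ref ;
  model Depressed ( event = "1" ) = Season Age Sex ;
run ;
```

The reference season and the covariates are arbitrary choices here; in a
real analysis they would follow from the hypothesis and from the
confounders you are able to measure.
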
HTH,
Kevin