Date: Wed, 7 Oct 2009 14:09:45 -0400
Reply-To: Susan Durham <sdurham@BIOLOGY.USU.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Susan Durham <sdurham@BIOLOGY.USU.EDU>
Subject: Re: contingency table test question
Assuming that each row represents a sample, you want a chi-square test of
homogeneity of proportions. For a two-way table (but notably, *not* for
tables with more dimensions), the chi-square test of homogeneity of
proportions is equivalent to a chi-square test of independence. So you can use:
proc freq data=your_dataset;
   tables row_variable * col_variable / nopercent nocol chisq;
run;
The "nopercent nocol" options turn off default statistics: the observed
frequency divided by the total frequency, and the observed frequency divided
by the column marginal totals, respectively. This leaves only the "row
percent"--the observed frequency divided by the row marginal total, which is
what you want to compare--in the output table.
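If you have only the four cell counts rather than one record per observation, a minimal sketch along these lines should work. The data set name, variable names, and counts are made up for illustration; the WEIGHT statement tells FREQ that each record stands for that many observations.

```sas
/* Hypothetical 2x2 table entered as cell counts (values are made up) */
data your_dataset;
   input row_variable $ col_variable $ count;
   datalines;
sample1 yes 18
sample1 no  32
sample2 yes 30
sample2 no  20
;
run;

proc freq data=your_dataset;
   weight count;   /* each record represents 'count' observations */
   tables row_variable * col_variable / nopercent nocol chisq;
run;
```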
Although there is only 1 df for this test, it does not collapse to a single
dimension. The two rows in the table represent two samples, so you are
doing a two-sample test to compare two proportions. You could, under
certain assumptions like sufficiently large sample sizes, accomplish this
comparison with a two-sample t-test. But for small samples, you are better
off with a chi-square test. Plus if your samples are too small for the
asymptotic chi-square test, you can use the EXACT statement in the FREQ
procedure to obtain an exact test. "Too small" is determined by the
expected frequencies (not the observed frequencies); FREQ will give you a
warning if more than 20% of the cells have expected frequencies less than five.
This criterion is typically a bit conservative; see the texts on categorical
data analysis by Alan Agresti for details about how small is "too small."
But now we have exact tests easily available to us, and there's no reason
not to use them.
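As a sketch of that last point, the EXACT statement can be added to the same step. The data set and variable names are placeholders; EXPECTED is included only so that the expected frequencies are printed in the table and you can judge "too small" for yourself.

```sas
proc freq data=your_dataset;
   tables row_variable * col_variable / nopercent nocol expected chisq;
   exact chisq;   /* exact p-values for the chi-square statistics */
run;
```

For a 2x2 table, the CHISQ option already prints Fisher's exact test; for larger tables, EXACT FISHER requests it.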
I believe that Daniel is describing a goodness-of-fit test, where you
compare observed proportions to "expected" proportions. This approach is
analogous to a one-sample test. The expected proportions represent a null
hypothesis and are determined by external considerations--like a 9:3:3:1
ratio for genetic crosses, or that habitat use is in proportion to known
Utah State University
On Wed, 7 Oct 2009 10:47:53 +0200, Daniel Fernández wrote:
>I am not so 'frequencied' with chi-square tests in spite of being a statistician.
>I do more technical SAS business analytics than statistics.
>Anyway, let's give it a try!
>Your test is still on a 2x2 table (or n x m), where you want to test for
>equality of proportions between population labels; that is, you want to test
>whether the proportions in your first column are equal to those in the
>second column, the columns being not a variable but rather a population
>label or population condition.
>Then you can do the test:
>proc freq data=yourdataset;
> tables population_label / chisq testp=(X, Y);
>run;
>where the 'testp' test proportions must sum to 100 as percentages (X for
>the first row, Y for the second row, and so on for a variable with more
>levels).
>So if you know the marginal proportion for cell 1, for example 60%, then
>you must code:
>proc freq data=yourdataset;
> tables population_label / chisq testp=(60, 40);
>run;
>Remember all cells must have 5 or more frequency counts.
>2009/10/6 elodie <email@example.com>:
>> On Oct 6, 2:36 pm, elodie <elodie.gill...@gmail.com> wrote:
>>> Hi everyone,
>>> I have a 2*2 contingency table.
>>> I would like to test whether proportion_in_row1_col1
>>> How do I go about doing that? I have read the manual of proc freq and
>>> I am not finding much that I think can be relevant.
>>> Thanks in advance for the help.
>> I am guessing that I am in a situation where I restrict the analysis
>> to only one column, so it is not a two-way table anymore, but rather
>> a one-way table.
>> Still, I am not sure I can specify the test of equality of proportion
>> in proc freq.