Date: Thu, 8 Oct 2009 16:07:20 -0400
Reply-To: Susan Durham <sdurham@BIOLOGY.USU.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Susan Durham <sdurham@BIOLOGY.USU.EDU>
Subject: Re: contingency table test question
Content-Type: text/plain; charset=ISO-8859-1
TESTP is the perfect option for goodness-of-fit tests.
However, because Elodie said it is a 2x2 table and given the description of
the desired comparison, I think the homogeneity of proportions test is more
appropriate, as in
http://www.stattutorials.com/SAS/TUTORIAL-PROC-FREQ-2.htm
I could have misinterpreted the question, of course.
--Susan
On Thu, 8 Oct 2009 18:13:57 +0200, Daniel Fernández
<fdezdan@GMAIL.COM> wrote:
>hi,
>
>Susan, I used the 'testp' option for testing proportions.
>
>For more information, use the SAS help documentation for this option,
>and take a look at:
>
>http://www.stattutorials.com/SAS/TUTORIAL-PROC-FREQ-1.htm
>
>Daniel Fernandez.
>Barcelona.
>
>2009/10/7 Susan Durham <sdurham@biology.usu.edu>:
>> Assuming that each row represents a sample, you want a chi-square test of
>> homogeneity of proportions. For a two-way table (but notably, *not* for
>> tables with more dimensions), the chi-square test of homogeneity of
>> proportions is equivalent to a chi-square test of independence. So you can use
>>
>> proc freq data=your_dataset;
>> table row_variable * col_variable / nopercent nocol chisq;
>> run;
>>
>> The "nopercent nocol" options turn off default statistics: the observed
>> frequency divided by the total frequency, and the observed frequency divided
>> by the column marginal totals, respectively. This leaves only the "row
>> percent"--the observed frequency divided by the row marginal total, which is
>> what you want to compare--in the output table.
>>
>> Although there is only 1 df for this test, it does not collapse to a single
>> dimension. The two rows in the table represent two samples, so you are
>> doing a two-sample test to compare two proportions. You could, under
>> certain assumptions like sufficiently large sample sizes, accomplish this
>> comparison with a two-sample t-test. But for small samples, you are better
>> off with a chi-square test. Plus if your samples are too small for the
>> asymptotic chi-square test, you can use the EXACT statement in the FREQ
>> procedure to obtain an exact test. "Too small" is determined by the
>> expected frequencies (not the observed frequencies); FREQ will give you a
>> warning if more than 20% of the cells have expected counts less than five.
>> This criterion is typically a bit conservative; see the texts on categorical
>> data analysis by Alan Agresti for details about how small is "too small."
>> But now we have exact tests easily available to us, and there's no reason
>> not to use them.
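
A minimal sketch of that exact-test approach, assuming a hypothetical dataset "trial" with made-up cell counts and hypothetical variable names (grp, outcome, count); none of these come from Elodie's data:

```sas
/* Hypothetical 2x2 counts: one record per cell of the table */
data trial;
   input grp $ outcome $ count;
   datalines;
A yes 7
A no 3
B yes 2
B no 8
;
run;

proc freq data=trial;
   weight count;                                  /* each record is a cell count, not one subject */
   tables grp * outcome / nopercent nocol chisq;  /* row percents plus asymptotic chi-square tests */
   exact chisq;                                   /* adds exact p-values for the chi-square tests */
run;
```

For a 2x2 table the CHISQ option also prints Fisher's exact test, so the EXACT statement matters most for larger tables or when you want exact p-values for the chi-square statistics themselves.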
>>
>> I believe that Daniel is describing a goodness-of-fit test, where you
>> compare observed proportions to "expected" proportions. This approach is
>> analogous to a one-sample test. The expected proportions represent a null
>> hypothesis and are determined by external considerations--like a 9:3:3:1
>> ratio for genetic crosses, or that habitat use is in proportion to known
>> habitat availability.
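
As a rough illustration of the goodness-of-fit case, assuming a hypothetical dataset "cross" with made-up phenotype counts (the level names RY, RG, WY, WG are invented) tested against the 9:3:3:1 ratio:

```sas
/* Hypothetical counts for the four phenotype classes of a dihybrid cross */
data cross;
   input phenotype $ count;
   datalines;
RY 90
RG 28
WY 32
WG 10
;
run;

/* ORDER=DATA keeps the levels in input order so they line up with TESTP */
proc freq data=cross order=data;
   weight count;                                             /* each record is a cell count */
   tables phenotype / chisq testp=(56.25 18.75 18.75 6.25);  /* 9:3:3:1 as percentages summing to 100 */
run;
```

The TESTP values are matched to the levels in the order they appear in the table, so it is worth checking the output to confirm each percentage lands on the intended level.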
>>
>> HTH,
>> Susan
>>
>> ---
>> Susan Durham
>> Ecology Center
>> Utah State University
>>
>> On Wed, 7 Oct 2009 10:47:53 +0200, Daniel Fernández
>> <fdezdan@GMAIL.COM> wrote:
>>
>>>hi Elodie,
>>>
>>>I am not so 'frequencied' with chi-square tests, in spite of being a
>>>statistician.
>>>I do more technical SAS business analytics than statistics.
>>>Anyway, let's give it a try:
>>>
>>>Your test is still on a 2x2 table (or n x m), where you want to test for
>>>the equality of proportions between population labels; that is, you want
>>>to test whether the proportions in your first column are equal to those
>>>in the second column, where the columns are not a variable but rather a
>>>population label or population condition.
>>>
>>>Then you can run the test:
>>>
>>>proc freq data= yourdataset;
>>> tables population_label /chisq testp=( X , Y) ;
>>> run;
>>>
>>>where the 'testp' percentages must sum to 100 (X for the first row, Y
>>>for the second row, and so on for a variable with more levels).
>>>
>>>So if you know the marginal proportion for cell 1, for example 60%, then
>>>you must code:
>>>proc freq data= yourdataset;
>>> tables population_label /chisq testp=( 60 , 40) ;
>>> run;
>>>
>>>Remember all cells must have 5 or more frequency counts.
>>>
>>>
>>>Daniel Fernández.
>>>Barcelona.
>>>
>>>2009/10/6 elodie <elodie.gillain@gmail.com>:
>>>> On Oct 6, 2:36 pm, elodie <elodie.gill...@gmail.com> wrote:
>>>>> Hi everyone,
>>>>>
>>>>> I have a 2*2 contingency table.
>>>>>
>>>>> I would like to test whether proportion_in_row1_col1
>>>>> =proportion_in_row2_col1.
>>>>>
>>>>> How do I go about doing that? I have read the manual of proc freq and
>>>>> I am not finding much that I think can be relevant.
>>>>>
>>>>> Thanks in advance for the help.
>>>>
>>>> I am guessing that I am in a situation where I restrict the analysis
>>>> to only one column, so it is not a two-way table anymore, but rather
>>>> a one-way table.
>>>>
>>>> Still, I am not sure I can specify the test of equality of proportion
>>>> in proc freq.
>>>>
>>