Date: Mon, 26 Oct 1998 09:50:19 +0100
Reply-To: Rolf Kjoeller <rolf.kjoeller@GET2NET.DK>
Sender: "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From: Rolf Kjoeller <rolf.kjoeller@GET2NET.DK>
Subject: Re: Is my selection procedure random ?
Content-Type: text/plain; charset="iso-8859-1"
Hi SPSS-X
- Paul Newton and I have been corresponding a bit off-list about his sampling question. I think what we have come up with is in line with the suggestions given by David F. Greenberg and Dale Pietrzak:
- Since the sampling is done with replacement, you can treat the selection of one particular word as a binomial experiment with n=2468 and p=1/4724. The following are the probabilities of selecting one particular word 0 to 10 times in such an experiment. cpx is the cumulative probability, P(X<=x), while px is P(X=x). Judging from this, selecting one particular word 10 times is very unlikely:
 x     cpx      px
 0   ,5930   ,5930
 1   ,9029   ,3099
 2   ,9839   ,0809
 3   ,9980   ,0141
 4   ,9998   ,0018
 5  1,0000   ,0002
 6  1,0000   ,0000
 7  1,0000   ,0000
 8  1,0000   ,0000
 9  1,0000   ,0000
10  1,0000   ,0000
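For anyone who wants to check these figures outside SPSS, the table can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library (Python 3.8 or later for math.comb); the names binomial_table, N_DRAWS and N_WORDS are made up for illustration, and the output uses decimal points rather than decimal commas.

```python
from math import comb

N_DRAWS = 2468   # number of draws with replacement (n in the text)
N_WORDS = 4724   # size of the word list, so p = 1/4724

def binomial_table(n, p, max_x=10):
    """Return rows (x, P(X<=x), P(X=x)) for x = 0..max_x."""
    rows = []
    cumulative = 0.0
    for x in range(max_x + 1):
        # Binomial probability of drawing one particular word exactly x times
        px = comb(n, x) * p**x * (1 - p)**(n - x)
        cumulative += px
        rows.append((x, cumulative, px))
    return rows

table = binomial_table(N_DRAWS, 1 / N_WORDS)
for x, cpx, px in table:
    print(f"{x:2d}  {cpx:.4f}  {px:.4f}")
```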
Paul has suggested, and I tend to agree, that you can use these probabilities to build a table with the expected number of words drawn x times (4724 times px) and the actual number of words drawn x times, and then do a chi-square test:
x     sample   expected
0     n11      n12
1     n21      n22
2     n31      n32
3     n41      n42
...   ...      ...
10+   n101     n102
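A rough Python sketch of that comparison follows. Several things in it are assumptions and not part of Paul's setup: the draws are simulated with a fixed seed as a stand-in for his actual sample (which is not reproduced here), the function names are hypothetical, and the cells for 4 or more draws are pooled into a single "4+" cell, because the expected counts for 5 and above fall below 1 and the chi-square approximation is usually considered unreliable for expected counts under about 5.

```python
import random
from collections import Counter
from math import comb

N_DRAWS, N_WORDS = 2468, 4724
MAX_X = 4   # last cell is "4 or more" (pooled, an assumption of this sketch)

def expected_counts(n, n_words, max_x):
    """Expected number of words drawn 0, 1, ..., max_x-or-more times."""
    p = 1 / n_words
    probs = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(max_x)]
    probs.append(1 - sum(probs))            # tail cell: P(X >= max_x)
    return [n_words * q for q in probs]

def observed_counts(draws, n_words, max_x):
    """Observed number of words drawn 0, 1, ..., max_x-or-more times."""
    times_drawn = Counter(draws)            # word index -> times drawn
    cells = Counter(min(t, max_x) for t in times_drawn.values())
    cells[0] = n_words - len(times_drawn)   # words that were never drawn
    return [cells[x] for x in range(max_x + 1)]

def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

random.seed(1998)
# Simulated stand-in for the real sample: 2468 draws with replacement
draws = [random.randrange(N_WORDS) for _ in range(N_DRAWS)]

obs = observed_counts(draws, N_WORDS, MAX_X)
exp = expected_counts(N_DRAWS, N_WORDS, MAX_X)
stat = chi_square(obs, exp)
df = len(obs) - 1   # no parameters are estimated from the data

print("observed:", obs)
print("expected:", [round(e, 1) for e in exp])
print("chi-square:", round(stat, 2), "df:", df)
```

With five cells and no estimated parameters there are 4 degrees of freedom, so the statistic would be compared with the 5% critical value of 9.488. To test a real sample, replace the simulated draws with the list of selected word indices.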
Does this make sense to you?
Rolf Kjoeller
--
e-mail: rolf.kjoeller@get2net.dk
webpage: http://hjem.get2net.dk/rolf.kjoeller