```
Date:         Mon, 26 Sep 2005 21:46:16 -0300
Reply-To:     Hector Maletta
Sender:       "SPSSX(r) Discussion"
From:         Hector Maletta
Subject:      Re: How to compare survey data to census
Comments:     To: "Seumas P. Rogan"
In-Reply-To:  <20050927000215.D3921B422D8@smtpgate.email.arizona.edu>
Content-Type: text/plain; charset="US-ASCII"

I think the number of cases in the census should not enter your analysis.
You should work only with the survey sample size. The census percent
distribution of households by household size, applied to your survey
sample, should be regarded as your "expected frequency", to contrast with
the observed survey frequencies.

In your example, the four census categories make a total of 185,108 cases,
and the survey sample total is 1524. The first household size category has
a relative frequency of 21.0% in the census, so it should be expected that
21.0% of the survey sample, or 320 cases, belong to the first size
category. In practice, you got 328 cases in that category. The difference
(328-320) goes into the calculation of chi square.

Hector

> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> On Behalf Of Seumas P. Rogan
> Sent: Monday, September 26, 2005 9:02 PM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: How to compare survey data to census
>
> Hi all,
>
> I want to compare the distribution of certain variables from
> my survey sample with census data for the same population,
> but I'm not sure which test to use.
> I have, for example, a 4-row by 2-column table where one
> column represents counts from the census, the second column
> represents counts from my sample.
> The four rows represent counts in 4 categories of household
> size. In the following example, I get a Pearson Chi^2 of
> 8.013, df=3, p=0.046, though the maximum difference in any
> cell between the census and my survey for each size category is 1.1%.
>
> DATA LIST /SOURCE 1 SIZECAT 3 COUNT 5-10.
> BEGIN DATA
> 1 1 38857
> 1 2 64551
> 1 3 76809
> 1 4 4891
> 2 1 328
> 2 2 546
> 2 3 627
> 2 4 23
> END DATA.
> FORMATS SOURCE SIZECAT COUNT (F6).
> VALUE LABELS SOURCE 1 "CENSUS" 2 "SURVEY".
> WEIGHT BY COUNT.
> CROSSTABS
>   /TABLES=SIZECAT BY SOURCE
>   /FORMAT= AVALUE TABLES
>   /STATISTIC=CHISQ
>   /CELLS= COUNT COLUMN
>   /COUNT ROUND CELL .
>
> Does anyone have any suggestions or advice here? Are there
> any heuristics or guidance regarding how to compare surveys
> with 'truth' and how different distributions must be to be
> 'different'?
>
> Thanks for any help!
>
>
> SR
>
> sprogan@email.arizona.edu
```
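For readers who want to check the arithmetic Hector describes, here is a minimal Python sketch of a one-sample (goodness-of-fit) chi-square in which the census shares, scaled to the survey sample size, serve as expected frequencies. The counts are copied from the quoted post; the function and variable names are illustrative assumptions, not part of either message, and the 5% critical value of 7.815 for 3 degrees of freedom is the standard table value rather than something computed here.

```python
# Counts by household-size category, copied from the quoted post.
census_counts = [38857, 64551, 76809, 4891]
survey_counts = [328, 546, 627, 23]


def expected_from_census(census, n_sample):
    """Scale the census proportions to the survey sample size."""
    total = sum(census)
    return [n_sample * c / total for c in census]


def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


n_sample = sum(survey_counts)            # only the survey size enters the test
expected = expected_from_census(census_counts, n_sample)
chi2 = chi_square_gof(survey_counts, expected)
df = len(survey_counts) - 1              # categories minus one

# Standard table value for alpha = 0.05 with 3 degrees of freedom (assumed here).
CRITICAL_05_DF3 = 7.815

for cat, (o, e) in enumerate(zip(survey_counts, expected), start=1):
    print(f"size category {cat}: observed {o}, expected {e:.1f}")
print(f"chi-square = {chi2:.3f} on {df} df")
print(f"exceeds 5% critical value: {chi2 > CRITICAL_05_DF3}")
```

Run as written, the per-category lines show where the statistic comes from: most of it is contributed by the fourth category, where 23 observed households compare with about 40 expected, while the first category's difference (328 against roughly 320) adds comparatively little.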

