LISTSERV at the University of Georgia
Date:         Sat, 1 Sep 2001 19:58:33 GMT
Reply-To:     Xlr82sas <xlr82sas@AOL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Xlr82sas <xlr82sas@AOL.COM>
Organization: AOL http://www.aol.com
Subject:      Re: Cooperative work on 2000 Census data

Hi Dick,

Thanks. I will order the SSU proceedings.

You might want to visit my site, members.aol.com/xlr82sas/utl.html. I posted some code there that defines column names and labels for the 39 segments used in the SF1 detailed data. The code builds SAS tables automatically from the metadata.
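For anyone who has not seen the metadata-driven approach, the idea might look roughly like the sketch below. This is Python with an in-memory SQLite database, not the SAS code on the site. The metadata rows, the labels, the `sf1_segNN` table names and the `logrecno` key column are all placeholders assumed for illustration, not the real SF1 layout.

```python
import sqlite3

# Hypothetical metadata rows: (segment, column name, column label).
# Placeholder values for illustration; not the real SF1 layout.
METADATA = [
    (1, "P001001", "Example label A"),
    (1, "P002001", "Example label B"),
    (2, "P005001", "Example label C"),
]


def build_tables(conn, metadata):
    """Create one table per segment; return {table: {column: label}}."""
    by_segment = {}
    for segment, name, label in metadata:
        by_segment.setdefault(segment, []).append((name, label))

    labels = {}
    for segment, cols in sorted(by_segment.items()):
        table = f"sf1_seg{segment:02d}"  # assumed naming scheme
        col_defs = ", ".join(f'"{name}" INTEGER' for name, _ in cols)
        # logrecno is an assumed key column linking rows across segments.
        conn.execute(f'CREATE TABLE "{table}" (logrecno INTEGER, {col_defs})')
        # SQLite has no column labels, so keep them in a side dictionary.
        labels[table] = dict(cols)
    return labels


conn = sqlite3.connect(":memory:")
labels = build_tables(conn, METADATA)
```

The point is only that the column lists live in data rather than in hand-written table definitions, so adding a segment means adding metadata rows, not code.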

I use the data for customer relationship modelling (CRM). We help companies understand their customers. We also build predictive models for customer behavior.

I have added ZIP+4 and Census 1990 geocodes to the ethnic redistricting Census 2000 data. I have used these counts in several customer behavior models. Comparing the two sets of models, Census 2000 provided better predictors than the 1990 Census data, significantly reducing the sum of squares error.
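To make the comparison concrete, a sum-of-squares-error check between two models could be sketched as follows. This is a Python illustration only: the observed values and both sets of predictions are made up, and the names `pred_1990` and `pred_2000` are hypothetical.

```python
def sse(actual, predicted):
    """Sum of squared errors between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Made-up numbers for illustration only; not results from any real model.
actual = [10.0, 12.0, 9.0, 15.0]
pred_1990 = [8.0, 14.0, 11.0, 12.0]   # hypothetical model using 1990 predictors
pred_2000 = [9.0, 12.5, 9.5, 14.0]    # hypothetical model using 2000 predictors

sse_1990 = sse(actual, pred_1990)
sse_2000 = sse(actual, pred_2000)

# Fractional reduction in SSE when moving from the 1990 to the 2000 model.
reduction = 1 - sse_2000 / sse_1990
```

Both models should be scored on the same held-out records for the comparison to mean anything.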

I am combining all states into one SAS table for each file type; the QC is very time consuming. (This will reduce the 2,080 (40 x 52 states) Census 2000 zip files to 40 SAS tables, 50Gb total with compress=binary??)

I hope to end up with a simple star schema of about 6 tables. I expect to drop about 2/3 of the columns and half the rows?? For instance, three-race counts such as Asian, African American and Alaskan Native could be dropped, since the coverage is very low.

==================================================================

This may be of help to all who are compiling the detailed data. Any independent confirmation is welcome.

Segment One (52 States/Puerto Rico/DC) of the SF1 data has 9,541,315 records. This count should match the Geographic Headers and about 8 other segments. All other segments appear to contain 724,015 records.

For segment one, I have record counts by state in a SAS table. I also have column maxes, mins and sums, by state and overall, in a SAS table.
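For anyone wanting to confirm these independently, the QC summaries described above (record counts by state, plus column min, max and sum by state and overall) could be sketched as below. This is Python rather than SAS, the three records and the column names are made up, and `"ALL"` is just an assumed key for the overall row.

```python
from collections import Counter, defaultdict

# Made-up records for illustration: (state, {column: value}).
records = [
    ("GA", {"P001001": 120, "P002001": 80}),
    ("GA", {"P001001": 45, "P002001": 45}),
    ("NY", {"P001001": 300, "P002001": 10}),
]


def summarize(records):
    """Return (record counts by state, stats[group][column] = [min, max, sum])."""
    counts = Counter()
    stats = defaultdict(dict)
    for state, row in records:
        counts[state] += 1
        # Update both the per-state group and the overall group.
        for group in (state, "ALL"):
            for col, value in row.items():
                s = stats[group].setdefault(col, [value, value, 0])
                s[0] = min(s[0], value)
                s[1] = max(s[1], value)
                s[2] += value
    return counts, stats


counts, stats = summarize(records)

# The total record count is what would be compared against the
# record count of the geographic header file.
total_records = sum(counts.values())
```

Two people running the same summaries on their own copies can compare the small output tables instead of the raw files.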

The data appears quite clean, so far.

Roger

Roger J DeAngelis
CompuCraft Inc
XLR82SAS@aol.com ( Accelerate to SAS )
http://members.aol.com/xlr82sas/utl.html

