> From: David N. Beede [mailto:David.Beede@MAIL.DOC.GOV]
> I face the ugly possibility of having to process what I consider to be a lot of
> hard copy data (about a one-foot stack of paper). Almost all of it is typed
> (i.e., not hand-written) but in different tabular formats. As I see it, I can
> either scan in the data and write a SAS program to get the data into shape; I
> can write a program (SAS? Excel?) to create an on-screen form to facilitate
> keypunching the data; or I can contract out the keypunching.
With many different tabular formats: go keypunch.
> I assume others have or will face this kind of problem. My questions are:
> -- is scanning more trouble than it is worth?
Volume is the determining factor.
Just like with a press run: once you do the set-up -- the expensive part --
extra copies are just the cost of running the press.
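A back-of-envelope sketch of that trade-off, in Python rather than SAS purely for illustration. The function name and every cost figure are made-up placeholders, not real prices; plug in your own quotes.

```python
import math

def break_even_pages(setup_cost, scan_cost_per_page, keypunch_cost_per_page):
    """Smallest page count at which scanning costs no more than keypunching.

    Returns None when scanning never catches up, i.e. when its per-page
    cost is not lower than the keypunching per-page cost.
    """
    saving_per_page = keypunch_cost_per_page - scan_cost_per_page
    if saving_per_page <= 0:
        return None
    return math.ceil(setup_cost / saving_per_page)

# Hypothetical numbers, for illustration only.
pages = break_even_pages(setup_cost=2000.0,
                         scan_cost_per_page=0.5,
                         keypunch_cost_per_page=2.5)
print(pages)
```

If the stack is well below the break-even count, the set-up never pays for itself.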
Writing a scanning program is _not_ an ad hoc process.
Most scanning packages will provide both a method of designing the data
collection form and validating the data.
Register is key when scanning a document, just getting the four corners of
the page lined up so we know where they are before we start the work: we
can't have them suckers running in thru the scanner cock-eyed, y'know. ;-)
> -- does scanning require additional software other than what comes with the scanner?
Surprise: you get either bit-maps or text!
Be prepared to spend a whole bunch of time
-- remember the 80/20 rule --
reviewing the data coming off the paper.
My blue-sky approach to this has always been to suggest that folks scan
twice:
once at high-resolution, meaning slow reading time, high accuracy
again at low-resolution, meaning fast reading time, low accuracy
then use proc COMPARE to find differences.
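A minimal sketch of the scan-twice idea, in Python rather than proc COMPARE, and just as untested in practice as the idea itself. It assumes each scan has already been turned into rows of text cells; `compare_scans` and the sample rows are invented for illustration.

```python
def compare_scans(high_res, low_res):
    """Return (row, column, high_value, low_value) for every cell that differs.

    Both inputs are lists of rows; each row is a list of strings.
    A cell present in only one scan is reported with None for the other.
    """
    differences = []
    n_rows = max(len(high_res), len(low_res))
    for r in range(n_rows):
        row_a = high_res[r] if r < len(high_res) else []
        row_b = low_res[r] if r < len(low_res) else []
        n_cols = max(len(row_a), len(row_b))
        for c in range(n_cols):
            a = row_a[c] if c < len(row_a) else None
            b = row_b[c] if c < len(row_b) else None
            if a != b:
                differences.append((r, c, a, b))
    return differences

# Invented sample data: the low-resolution pass confuses 0 and O.
scan_hi = [["1001", "Atlanta", "42.5"], ["1002", "Boston", "17.0"]]
scan_lo = [["1001", "Atlanta", "42.5"], ["1002", "Bost0n", "17.O"]]
for diff in compare_scans(scan_hi, scan_lo):
    print(diff)
```

Only the cells that disagree need a human to look at the paper.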
My visionary approach hasn't been implemented, because the data transfer gets
contracted out.
<sigh> well, under-appreciation happens! LOL
> -- are there any SAS-related issues in processing scanned data?
You better have your data dictionary written,
and not chiseled in stone,
before you begin this project.
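One way to picture a data dictionary that is written but not chiseled in stone, sketched in Python rather than SAS. The variable names, types, and ranges below are hypothetical; the point is that the rules live in one table you can edit as the paper surprises you.

```python
# Hypothetical data dictionary: one entry per variable expected off the page.
DATA_DICTIONARY = {
    "id": {"type": int, "min": 1, "max": 99999},
    "city": {"type": str},
    "amount": {"type": float, "min": 0.0},
}

def validate_record(record, dictionary):
    """Return a list of problems found in one scanned record (dict of strings)."""
    problems = []
    for name, rule in dictionary.items():
        raw = record.get(name)
        if raw is None or raw == "":
            problems.append(f"{name}: missing")
            continue
        try:
            value = rule["type"](raw)  # convert the scanned text to its declared type
        except ValueError:
            problems.append(f"{name}: {raw!r} is not {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: {value} below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{name}: {value} above maximum {rule['max']}")
    return problems

# Invented record with a typical misread: letter O in place of zero.
print(validate_record({"id": "1002", "city": "Boston", "amount": "17.O"},
                      DATA_DICTIONARY))
```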
> -- are there any published references on how to create forms for keypunching data?
http://www.google.com scanning
Ron Fehd the macro maven CDC Atlanta GA USA RJF2@cdc.gov
---> cheerful provider of UNTESTED opinions !*! <---