Date: Wed, 10 Jan 2007 17:42:41 -0500
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Copy of dataset corrupted with OS tools
In-Reply-To: <82C4208201BE3645B6840015C9DD2CBAB45E87@bhqroc1ex2.archq.ri.redcross.net>
Content-Type: text/plain; charset="us-ascii"
Ed:
I suspect a virus .... Possibly HTLV-II :>
SAS usually won't open standard SAS datasets unless the contents match
the header. Transport or xpt datasets, on the other hand, have weak
internal consistency checks.
In an earlier era, SAS experts advised against using OS shell commands
to copy datasets. I have not heard any cautions along those lines
recently. The shell command diff should report any differences quickly
and efficiently. You don't want to do that routinely, though, and I'd
focus on the disk operating system and the disk array. I'd be surprised
if file corruption during copying would have anything to do with SAS
datasets alone, other than it might be easier to notice differences in
datasets. After all, bits is bits.
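
For anyone who wants to check a copy without relying on SAS at all, here is a minimal sketch in Python, not something from Ed's setup. It assumes both files are readable from the same host. The function names file_digest and first_difference are made up for illustration. The first computes a checksum of each file in chunks. The second reports the byte offset of the first mismatch, which may help show whether damage is clustered.

```python
import hashlib


def file_digest(path, algo="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact byte within the mismatching chunk
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is shorter
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no mismatch
                return None
            offset += len(a)
```

With hypothetical paths, file_digest("/vol1/big.sas7bdat") == file_digest("/vol2/big.sas7bdat") would confirm a byte-for-byte match. Note that if the storage layer returns bad data only intermittently, a single pass can miss it, so a matching checksum verifies the read that was made, not the disk array itself.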
Sig
-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
On Behalf Of NotariE@usa.redcross.org
Sent: Wednesday, January 10, 2007 8:28 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Copy of dataset corrupted with OS tools
Hello folks,
We recently had a moderately large dataset that corrupted during a copy.
We run an Alpha system with Tru64 UNIX v5.1A. The dataset was just
created on a RAID 5 volume and was copied ("cp" command) to another RAID
5 volume on the same SAN.
The corruption was insidious (19 records out of 80,000,000) and
clustered, as far as I can tell. The byte sizes of the files were
identical, and a simple PROC CONTENTS doesn't show anything odd (no
surprise). None of the logs (binary.errlog, etc) showed any odd
behavior during the time the file was transferred or afterward.
The upshot of all of this is:
1) What are SAS folk using to assure that copying datasets has worked?
2) Do I need to "touch" each record to verify the starting dataset is
the same as the ending dataset?
Ed Notari
Transmissible Diseases Department
Jerome H. Holland Laboratory
American Red Cross