LISTSERV at the University of Georgia
Date:         Wed, 10 Jan 2007 23:22:53 -0800
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: Copy of dataset corrupted with OS tools
In-Reply-To:  <1168444588.948987.48440@o58g2000hsb.googlegroups.com>
Content-Type: text/plain; format=flowed

rolandberry@HOTMAIL.COM replied:
>
>xav wrote:
> > Hello Ed;
> >
> > If my return code <$?> of my cp command is equal to 0
> > I assume that the cp command is ok.
> >
> > $> cp toto.sasbdat titi.sas7bdat
> > $> echo $?
> >
> > If it's an "important" dataset I'm using "proc compare" too.
> >
> > Xavier
> >
> > Ed Notari wrote:
> > > Hello folks,
> > >
> > > We recently had a moderately large dataset that was corrupted
> > > during a copy. We run an Alpha system with Tru64 UNIX v5.1A.
> > > The dataset was just created on a RAID 5 volume and was copied
> > > ("cp" command) to another RAID 5 volume on the same SAN.
> > >
> > > The corruption was insidious (19 records out of 80,000,000) and
> > > clustered, as far as I can tell. The byte sizes of the files were
> > > identical, and a simple PROC CONTENTS doesn't show anything odd
> > > (no surprise). None of the logs (binary.errlog, etc.) showed any
> > > odd behavior during the time the file was transferred or
> > > afterward.
> > >
> > > The upshot of all of this is:
> > >
> > > 1) What are SAS folk using to assure that copying datasets has
> > >    worked?
> > > 2) Do I need to "touch" each record to verify the starting
> > >    dataset is the same as the ending dataset?
> > >
> > > Ed Notari
> > > Transmissible Diseases Department
> > > Jerome H. Holland Laboratory
> > > American Red Cross

>
>This is very worrying for people handling clinical data on a
>Unix/Linux/AIX platform and those who do so should TAKE CAREFUL NOTE
>of this problem. If it is not practical to do a "proc compare" for
>every dataset copied then perhaps, at the very least, the "sum"
>command could be used within the script doing the copying to ensure
>the checksums of the original and copied file are the same.
>
>If you are in this situation, it would be a very good idea to forward
>this email up your line management to make sure they are aware of
>this problem, so they can ensure a technical solution is put in place
>to prevent a bad copy ever going undetected.
>
>Roland

I'd go farther than that.

This should be very worrying for people handling *any* valuable data, on *any* platform. The bigger the platform, the more tools there are in place to address this kind of problem.
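
For the checksum idea Roland mentions above, here is a minimal sketch of a copy-then-verify wrapper. The function name verified_cp is hypothetical, and I'm assuming the POSIX "cksum" utility in place of "sum", since its output format is standardized. It is one way to do this, not a complete answer.

```shell
#!/bin/sh
# verified_cp SRC DEST: copy SRC to DEST, then confirm the copy matches.
# (verified_cp is a hypothetical helper name, not a standard command.)
verified_cp() {
    src=$1
    dest=$2

    # Stop here if cp itself reports a failure.
    cp "$src" "$dest" || return 1

    # cksum reads from stdin so its output holds only the CRC and the
    # byte count, not the file name (which differs between the two).
    src_sum=$(cksum < "$src") || return 1
    dest_sum=$(cksum < "$dest") || return 1

    if [ "$src_sum" != "$dest_sum" ]; then
        echo "checksum mismatch: $src vs $dest" >&2
        return 2
    fi
    return 0
}
```

Usage would look like "verified_cp toto.sas7bdat titi.sas7bdat || echo 'copy failed'". A matching checksum only says the two files read back the same at that moment. It says nothing about whether the source was good to begin with.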

Of course, for those people keeping their critical data in Excel spreadsheets, there are bigger things to worry about first.

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330


