Date: Wed, 21 Mar 2007 10:34:55 +1100
Reply-To: "Johnson, David" <David.Johnson@CBA.COM.AU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Johnson, David" <David.Johnson@CBA.COM.AU>
Subject: Unexpected behaviour: indexes recreated before a table is deleted.
Content-Type: text/plain; charset="us-ascii"
For reasons I don't want to explore in this forum, a large SAS table
(14.3Gb) was copied from one physical location to another using Windows
explorer.
The table was associated with an index that produced another file that
was 1.5Gb in size.
It appears the index file was not copied with the table.
During an overnight batch, a library cleanup utility attempted to delete
the table with the following (slightly edited) message.
NOTE: Deleting DATATEST.ABCDEFGHIJK (memtype=DATA).
ERROR: Delete lock is not available for
DATATEST.ABCDEFGHIJK.DATA, lock held by another
process.
Overall, the deletion step removed 166 SAS tables covering 36.5Gb and
took these resources:
NOTE: Statements not processed because of errors noted above.
NOTE: The SAS System stopped processing this step because of
errors.
NOTE: SAS set option OBS=0 and will continue to check
statements.
This may cause NOTE: No observations in data set.
NOTE: PROCEDURE DATASETS used (Total process time):
real time 15.57 seconds
cpu time 0.26 seconds
The first NOTEs indicate the problem.
It is clear no-one else had a lock on the table, yet something had
caused the deletion to fail. On a hunch, code was written as shown in
the following log. Two empty libraries were assigned, a large table
written into one with an index, and the table part copied to the other
library. Then a table deletion was performed.
63 Libname FOLDER1 "C:\Temp\Folder1";
NOTE: Libref FOLDER1 was successfully assigned as follows:
Engine: V9
Physical Name: C:\Temp\Folder1
64
65 Libname FOLDER2 "C:\Temp\Folder2";
NOTE: Libref FOLDER2 was successfully assigned as follows:
Engine: V9
Physical Name: C:\Temp\Folder2
66
67 /* Create some dummy data */
68 Data FOLDER1.TRANSACT( Index = ( CLIENT) );
69 Do CLIENT = 1 To 1000000 By 1;
70 Do TRANSNUM = 1 To 100 By 1;
71 TRANSVALUE = Round( RanUni( TRANSNUM) * 1000, 0.01);
72 Output;
73 If TRANSVALUE >= 990 Then TRANSNUM = 100;
74 End;
75 End;
76 Run;
NOTE: The data set FOLDER1.TRANSACT has 63404257 observations and 3
variables.
NOTE: Simple index CLIENT has been defined.
NOTE: DATA statement used (Total process time):
real time 1:30.57
user cpu time 49.25 seconds
system cpu time 22.24 seconds
Memory 68600k
77
78
79 /* Copy the data as a physical file to another folder. */
80 FileName SYSRQST Pipe "Copy C:\Temp\Folder1\Transact.Sas7bdat
C:\Temp\Folder2\Transact.Sas7bdat";
81
82 Data _NULL_;
83 InFile SYSRQST;
84 Input;
85 Put _InFile_;
86 Run;
NOTE: The infile SYSRQST is:
Unnamed Pipe Access Device,
PROCESS=Copy C:\Temp\Folder1\Transact.Sas7bdat
C:\Temp\Folder2\Transact.Sas7bdat,
RECFM=V,LRECL=256
1 file(s) copied.
NOTE: 1 record was read from the infile SYSRQST.
The minimum record length was 25.
The maximum record length was 25.
NOTE: DATA statement used (Total process time):
real time 41.67 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
Memory 144k
87
88
89 Proc DataSets Lib = FOLDER2 NoList;
90 Delete TRANSACT;
91 Quit;
NOTE: Indexes recreated:
1 Simple indexes
NOTE: Deleting FOLDER2.TRANSACT (memtype=DATA).
NOTE: PROCEDURE DATASETS used (Total process time):
real time 2:12.75
user cpu time 37.18 seconds
system cpu time 15.72 seconds
Memory 68599k
The demonstration data set is 1.5Gb in size, so it is actually around
10% of the file size that the batch was trying to delete last night.
As the log shows, the DataSets procedure recreates the index before
deleting the table. This recreation takes more than 2 minutes, so we
might expect the recreation of the index on last nights table would have
taken a while longer. Yet the whole procedure ran within 15 seconds.
This suggests that the DataSets procedure has some internal cleverness
that verifies a table before performing any actions upon it. When the
procedure was first introduced around version 6.07, a table would simply
be reported as damaged, and the SAS user would call the DataSets
procedure with a "repair" option.
Unfortunately this has two likely outcomes:
1. If space is short, we may see a failure of the index rebuild before
the deletion occurs. This would seem anomalous if we are deleting
tables to recover space, but have to use space before we recover it.
That reversed logic sounds like a little like an implementation of the
419 scam, where to get your ill-deserved millions of someone else's
alleged money, you have to pay substantial fees.
2. Where some set of circumstances that probably include running in
batch arise, the deletion may fail because the DataSets procedure locks
up a table on which it wants a lock. (One can see the message is quite
specific "lock held by another process"; not "lock held by another
user".) One usually only sees that anomalous behaviour when one opens a
table in ViewTable and then attempts to process it from a program editor
window in the same SAS session. I suspect that size may be an influence
as well, since it will influence the time it takes to rebuild the index,
and to perform the deletion of the files.
There are work-arounds, one of which is that the file should not have
been copied as a file, but instead recreated from one library to another
with the Copy or DataSets procedure. Indexes would have been recreated
and the problem wouldn't arise.
Still, it is another caveat emptor for data set indexes.
I couldn't find a relevant note on the support web site.
Kind regards
David
************** IMPORTANT MESSAGE *****************************
This e-mail message is intended only for the addressee(s) and contains information which may be
confidential.
If you are not the intended recipient please advise the sender by return email, do not use or
disclose the contents, and delete the message and any attachments from your system. Unless
specifically indicated, this email does not constitute formal advice or commitment by the sender
or the Commonwealth Bank of Australia (ABN 48 123 123 124) or its subsidiaries.
We can be contacted through our web site: commbank.com.au.
If you no longer wish to receive commercial electronic messages from us, please reply to this
e-mail by typing Unsubscribe in the subject line.
**************************************************************