In the past I have found it difficult to imagine why someone would want to
match-merge or interleave on the physical order of rows in a SAS dataset;
that is, I would have voted for the 'ERROR' option. After trying recently to
find a way to put a unique key in a SCORE dataset produced by a SAS
statistical procedure, I finally gave up and agreed to use a Data step MERGE
(Horrors!) with no BY group (INCONCEIVABLE!). SAS statistical procedures in
at least some instances generate row-wise computed values in exactly the same
order as the rows in the source dataset. Matching on physical order appears to
be the simplest way to link the computed values back to the source dataset.
Statisticians, you would think, would not want to paint themselves into a
corner with the ERROR option.
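A minimal sketch of the kind of step I mean. The dataset names SOURCE, SCORES,
and SCORES_KEYED and the key variable ID are hypothetical, and it assumes the
procedure wrote SCORES in the same row order as SOURCE:

```sas
/* Hypothetical names: SOURCE holds the input rows with key ID;          */
/* SCORES is the procedure output, assumed to be in the same row order.  */
options mergenoby=warn;   /* with MERGENOBY=ERROR this step would stop   */

data scores_keyed;
   merge source(keep=id) scores;   /* no BY statement: rows are paired by position */
run;
```

A merge with no BY statement does not complain when the two datasets have
different numbers of rows, so it is worth comparing the observation counts
before trusting the result.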
I must confess that on rare occasions I do find a better tool for a specific
purpose in the Data step MERGE toolbox than in the SAS SQL toolbox. In a
Model A world a crank works better than an ignition key.
From: SAS User [mailto:sas@SDAC.HARVARD.EDU]
Sent: Thursday, March 18, 2004 12:01 PM
Subject: mergenoby: opinions/arguments?
I have been horribly outvoted here regarding our system default for the new
mergenoby option. I have stood firm on the philosophy that you do not alter
users' defaults for them, despite how beneficial it might be. You can
strongly request that they alter their own defaults, sure. I even suggested
that we could poll everyone for their preference, and set up a script to
invoke SAS with one of the three options for this option, depending on the
user's ID and their response to the poll. This change had been discussed at
several meetings, where I eventually gave in and accepted that we will
change the default to warning. Unfortunately, I was away for the last
meeting where the big-shots here decided that they want the system-wide
option to be set to ERROR.
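For reference, a sketch of the three settings under discussion, as I understand
the documented behavior. Only one of these would be in effect at a time:

```sas
options mergenoby=nowarn;  /* no message when a MERGE has no BY statement       */
options mergenoby=warn;    /* warning written to the log; the step still runs   */
options mergenoby=error;   /* error written to the log; the step does not run   */
```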
I absolutely refused to do this, as it will halt programs that are
correct (in very rare instances here, yes). I don't see the benefit
of ERROR over WARNING for this option. I haven't yet heard their
reasons for this, but I'm assuming that it involves the exit status (UNIX
system here; most statisticians run processes in the background to get an
exit status) or the fact that bad programmers might just search a log file for
the string ERROR. But you shouldn't alter a default so that it stops a program
over syntax that might be appropriate, just because of bad programming.
I'm emailing the list because I have been so outvoted on these issues (as
the only programmer against all statisticians), so I'm hoping that folks out
there might have stronger arguments than I have gathered. I need to have
some mighty ammo for the next meeting. Example: We discussed setting the
compress default to yes, until we learned that our old version of DBMS/COPY
cannot read compressed SAS datasets. Any arguments regarding mergenoby
would be greatly appreciated.