Date: Fri, 13 Mar 2009 16:36:09 -0500
Reply-To: Tim Kynerd <tkynerd@ECD.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Tim Kynerd <tkynerd@ECD.ORG>
Subject: Re: SUM statement paradox (bug)?
Content-Type: text/plain; charset="us-ascii"
To echo the other folks, here's the behavior I'm seeing in SAS 9.1.3:
When I run your code, I get what you get: var=. var=7 .
When I comment out the SET statement in the second DATA step, I get: var=0 . As I would expect, the second DATA step only executes once because it has no input.
I knew about the implied RETAIN with the Sum statement, but I didn't know this:
Tip: The variable is automatically set to 0 before SAS reads the first observation. The variable's value is retained from one iteration to the next, as if it had appeared in a RETAIN statement.
Tip: To initialize a sum variable to a value other than 0, include it in a RETAIN statement with an initial value.
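As a minimal sketch of that second tip (the variable name COUNT is hypothetical and not from the original post), the step below has no input data set, so it executes once. The RETAIN statement should make the PUT statement write count=100 to the log rather than the default count=0:

```sas
/* Sketch only: COUNT is a made-up name used for illustration. */
data _null_ ;
  retain count 100 ;  /* initial value other than the default 0 */
  put count = ;       /* executes before the sum statement below */
  count + 1 ;         /* sum statement: adds 1 and retains COUNT */
run ;
```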
As to why VAR is initially missing on the first iteration of the second step:
It is redundant to name any of these items in a RETAIN statement, because their values are automatically retained from one iteration of the DATA step to the next:
  - variables that are read with a SET, MERGE, MODIFY or UPDATE statement
  - a variable whose value is assigned in a sum statement
  - the automatic variables _N_, _ERROR_, _I_, _CMD_, and _MSG_
  - variables that are created by the END= or IN= option in the SET, MERGE, MODIFY, or UPDATE statement or by options that create variables in the FILE and INFILE statements
  - data elements that are specified in a temporary array
  - array elements that are initialized in the ARRAY statement
  - elements of an array that have assigned initial values to any or all of the elements on the ARRAY statement.
You can, however, use a RETAIN statement to assign an initial value to any of the previous items, with the exception of _N_ and _ERROR_.
The fact that SAS emphasizes the option of using a RETAIN statement to assign an initial value makes it seem natural to me, if not strictly *logical*, that variables in these cases are initialized to missing, as they apparently are.
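One possible way to sidestep the behavior, sketched here from Paul's own observation (quoted below) that dropping VAR from HUH restores the documented initialization. It uses the DROP= data set option on the SET statement. The obvious cost is that the value of VAR stored in HUH is no longer read, so this only fits when the counter and the data set variable merely happen to share a name:

```sas
/* Same test data set as in the original post. */
data huh ;
  retain var 7 ;
run ;

data _null_ ;
  put var = @ ;
  var + 1 ;
  set huh (drop = var) ;  /* VAR is not read from HUH, so it is not in the SET variable list */
run ;
```

Per Paul's report, the first PUT should then show var=0. Renaming the counter to a name that does not occur in HUH would be another way to get the same effect while still reading VAR from the data set.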
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Paul Dorfman
Sent: Friday, March 13, 2009 1:16 PM
Subject: SUM statement paradox (bug)?
For those inclined to "ponder over many a quaint and curious volume of
forgotten lore", consider:
data huh ;
  retain var 7 ;
run ;

data _null_ ;
  put var = @ ;
  var + 1 ;
  set huh ;
run ;
Paradox #1. I expect the second step to print var=0 var=7. It prints var=.
var=7 instead, as if the SUM statement were not present.
Paradox #2. If VAR is not in HUH, or is dropped from it, or SET is simply
commented out, the step happily prints the expected var=0 var=7.
It appears that the mere act of the compiler finding VAR in the descriptor
of a data set named in a syntactically valid context changes the implicit
sum-statement compiler action into RETAIN VAR . instead of the expected
RETAIN VAR 0 (or, perhaps more likely, just abstains from setting VAR to
zero and leaves it missing).
I discovered it the hard way - a production load failed having detected a
missing value for a NOT NULL column - and stared at the step in giddy
incredulity for quite a while before I found that the reason was the SUM
statement behaving not in the way it is documented.
For nowhere in the SAS documentation is this peculiar behavior mentioned. I
only find the unequivocal assertions "the variable is automatically set to
0 before SAS reads the first observation" and "The sum statement is
equivalent to using the SUM function and the RETAIN statement, as shown
here: retain variable 0; variable=sum(variable,expression);", for some
reason listed under the "Tip" rubrics, as if the fundamental property of
the SUM statement seared in the mind of any semicolon-worthy SAS programmer
is but a useful side effect.
Since both directly contradict the experimental results capable of biting
the unwary, I hereby declare it a bug.
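For comparison, the documented equivalence quoted above can be written out explicitly against the same test data. This is only a sketch of the form the documentation describes. It has not been run, and whether the explicit RETAIN with an initial value avoids the missing value in a given SAS release is an assumption to verify:

```sas
/* Same test data set as above. */
data huh ;
  retain var 7 ;
run ;

data _null_ ;
  retain var 0 ;          /* explicit form of the documented implicit initialization */
  put var = @ ;
  var = sum(var, 1) ;     /* documented equivalent of: var + 1 ; */
  set huh ;
run ;
```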