Date: Fri, 16 Nov 2001 11:38:52 -0500
Reply-To: Perry Watts <wattsp@DCA.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Perry Watts <wattsp@DCA.NET>
Subject: A RETAIN Oddity
Content-Type: text/plain; charset="iso-8859-1"
A follow-up further below to:
--------------------------------------------------------------------------
Date: Thu, 15 Nov 2001 18:08:11 -0500
From: Ian Whitlock <WHITLOI1@WESTAT.COM>
Subject: Re: A RETAIN Oddity
Bill,
There is a subtle difference.
put _all_ ;
is a command to write the PDV as it is defined when the DATA step has
finished compiling. In this case _ALL_ is not a list, but a special key
word to modify the command.
put (_all_) (=) ;
is different. Here _ALL_ is a list referring to all the variables in the
PDV at the moment this line is compiled. Note that _N_ and _ERROR_ are
missing because they are not added to the PDV until the end of compilation.
Part of what I find so disturbing is the fact that X1-X5 are neither in, nor
not in, the PDV. The second line is something like a quantum state
calculation that forces a particular state while, without the calculation,
another state is achieved. It is sort of reminiscent of Schroedinger's cat,
which is neither dead nor alive until the box is opened.
Ian
----------------------------------------------------------------------------
David Ward may be on to something here when he suggests that
put statements can create variables.
I have only seen the parentheses associated with formatting such as
put(_all_) (5.);
Or combined as:
put(_all_) (= 5.);
Could it be that the format, which is processed at compile time (as Bob
Virgile pointed out), is what creates the retained but otherwise absent
variables? Does the (=) alone carry an implicit format?
I have a lot of trouble distinguishing between compile time and run time in
SAS, probably because of my background in a 3-GL. But here is a take-off on
the Retain Oddity program submitted by Ian, followed by its log output:
data qq ;
input x5;
cards;
1
2
3
;
run ;
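data ww ;
retain x1-x5 ;       /* x1-x4 appear nowhere else in this step */
set qq ;             /* supplies x5 */
n=_n_;               /* copies of the automatic variables */
err=_error_;
put (_all_) (=);     /* _ALL_ used as a variable list */
vv=x5;
put _all_;           /* _ALL_ used as a keyword */
tt=vv;               /* assigned after both PUT statements */
run ;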
NOTE: Variable x1 is uninitialized.
NOTE: Variable x2 is uninitialized.
NOTE: Variable x3 is uninitialized.
NOTE: Variable x4 is uninitialized.
x1=. x2=. x3=. x4=. x5=1 n=1 err=0
x1=. x2=. x3=. x4=. x5=1 n=1 err=0 vv=1 tt=. _ERROR_=0 _N_=1
x1=. x2=. x3=. x4=. x5=2 n=2 err=0
x1=. x2=. x3=. x4=. x5=2 n=2 err=0 vv=2 tt=. _ERROR_=0 _N_=2
x1=. x2=. x3=. x4=. x5=3 n=3 err=0
x1=. x2=. x3=. x4=. x5=3 n=3 err=0 vv=3 tt=. _ERROR_=0 _N_=3
NOTE: There were 3 observations read from the data set WORK.QQ.
NOTE: The data set WORK.WW has 3 observations and 9 variables.
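For comparison, one way to see whether the PUT with the parenthesized list is what brings x1-x4 into the data set would be to rerun the step without that line and list the variables of both results. This is only a sketch: WW2 is a made-up data set name, and I am not predicting what PROC CONTENTS will report.

```sas
/* Same step as WW, minus the PUT (_ALL_) (=) line.       */
/* WW2 is a hypothetical name chosen for this comparison. */
data ww2 ;
   retain x1-x5 ;
   set qq ;
   n=_n_;
   err=_error_;
   vv=x5;
   put _all_;
   tt=vv;
run ;

/* List the variables actually written to each data set. */
proc contents data=ww  short ; run ;
proc contents data=ww2 short ; run ;
```

If the two variable lists differ only in x1-x4, that would point at the list form of the PUT rather than at RETAIN itself.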
-----
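Is this how the above output should be interpreted?
Only variables x1, x2, x3, x4, x5, n, and err exist in the PDV when
put (_all_) (=);
is compiled, so those seven are the only ones it writes.
All 9 variables, plus _ERROR_ and _N_, exist in the PDV when
put _all_;
is executed. In addition, 'tt' is still missing at that point: it is not
assigned until after the PUT and, not being retained, it is reset to
missing at the start of each iteration.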
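A smaller step might make the same point without RETAIN getting in the way. This is a sketch under the interpretation above, not something taken from Ian's program; the variable names a, b, and c are made up for the illustration.

```sas
data _null_ ;
   a = 1 ;
   put (_all_) (=) ;   /* list form: resolved when this line is compiled */
   b = a + 1 ;
   put _all_ ;         /* keyword form: the PDV as it stands after compilation */
   c = b ;             /* assigned only after both PUT statements */
run ;
```

If the interpretation holds, the first PUT should list only a, while the second should list a, b, and c (with c still missing) followed by _ERROR_ and _N_.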
Thanks for a very interesting puzzle.
Perry
--------------------------
Perry Watts
wattsp@dca.net
--------------------------