Date: Wed, 21 May 2003 12:12:55 -0400
Reply-To: Quentin McMullen <Quentin_McMullen@BROWN.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Quentin McMullen <Quentin_McMullen@BROWN.EDU>
Subject: Re: Statement placement
In-Reply-To: <20030521150255.10029.qmail@web20203.mail.yahoo.com>
David Kellerman wrote:
>
> I was just going through the archives looking for an empty
> dataset check and found this code:
>
> DATA _NULL_;
> IF HOWMANY = 0 THEN DO;
> your PUT statements here
> END;
> SET A3 NOBS=HOWMANY;
> RUN;
>
> What I am curious about is the placement of the set statement.
> It doesn't work if the SET is before the IF. Can anyone tell me why?
>
> Confused in NJ on a dull cloudy day
>
Hi David,
I think the key concept here is when a data step stops. Remember that
the data step is an implicit loop, so it needs a way to know when to stop.
One of the ways a data step stops is when a SET statement tries to read
past the end of its input (conceptually, if not in strict technical terms:
SET asks for the next record and there is none). In the odd case of a
dataset with 0 obs, that happens the first time the SET statement executes,
so the data step stops right there and nothing after the SET runs.
The other piece is that the NOBS= variable is assigned when the step is
compiled, before any statement executes, so HowMany already holds 0 on any
line above the SET.
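To make the implicit loop concrete with a non-empty dataset first, here is
a minimal sketch. The dataset name TWO and the variable x are made up for
illustration and are not from David's code.

```sas
/* Hypothetical two-row dataset, invented for this illustration */
data two;
  do x = 1 to 2;
    output;
  end;
run;

data _null_;
  put "Top of iteration " _n_;   /* runs at the start of every pass */
  set two;                       /* the step stops here once no record is left */
  put "After SET: " x=;          /* runs only when SET returned a record */
run;
```

The log should show "Top of iteration" three times but "After SET" only
twice: the third pass starts, reaches the SET, finds no record, and the
step ends there. With 0 obs the same thing happens on the very first pass.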
Here is an example showing when the step stops:
85   data a;
86     delete;
87   run;
NOTE: The data set WORK.A has 0 observations and 0 variables.
NOTE: DATA statement used:
      real time           0.05 seconds
      cpu time            0.05 seconds
88
89   data _null_;
90     put "Before set: I executed";
91     if HowMany = 0 then do;
92       put "NOTE: No records.";
93     end;
94     put "Before set: I executed also";
95     set a nObs=HowMany;
96     put "After set: I did not execute";
97   run;
Before set: I executed
NOTE: No records.
Before set: I executed also
NOTE: There were 0 observations read from the data set WORK.A.
NOTE: DATA statement used:
      real time           0.02 seconds
      cpu time            0.02 seconds
Note that the final put statement did NOT execute, because the data step
stopped when the set statement executed.
And in the next example I moved the SET statement before the IF. You can
again see that the data step stops as soon as the SET statement executes,
so the IF statement is never evaluated.
98
99   data _null_;
100    put "I executed";
101    set a nObs=HowMany;
102    put "I did not execute";
103    if HowMany = 0 then do;
104      put "NOTE: No records.";
105    end;
106  run;
I executed
NOTE: There were 0 observations read from the data set WORK.A.
NOTE: DATA statement used:
      real time           0.00 seconds
      cpu time            0.00 seconds
As others have mentioned, it's not necessary to read the entire dataset in
order to use the nObs option. But I think the core issue here is
understanding the implied data step loop.
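For completeness, one common way to use NOBS= without reading any records
is to put the SET behind a condition that is never true. This is only a
sketch of that idiom against the empty WORK.A created above, not
necessarily what the other posters had in mind.

```sas
data _null_;
  /* Never executes, but NOBS= is still assigned at compile time */
  if 0 then set a nobs=HowMany;
  if HowMany = 0 then do;
    put "NOTE: No records.";
  end;
  /* The SET never runs, so it can never end the step; stop explicitly */
  stop;
run;
```

Since the SET never executes, the order of the IF and the SET no longer
matters here; the explicit STOP is what ends the step.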
Kind Regards,
--Quentin