Date: Tue, 8 Oct 1996 11:47:59 -0700
Reply-To: Chris B Long <Chris_B_Long@SANDWICH.PFIZER.COM>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Chris B Long <Chris_B_Long@SANDWICH.PFIZER.COM>
Organization: Pfizer Central Research, Sandwich, Kent, UK
Subject: 'Proper' use of IF-THEN-ELSE (was IF-THEN-ELSEs in a DATA step)
> > What is the maximum number of allowed IF-THEN conditions allowed in
> > a data step? One of my colleagues ran into this recently. He had
> > about 100 IF-THEN-ELSEs in a single data step, and found that the
> > rules he had defined for the conditions at the end of the list were
> > not being implemented. Thanks in advance for responses to this
> > question.
> > Devi Katikineni
This raises a question of programming style, as well as the problems with
nested IF statements. In much of the user-written code that I see, there's
actually no need to use nested IF statements at all. If the conditions of
your various IF statements are mutually exclusive, why not just code them as
sequential IF statements? For example, I often see things like:
if name='Fred' then do;
...
end; else if name='Mary' then do;
...
end; else if name='Ann' then do;
...
end;
Provided none of the DO groups changes the value of name, this is logically
equivalent to sequential IF statements:
if name='Fred' then do;
...
end;
if name='Mary' then do;
...
end;
if name='Ann' then do;
...
end;
but the latter doesn't involve any problems with potential stack overflows
etc. Some would claim that the first example is better because it will run
faster (for observations where name='Fred', the other conditions never have
to be evaluated), but to use this style to write nested IF statements for
hundreds or thousands of mutually exclusive conditions is surely asking for
trouble. I prefer the second style on the grounds that it's both clearer to
read and closer to the meaning of the code. An even better solution is to
use the SELECT statement, which is purpose-made for this sort of job.
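
As a rough sketch of what I mean by SELECT (the data set names PEOPLE and
TAGGED, the variable GROUP and the sample values are made up purely for
illustration), the name example could look something like this:

```sas
/* Hypothetical sample data, for illustration only */
data people;
  input name $ age;
  datalines;
Fred 25
Mary 31
Ann 19
Bob 42
;
run;

data tagged;
  set people;
  length group $ 8;
  /* NAME is compared with each WHEN value in turn;  */
  /* only the first matching WHEN is executed.       */
  select (name);
    when ('Fred') group = 'first';
    when ('Mary') group = 'second';
    when ('Ann')  group = 'third';
    /* Catches every other value of NAME, e.g. Bob   */
    otherwise     group = 'other';
  end;
run;
```

Note that if no WHEN condition is met and there is no OTHERWISE statement,
SAS stops the step with an error, so it's worth coding an OTHERWISE (even
an empty one) unless you really do want unexpected values to be fatal.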
Conversely, one of the same users who uses the first style above also wrote
a step similar to the following:
if age < 20 then do;
...
end;
if 20 <= age < 30 then do;
...
end;
if 30 <= age < 40 then do;
...
end;
This is (arguably) the sort of case which can make good use of nested IF
statements:
if age < 20 then do;
...
end; else if age < 30 then do;
...
end; else if age < 40 then do;
...
end;
In this case, the second style is better, as it's easier to read and easier
to maintain: if one of the boundaries between categories changes, it only
has to be changed in one place in the code, instead of two.
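
For what it's worth, the same "each boundary coded once" idea can also be
written with the form of SELECT that has no select-expression. This is only
a sketch: the data set names AGES and BANDED, the variable BAND and the
sample values are invented for the example.

```sas
/* Hypothetical sample data, for illustration only */
data ages;
  input age;
  datalines;
15
25
35
45
;
run;

data banded;
  set ages;
  length band $ 8;
  /* With no select-expression, each WHEN holds a full condition. */
  /* They are tested in order and the first true one wins, so     */
  /* each boundary appears only once, as in the ELSE IF version.  */
  select;
    when (age < 20) band = 'under20';
    when (age < 30) band = '20to29';
    when (age < 40) band = '30to39';
    otherwise       band = '40plus';
  end;
run;
```

One thing to watch in either style: a missing numeric value compares as
less than any number, so observations with a missing age would fall into the
first category unless you test for them separately.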
Just my opinion, anyone else have any comments?
Chris.