|
Paul,
Not quite. I think the important point is that IFN can be used in
many places because it is a simple function call. If I had to choose,
I would prefer universality over an IFN limited to the DATA step, along
with the consistency it brings: the side effect of a function call is
that the arguments are evaluated.
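
A minimal sketch of the "many places" point, assuming a hypothetical
data set WORK.PAIRS with numeric variables A and B (the data set, its
values, and the RATIO_OK column are invented for illustration). Because
IFN is an ordinary function, the same call can appear in PROC SQL and in
a WHERE clause as well as in the DATA step:

```sas
/* Hypothetical test data; names and values are illustrative only */
data work.pairs;
  input a b;
  datalines;
10 2
5 0
. 4
;
run;

/* IFN inside PROC SQL: the result arguments are constants, so the
   fact that every argument is evaluated costs nothing here */
proc sql;
  select a, b,
         ifn(missing(a) or b = 0, 0, 1) as ratio_ok
    from work.pairs;
quit;

/* IFN inside a WHERE clause of a procedure step */
proc print data=work.pairs;
  where ifn(missing(a), 0, b) > 0;
run;
```

In both steps the result arguments are cheap and cannot raise log
notes, which is the situation where evaluating every argument does no
harm.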
Do you really deal with clients who care how you write code? There
were times when I did wish someone had paid attention to
maintainability, but I cannot remember a single situation where I was
told that I had to code something in a particular way.
Ian Whitlock
==============
Date: Wed, 21 Jan 2009 04:44:34 +0000
From: Paul Dorfman <sashole@BELLSOUTH.NET>
Organization: PDC
Subject: Re: IFN function - strange behaviour
Comments: To: "Howard Schreier <hs AT dc-sug DOT org>"
<schreier.junk.mail@GMAIL.COM>
In-Reply-To: <200901210326.n0KKuYbt010850@malibu.cc.uga.edu>
Content-Type: text/plain; charset="utf-8"
Howard,
Thanks for the detailed resurrection. I certainly agree that
> "IFN(and cousins) perhaps should be more tightly bound into the
> expression evaluators, so they could selectively evaluate their
> arguments, but that's not the implementation today."
What they appear to be saying is that they care more about the way
they are doing it than the way it works when we use it. I find this
excuse, hmmm..., not entirely sufficient. In effect, I (and you) also
make a living selling software we write (at a different level and under
a different arrangement, but that does not invalidate the point). If I
coded a wallpaper of mutually exclusive IFs without ELSEs, presented it
to my client, and they pointed out that it was inefficient, I could not
retort that I am sorry, but this is the way I write code. If I did,
they would not pay me and would hire someone else.
The way IFN/IFC are implemented today makes their utility very limited
even compared to if-then-else, for the latter must be used any time
performance is at stake or log junk is to be avoided. SAS definitely
know how to write extremely efficient software. Let us hope
that what is "not the implementation today" will become the
implementation tomorrow.
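
A minimal sketch of the contrast discussed in this thread, assuming two
numeric variables A and B with B equal to zero (the values and the
names C1 and C2 are chosen only for illustration). The comments restate
the behaviour reported in the thread and are not a claim about any
particular SAS release:

```sas
data _null_;
  a = 10;
  b = 0;

  /* Function form: per the thread, a / b is evaluated before IFN
     is called, so a division-by-zero note is expected in the log
     even though c1 is assigned the missing value. */
  c1 = ifn(nmiss(a) or not b, ., a / b);

  /* Statement form: the ELSE branch is skipped when the condition
     is true, so a / b is never executed for these values. */
  if nmiss(a) or not b then c2 = .;
  else c2 = a / b;

  put c1= c2=;
run;
```

Both variables receive the same value; per the thread, the difference
shows up only in the log notes and in the run time.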
Kind regards
------------
Paul Dorfman
Jax, FL
------------
-------------- Original message from "Howard Schreier <hs AT dc-sug DOT org>" <schreier.junk.mail@GMAIL.COM>: --------------
> On Tue, 20 Jan 2009 16:01:06 -0500, Paul Dorfman wrote:
>
> >GuyA,
> >
> >It is not an either-or question. One just needs to select a tool one
> >needs, and there is nothing specific about IFN/C that would violate this
> >rule. You could fall prey all the same if you coded
> >
> >if missing(fields) then incomplete_data2 = 1 ;
> >else incomplete_data2 = . ;
> >
> >for in this case, too, the assigned value would be overwritten by 1 or .
> >every time the next array element is encountered, and the result of *this*
> >IF-THEN-ELSE is no different from dataifn(missing(fields),1,.). You just
> >need to align your programming logic with your goal.
> >
> >My beef with IFN is different. I had fancied that one of the big
> >advantages of IFN would be the ability to recode, for instance,
> >
> >if nmiss (a) or not b then c = . ;
> >else c = a / b ;
> >
> >as
> >
> >c = IFN (nmiss (a) or not b, . , a / b) ;
> >
> >The intent here is of course to avoid "Missing values were generated..."
> >and "Division by zero detected..." notes in the log, which I intensely
> >dislike and never, ever consider a program finished before those have been
> >completely eradicated. The division by zero is especially vicious since it
> >is a performance hog.
> >
> >So, I had thought IFN would be at least as smart as if-then-else is to
> >never even attempt to execute a/b if the first argument is true: it would
> >merely assign c=., as prescribed by the second argument and that would be
> >the end of it.
> >
> >But no. IFN executes a/b regardless of how the first argument is evaluated,
> >and of course generates a "missing" note if A and/or B is null and
> >"division by 0" note if A is not null and B=0. In other words, the
> >underlying code logic is equivalent to
> >
> >temp = a/b ;
> >if nmiss (a) or not b then c = . ;
> >else c = temp ;
> >
> >Which begs the question "why?". Note that it does not follow from the SAS
> >documentation that IFN is anything but equivalent to the corresponding
> >if-then-else, and the documentation even gives an example supporting the
> >identity. Nowhere in the documentation is the (significant) distinction
> >noted above mentioned.
> >
> >I suspect that this is a semi-bug. Since in the absence of the fourth
> >argument, the first argument can be only either true or false, there is
> >never a reason to evaluate argument-3 expression if argument-1 is true or
> >argument-2 expression if argument-1 is false. The reason I have halved the
> >bug is that, after all, IFN returns a correct result, albeit doing more
> >(essentially harmful) work than needed.
> >
> >I cannot determine directly whether IFC also evaluates all 2 (or 3)
> >result-arguments no matter what argument-1 evaluates to, for due to its
> >character nature, it does not (thank goodness!) produce extra junk in the
> >log. However, it is easy to find indirectly that this is the case:
> >
> >190 data _null_ ;
> >191 length a $ 1000 b $ 2000 d $ 3200 ;
> >192 a = repeat ("0123456789", 99) ;
> >193 b = repeat ("ABCDEFGHIJ", 199) ;
> >194 c = "This is a test" ;
> >195 do i = 1 to 1e6 ;
> >196 do cond = 1, 0, . ;
> >197 d = ifc (cond, a || b, b || c, a || c) ;
> >198 end ;
> >199 end ;
> >200 run ;
> >NOTE: DATA statement used (Total process time):
> > real time 19.46 seconds
> > cpu time 12.87 seconds
> >201
> >202 data _null_ ;
> >203 length a $ 1000 b $ 2000 d $ 3200 ;
> >204 a = repeat ("0123456789", 99) ;
> >205 b = repeat ("ABCDEFGHIJ", 199) ;
> >206 c = "This is a test" ;
> >207 do i = 1 to 1e6 ;
> >208 do cond = 0, 1, . ;
> >209 if cond = 1 then d = a || b ;
> >210 else if cond = 0 then d = b || c ;
> >211 else if cond = . then d = a || c ;
> >212 end ;
> >213 end ;
> >214 run ;
> >NOTE: DATA statement used (Total process time):
> > real time 4.25 seconds
> > cpu time 3.14 seconds
> >
> >In light of the 4:1 run-time ratio, methinks evidence is pretty strong...
> >so if one is programming not only for brevity and beauty but also for
> >machine performance (when it matters) and/or needs to avoid the nasty SAS
> >log notes one has *programmed* to avoid, it is better to stay away from
> >IFN/IFC... till the semi-bug is fixed.
> >
> >Kind regards
> >------------
> >Paul Dorfman
> >Jax, FL
> >------------
>
> I believe that expressions are evaluated from the inside out. Thus each
> argument is evaluated before IFN itself gets to work.
>
> This has come up before. See http://tinyurl.com/9hbrae
>
> From a top-notch birdie:
>
> "They are implemented using the general purpose infrastructure used to
> implement all functions (and thus available in DataStep, SQL, where
> ... at the same time). This infrastructure evaluates the arguments
> into a data structure, and then calls the function-implementation,
> expecting it to return the result. The surrounding datastep has no
> earthly idea as to the semantics of the innards of the function.
> That's just the way it (the general purpose infrastructure) works,
> that the function arguments are pushed onto a stack, and then the
> function evaluated - (perhaps you had a HP calculator that used RPN)."
>
> "IFN(and cousins) perhaps should be more tightly bound into the
> expression evaluators, so they could selectively evaluate their
> arguments, but that's not the implementation today. Certainly some
> in the datastep do work like that (SUBSTR on the RHS of an assignment
> - they are processed semantically by the datastep itself, and not as a
> black box who gets fed arguments and is expected to cough up a result)."
>
> Try
>
> c = a / ifn(missing(b),(.),b);
>
> That should get rid of much of the crud in the log.
>
> >
> >On Fri, 16 Jan 2009 01:09:42 -0800, GuyA wrote:
> >
> >>Thanks all, that was useful. I will be more careful when using IFN and
> >>IFC in future, and may just stick to good old IF THEN ELSE syntax!
|