LISTSERV at the University of Georgia
Date:         Thu, 6 Sep 2007 15:42:31 -0400
Reply-To:     Paul Dorfman <sashole@BELLSOUTH.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Paul Dorfman <sashole@BELLSOUTH.NET>
Subject:      Re: Zero observation subset used to populate a Hash ends DATA Step
Comments: To: "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>

Richard,

Using _N_=1 condition in this context is a matter of application; I eagerly use it myself when convenient. However, I am unsure how it may be related to your main issue: with an [unconditional] UNTIL in the second loop, the step will cease and desist either way.

Recapping the underlying DATA step mechanics and UNTIL/WHILE workings is certainly relevant (if only for the benefit of public education), and of course WHILE nails it on the head.

For that to be able to work, ENDF2 must have been set to TRUE before the loop begins, so logically, you explain it as "A SET statement inside a coded DO loop can give option connected temporary variables in the PDV their proper assignation without having program flow actually cross over the SET statement". Fair enough, but methought it was a bit convoluted and could be made a bit clearer. To wit:

1. The WHERE condition is evaluated at compile time. If it returns no rows, the buffer for the data set instance is empty, hence the compiler sets ENDF2 to 1. This action depends only on the presence of the corresponding SET statement in the DATA step (i.e. it does not matter whether an explicit loop or anything else is there). The pudding:

data a ;
  retain x 1 ;
run ;

data _null_ ;
  put eof = ;
  stop ;
  set a (where = (x=2)) end = eof ;
run ;
-------------------------------------------------------------
eof=1

2. Now any construct telling SAS "read FOO in a loop only if ENDF2 is false" will hit the target. WHILE, being specifically designed for the situation, is merely the most logical and concise way of attaining the goal:

data _null_ ;
  put '1: ' eof1= eof2= ;
  do until (eof1) ;
    set a (where = (x=1)) end = eof1 ;
  end ;
  put '2: ' eof1= eof2= ;
  do while (not eof2) ;
    set a (where = (x=2)) end = eof2 ;
  end ;
  put '3: ' eof1= eof2= ;
  if not eof2 then do until (eof2) ;
    set a (where = (x=2)) end = eof2 ;
  end ;
run ;
----------------
1: eof1=0 eof2=1
2: eof1=1 eof2=1
3: eof1=1 eof2=1
1: eof1=1 eof2=1

So, one can keep UNTIL by making the loop conditional. I imagine it could be advantageous in some code-generating situations (as opposed to having the code generator replace UNTIL with WHILE in the middle of the phrase). It also means that one had better make sure that EOF2 is not accidentally set to false by some SAS code preceding the loop, which would defeat the value set by the compiler.

(BTW, this sort of danger is absent from using IF instead of WHERE. The latter's biggest claim to fame being efficiency, primarily against a good index, it is not necessarily better in all situations - particularly when some query components work much faster in the PDV than in the buffer).
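A minimal sketch of the IF route, reusing the one-row data set A from the pudding above (the PUT text is merely illustrative). It assumes A has at least one observation, so the unfiltered SET always has a row to read on its first pass:

```sas
data a ;
  retain x 1 ;
run ;

data _null_ ;
  do until (eof) ;
    set a end = eof ;            /* no WHERE: every row reaches the PDV        */
    if x ne 2 then continue ;    /* subset in the PDV; jump to the UNTIL test  */
    put 'kept: ' x= ;
  end ;
  put 'after loop: ' eof= ;      /* reached even though no row satisfies x=2   */
  stop ;
run ;
```

Since the filtering happens after the read, the end-of-file flag here is set only by actually reading the last row, so a plain UNTIL suffices as long as A itself is not empty.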

Kind regards
------------
Paul Dorfman
Jax, FL
------------

On Thu, 6 Sep 2007 07:17:38 -0400, Richard A. DeVenezia <rdevenezia@WILDBLUE.NET> wrote:

>Paul Dorfman wrote:
>> Richard,
>>
>> Instead of making SET hit on an empty buffer to begin with, you can
>> resort to IF instead:
>...
>>   DCL HASH F2();
>>   F2.DEFINEKEY('J');
>>   F2.DEFINEDATA('K');
>>   F2.DEFINEDONE();
>>   DO Z=0 BY 0 UNTIL (Z);
>>     SET FOO END = Z;
>>     IF I NE 2 THEN CONTINUE ;
>>     PUT I= J= K=;
>>     F2.ADD();
>>   END;
>...
>>   do z=0 by 0 until (z);
>>     set bar ;
>>     ** do stuff;
>>   end ;
>>   stop ;
>> run ;
>
>The population of hashes prior to explicitly looping over a dataset is a
>construct I am fond of as well, however not applicable in my real world
>case. The code base in question deals with numerous hashes while applying
>semantic meaning to quite a few star shaped data abstractions. Thus hash
>population and usage thereof has to be close together in the code to help
>maintain sensibility for future generations -- thus I _do_ need to use the
>implicit loop and if _n_ = 1 constructs.
>
>The root issue is one of UNTIL vs. WHILE.
>UNTIL is incorrect in this coding scenario, but I have been fixated on it
>due to a spate of situations where a DO UNTIL was used for summarizing BY
>group processing -- processing that never dealt with no row possibilities.
>DO WHILE is the proper loop for my issue.
>
><elucidation>
>One must go back to basics...
>
>As shown in help page "How the DATA Step Works: A Basic Introduction",
>{ data reading statement [a SET statement]: is there a record to read? }
>  if NO then close data set
>  if YES then trundle along
>
>Therein lie the subtleties of SET and loops.
>A DO UNTIL goes through a loop once unconditionally.
>A DO WHILE can conditionally never enter a loop.
>A SET statement inside a coded DO loop can give option connected temporary
>variables in the PDV their proper assignation without having program flow
>actually cross over the SET statement.
>
>* set statement reached at least once;
>do until (endUntilAssertion);
>  set XYZ(where=(<xyz-criteria>)) end=endUntilAssertion;
>end;
>
>vs
>
>* set statement might never be reached;
>do while (not endWhileAssertion);
>  set ABC(where=(<abc-criteria>)) end=endWhileAssertion;
>end;
>
>When the criteria yields zero rows, the UNTIL will cross over the SET,
>resulting in an immediate halt to the DATA Step; the WHILE will not enter
>the loop and thus not cross over the SET, resulting in the desired flow.
></elucidation>
>
>>
>> Now if you are running SAS9.2, it is getting much simpler and better.
>> As privately promised by a birdie, the object is now smart enough to
>> absorb not only data set name specification but also the data set
>> options. Hence much simpler approach:
>
>This is a long time coming and welcome :)
>Perhaps the same courtesy will be afforded to .OUTPUT() ?
>
>Coming full circle... The code now looks like
>
>--------
>  if _n_ = 1 then do;
>    declare hash f2();
>    f2.defineKey('j');
>    f2.defineData('k');
>    f2.defineDone();
>    DO WHILE (NOT endf2);
>      set foo(where=(i=2)) end=endf2;
>      put i= j= k=;
>      f2.add();
>    end;
>    put;
>  end;
>--------
>
>Richard A. DeVenezia
>http://www.devenezia.com/
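
For the SAS 9.2 route mentioned in the quoted text, a hedged sketch of what loading the hash straight from a filtered data set might look like. The test data set FOO and the variable N are made up for illustration; the DATASET: argument tag carrying data set options is assumed to be available (9.2 or later):

```sas
/* hypothetical test data: three rows, only one with i=2 */
data foo ;
  do i = 1 to 3 ;
    j = i * 10 ;
    k = i * 100 ;
    output ;
  end ;
run ;

data _null_ ;
  if 0 then set foo ;      /* never executed; only puts J and K into the PDV */
  declare hash f2 (dataset: 'foo (where = (i = 2))') ;
  f2.defineKey ('j') ;
  f2.defineData ('k') ;
  f2.defineDone () ;       /* the filtered rows are loaded here              */
  n = f2.num_items ;       /* number of items the hash now holds             */
  put n= ;
  stop ;
run ;
```

No SET statement is ever crossed at run time here, so a WHERE that yields zero rows should simply leave the hash empty rather than end the step.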

