LISTSERV at the University of Georgia
Date:         Mon, 25 Nov 2002 22:11:29 +0000
Reply-To:     alejandro.jaramillo@ATT.NET
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         alejandro.jaramillo@ATT.NET
Subject:      Re: Set statement bug?

Ladies and Gentlemen, I find the information exchange about the SET statement very interesting. Please don't get personal, keep a good spirit, and let's move on.

This is my one cent.

Alejandro

> On Fri, 22 Nov 2002 13:10:49 -0500, Ian Whitlock <WHITLOI1@WESTAT.COM> wrote:
>
> >Paul,
> >
> >You have a style of making pronouncements as if they were true. I have
> >examined some of them below from my point of view. I ask each reader to
> >draw his/her own conclusions about both points of view and then to
> >continually reconfirm them as new programming experiences dictate. All of
> >my paragraphs begin with ***.
> >
>
> Perhaps a better way to phrase that is "I have a style of sticking my foot
> in my mouth" (which anyone who knows me would shrug and say, "yeah...")
>
> When I started posting on this "multiple SET statement" stuff, I really did
> not realize that most of the people on this board actually know how SAS
> works. I unfortunately spend most of my day showing people how to use PROC
> CONTENTS and PROC SORT (with the occasional PROC FREQ). My first reaction
> is to keep anyone from doing anything that they can't handle, and to play
> defensive on the use of SAS software.
>
> >
> >"Unpredictable" certainly was the incorrect choice for a word, because SAS
> >is always predictable if you know everything. But then if you knew
> >everything, you wouldn't need to write a SAS program to analyze the data!
> >
> >HOWEVER, I would still stray away from using multiple SET statements in the
> >same data step for the following reasons:
> >
> >1) Observations from the end of one or more input data sets will be
> >deleted from the output data set unless all input data sets have the same
> >number of observations.
> >
> >*** False as shown by
> >
> >   data all ; /* eofa and eofb are not on a or b */
> >      do until ( eofa ) ; set a end = eofa ; output ; end ;
> >      do until ( eofb ) ; set b end = eofb ; output ; end ;
> >   run ;
> >
>
> Yeah, if you put other conditions in the data step. I posted my log on my test
> job for this earlier, and the two methods of coding produced different
> results.
>
> Here's a question: why would you want to do that when you can have just
> one SET statement with two data sets and avoid the DO UNTIL loops?
> Practical example please.
>
> >
> >2) Combining several data sets into one data set with multiple SET
> >statements mimics a merge, but each input data set may not be in sorted
> >order and is not required to be in sorted order.
> >
> >*** This is a rather limited point of view. How would you perform the DATA
> >step of the previous paragraph with a MERGE? If your answer is that you
> >wouldn't, then that is a contradiction to the claim that the code mimics a
> >MERGE.
>
> I would not, and my answer is that, in the example above, I would do this:
>
>    /* example 1 */
>    data all ;
>       set a b ;
>    run ;
>
>    /* or this, example 2 */
>    proc append base=all data=a ;
>    run ;
>
>    proc append base=all data=b ;
>    run ;
>
> And that could be sped up even faster if I could skip the first PROC APPEND
> since all it is doing is copying WORK.A to WORK.ALL (but the specs might
> not allow it).
>
> And then I would say that Okay, you're right. In the example you chose,
> the multiple SET statements do not mimic a merge. However, in the
> original example given that started this thread, the attempt does mimic a
> merge. If you want to discuss it further, great--we can have another
> example and more detail in the topic.
>
> >
> >*** The choice of your words suggests that there is something wrong with
> >mimicking a MERGE. However, your "but" clause suggests that there are times
> >when mimicking a MERGE is most appropriate. What is being claimed here?
>
> Change the "but" to an "and" if you like.
>
> >
> >3) Combining several data sets into one data set with multiple SET
> >statements mimics a merge, but each input data set may not have matching
> >keys (even if it is in "sorted" order).
> >
> >*** The choice of your words suggests that there is something wrong with
> >mimicking a MERGE. However, your "but" clause suggests that there are times
> >when mimicking a MERGE is most appropriate. What is being claimed here?
>
> Please see above.
>
> >4) Combining several data sets through one data step and out to multiple
> >data sets runs the risk of multiplying the issues above.
> >
> >*** In view of my thoughts on 1)-3) I am confused about what is being
> >multiplied.
> >
>
> What I have found to happen is that users who make errors in coding earlier
> in the program can compound those errors later in the program, and these
> errors in the final output data get multiplied as the program introduces
> more data. The more times data is manipulated, the more chances there are
> for errors.
>
> >
> >5) From my experience, the intent that most users have in mind when using
> >multiple SET statements in the same data step is better and more
> >efficiently resolved using MERGE statements, PROC SQL, and/or other data
> >manipulation tools.
> >
> >*** Please note this is a statement about your experience. Although I
> >cannot question it, it indicates that you may have met rather limited users.
>
> CLIENT: Paul, How do I print a SAS data set?
> PAUL: PROC PRINT.
> CLIENT: Thanks!
> PAUL: That'll be $10,000 please.
>
> I've "MET" many powerful, wonderful, and highly intelligent SAS users.
> I've even worked on projects where we used cutting edge SAS products,
> including the first US site for Risk Dimensions (I was the first SAS
> Quality Partner in North America with RD experience, and the second NA RD
> license). Unfortunately most of my work is indeed very simple SAS code,
> and I'm usually working with people who have an entire 3 days of SAS
> experience.
>
> >To the extent that it is a claim about SAS, it appears to be a repetition of
> >2) or 3) now allowing mimics of a MERGE with SQL.
>
> Probably, now that I look at it.
>
> >
> >6) Multiple SET statements in one data step could lead to the overwriting
> >of variables with the same name, rather than appending new records
> >(or "creating" new records) as when several data sets are used in one SET
> >statement. No warning will occur in the log if this happens.
> >
> >*** True. However, one assignment statement can overwrite another without a
> >warning in the log. Does this mean one should abandon assignment
> >statements? I think a better point of view is that programs are dangerous
> >when you don't know what they are doing. Consequently you should find out,
> >rather than abandon what might be an important programming technique.
>
> Of course not, it just means that programmers should be aware of it.
>
> >
> >7) If any of the SET statements are inside a condition, then the value of
> >the last record read will be retained for the remaining records until the
> >first time one of the input data sets encounters the end-of-data marker.
> >
> >*** Is this a claim? If so, then is it for or against?
>
> ????
>
> >
> >8) If one of the SET statements declares a data set with zero records,
> >then the resulting data set output will provide the n-1 records from the
> >iteration in which the "zero-record" data was called (it could be
> >conditional).
> >
> >*** I don't understand what is being said. If it means that there is
> >something wrong with conditional SET statements to empty data sets, then I
> >think that is wrong because I sometimes find such statements very useful in
> >determining the logical PDV and the attributes of input variables.
>
> Cool Idea! Thanks!
>
> >
> >9) The SET statement wasn't designed to occur more than once in a data
> >step. Just because it CAN be done does not mean that it SHOULD be done.
> >
> >*** The first statement is a claim about history that I do not have access
> >to, but it is contradictory to some of the SAS Institute published material
> >and consequently requires some form of evidence before one can accept it.
>
> Such as? I'd love to read it! (and I'm not doubting you)
>
> >
> >*** I agree with the second statement, and suggest that the "it" can be
> >replaced by anything that can have a CAN and SHOULD context, since the
> >statement is really about the relationship between CAN and SHOULD. However,
> >the statement does not say anything about what SAS code restrictions one
> >should follow, other than to possibly mean that not all valid SAS code
> >should be written.
> >
> >10) Multiple SET statements in one data step are confusing, outside
> >standards, and overly challenging to support in production code.
> >
> >*** There are three unsupported claims here. I find all of them suspect and
> >dependent on who is confused, who makes the standards, and who supports the
> >production code.
> >
> >Of course, the aforementioned "IF _N_=1 THEN SET" routine is an exception
> >that is well-documented and supported by SAS, and I would exclude it from
> >these 10 points.
> >
> >And, I suppose, someone could come up with a practical use for these issues
> >and call it a "feature" ...
> >
> >However, I recommend avoiding it.
> >
> >*** Recommending avoidance of a class of SAS programs restricts what kinds
> >of programs can be written and consequently the programming ability of
> >anyone following that recommendation, hence I consider it very important to
> >present evidence when making such a claim. I also see it as an important
> >obligation to point out when I disagree with such claims.
> >
>
> The more I read through your note, the more it seemed to me that you felt I
> was either making a personal jab at you or that you wanted to destroy
> anything that I said, as though it were some kind of personal vendetta. I
> didn't mean to start a fight, just to prevent beginners from running before
> they could walk.
>
> I've found some neat ideas here, and I'd forgotten about the _IORC_ feature
> with two SET statements. However, for most of the remaining examples that I've
> seen I cannot think of a practical application where I would use them, and I
> eagerly await other such input.
>
> Cordially,
>
> Paul McDonald
> SPIKEware, Inc.
> ------------------------------
> Free SAS Tutorials and Newsletter
>
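
For readers following the thread, the behaviour disputed under point 1) can be sketched roughly as below. This is a minimal illustration, not code from any of the posters: the data sets A and B, their variables x and y, and the output names PAIRED, STACKED and LOOPED are all assumptions made up for the example. Only the three DATA step patterns themselves come from the discussion.

```sas
/* Hypothetical inputs: A has 3 observations, B has 5 (assumed for illustration) */
data a ; do x = 1 to 3 ; output ; end ; run ;
data b ; do y = 1 to 5 ; output ; end ; run ;

/* Two SET statements: one observation is read from each data set per     */
/* iteration, and the step stops when the shorter data set is exhausted.  */
data paired ;
   set a ;
   set b ;
run ;

/* One SET statement naming both data sets: A is followed by B. */
data stacked ;
   set a b ;
run ;

/* The DO UNTIL pattern quoted in the thread: each loop reads its data    */
/* set to the end and writes every observation explicitly.                */
data looped ;
   do until ( eofa ) ; set a end = eofa ; output ; end ;
   do until ( eofb ) ; set b end = eofb ; output ; end ;
run ;
```

With these assumed inputs, PAIRED should contain 3 observations while STACKED and LOOPED should each contain 8. STACKED and LOOPED are still not identical: in LOOPED the last value of x read from A is carried onto the rows that come from B, whereas in STACKED x is missing on those rows. That difference may be one reason two otherwise similar jobs can report different results.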
