LISTSERV at the University of Georgia
Date:         Fri, 14 Nov 1997 15:37:50 -0600
Reply-To:     REXX Programming discussion list <REXXLIST@UGA.CC.UGA.EDU>
Sender:       REXX Programming discussion list <REXXLIST@UGA.CC.UGA.EDU>
From:         Doug Quale <qualed@MAIL.STATE.WI.US>
Organization: State of Wisconsin
Subject:      Re: Why REXX is not my favorite scripting language (was Re:
              regular expression matching)
Content-Type: text/plain; charset=us-ascii

I don't want to make this seem like a picky tit for tat thing where I try to shoot down anything anyone else has to say. There is one very interesting question about regular expressions in here that I did want to say something about, so I'll write about a few more things too.

pforhan@millcomm.com wrote:
>
> In <346B918A.4709@mail.state.wi.us>, Doug Quale <qualed@mail.state.wi.us> writes:

> >It is true that PARSE is a central point of REXX. Regular expressions
> >are a central point of Perl and awk. RE's are more powerful than PARSE,
> >hence Perl and awk are more powerful than REXX. If I weren't familiar
> >with RE's I would probably think PARSE is wonderful, but by comparison
> >to RE's, PARSE just doesn't carry the freight.
>
> I would dare say that REs and PARSE are almost different applications.
> Maybe PARSE could be called "the working man's RE." In a way. But
> their purposes are somewhat different. PARSE does more in the way
> of, well, parsing; splitting something up based on characteristics
> and positioning and what-not. REs are more for finding things.

In general this is true. RE's can also be used to split things up, however, and this is common in awk and perl.
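
As a rough illustration of splitting with an RE, here is a minimal sketch in Python rather than awk or perl. The sample record and the delimiter pattern are invented for the example.

```python
import re

# Hypothetical input record: fields separated by commas with uneven spacing.
record = "alpha ,beta,  gamma , delta"

# Split on a comma plus any surrounding whitespace. The delimiter itself is
# described by a regular expression rather than by a fixed string.
fields = re.split(r"\s*,\s*", record)

print(fields)
```

The point is only that the separator can be a pattern; awk's FS and perl's split take an RE in the same spirit.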

> There is no question that REs are powerful. But they cannot do what
> PARSE does, without extensions. To my knowledge, there is no such
> syntax as "^%1.*%2$" (where %1, %2 are variables to be loaded...)
> in any RE, is there? Come to think of it, I doubt you can do that
> query exactly in REXX, either. Here, my lack of knowledge of both
> REXX and Perl shows through... so forgive me if I am really wrong.

No, you are really right. The key is "without extensions". I, like almost everyone else, use somewhat sloppy nomenclature by saying "regular expressions". In fact, what people usually think of as regular expressions are enhanced or extended RE's. And there is no single enhanced RE design, but rather many. Without going through all the detail (which I would in any case surely get wrong), I would just note that basic RE's would include only literal characters, alternation (or, symbolized by |) and concatenation (and, usually symbolized by juxtaposition). That's it. From that you can build character classes, but the bracket notation ([A-Z]) is much easier to use. Negated character classes are possible without special support, but would be extremely clumsy and would suffer character length portability problems. The key enhancement is the closure operator (*), meaning 0 or more matches. With that you can get the + operator (1 or more matches) and the curly-brace {n,m} functionality (at least n but no more than m matches). So everything but concatenation, alternation and closure is syntactic sugar.
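
The "syntactic sugar" claim can be spot-checked with a small Python sketch. It compares each sugared pattern against a version written with only literals, alternation, concatenation and closure. The sample strings and the helper name same_language are made up for the illustration, and agreeing on a handful of samples is a sanity check, not a proof of equivalence.

```python
import re

# Each pair: a pattern using extended syntax, and a rewrite that uses only
# literals, alternation (|), concatenation, and closure (*).
EQUIVALENT = [
    (r"a+", r"aa*"),         # one or more = one copy, then zero or more
    (r"[abc]", r"(a|b|c)"),  # character class = alternation of its members
    (r"a{2,3}", r"aa(a|)"),  # bounded repeat = two copies plus an optional third
]

# Hypothetical test strings, chosen to include matches and non-matches.
SAMPLES = ["", "a", "aa", "aaa", "aaaa", "b", "c", "ab", "d"]


def same_language(p, q, samples):
    """Return True if p and q accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s)) for s in samples
    )


for sugared, plain in EQUIVALENT:
    print(sugared, "vs", plain, "->", same_language(sugared, plain, SAMPLES))
```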

One more extension is needed to parse with RE's -- we need a way to refer to pieces that we have matched. Unix tools that use RE's to specify substitutions do it by enclosing the matches to memorize in parens and then referring to them with \1, \2, etc. So, to reverse the first two words on a line you might say

s/([a-z]+) +([a-z]+)/\2 \1/
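
For anyone who wants to try it, roughly the same substitution can be sketched in Python (not one of the Unix tools above). The sample line is invented, and count=1 is used on the assumption that the s/// command is run without a g flag.

```python
import re

line = "hello world again"  # made-up sample input

# Capture two lowercase words separated by blanks and emit them in reverse
# order; \1 and \2 refer back to the parenthesized groups.
swapped = re.sub(r"([a-z]+) +([a-z]+)", r"\2 \1", line, count=1)

print(swapped)
```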

A similar use works for a programming language, so in perl you say

($first, $second) = /(RE for first match) stuff (RE for second match)/;

In fact in perl, you can do a whole lot more with RE's than that.
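
A hedged sketch of the same "load variables from a match" idea, again in Python rather than perl, and loosely modelled on the "^%1.*%2$" query quoted above: pull out the first and last word of a line. The function name first_and_last and the sample text are assumptions made for the example.

```python
import re

# First word, anything in between, last word. The optional middle group lets
# a two-word line match as well.
PATTERN = re.compile(r"^(\S+)\s(?:.*\s)?(\S+)$")


def first_and_last(line):
    """Return (first_word, last_word), or None if the line has fewer than two words."""
    m = PATTERN.match(line)
    if m is None:
        return None
    return m.groups()


print(first_and_last("alpha beta gamma delta"))
print(first_and_last("alpha"))
```

The two parenthesized groups play the role of %1 and %2; everything else in the pattern is matched but thrown away.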

> >Personally I find other aspects of the syntax of REXX a bit annoying,
> >particularly conditionals. I don't like the special cases involving the
> >semicolon
> >
>
> Well, my first recommendation is not to do this, at all.
> Always split up if/then/else statements.

Good advice :).

> >The design of the REXX IF instruction also allows only a single
> >expression in the THEN and ELSE clauses, requiring frequent use of
> >DO...END blocks in conditionals. (This mistake is also in Pascal, but
>
> Well, many languages are like this, including C and C++. I prefer Ada,
> which, IIRC, always forces the use of an END IF. Modula-2 is like this
> also, no?

You are right that most languages get this wrong, including C. (C is not a language I would use as a model of good syntax. Except the expression syntax. C has wonderful expressions, but hideous variable type declarations.) Wirth fixed this problem between Pascal and Modula-2, and I'm sure it's ok in Oberon as well.

>
> >REXX scripts accept only a single argument. That means that every REXX
> >script must parse its own arguments, rather than letting the calling
> >environment handle argument parsing. (This is a design flaw that was
>
> Not that it is hard, using parse and friends. Far easier than C, IMHO.
>

The real problem here is that it can't be done reliably. If I invoke a Unix script like this

example "script should see two args" "but this is tough for REXX"

the REXX example script gets one argument, a concatenated string of the two initial arguments. There is in general no way to solve this problem -- REXX really believes that scripts can only get a single argument. This fits the command model of CMS and MVS just fine, but it loses in an environment that permits multiple arguments. Unix REXX scripts must either use a non-portable, Unix-specific function to obtain command line arguments, or the user must be careful to use extra quotes whenever invoking a script written in REXX (making invocation of REXX scripts different from any other program on Unix).
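
The loss of information can be modelled in a few lines of Python. This is only a sketch: joining with single blanks is an assumption about how the arguments get concatenated, and the variable names are invented.

```python
# The two arguments from the invocation above, as a Unix program would
# receive them in its argument vector.
argv = ["script should see two args", "but this is tough for REXX"]

# What the script sees if everything arrives as one string (modelled here
# by joining with single blanks; an assumption for illustration).
single = " ".join(argv)

# Re-splitting on blanks yields words, not the original two arguments.
words = single.split()
print(len(argv), len(words))

# A different invocation collapses to the identical string, so the original
# boundaries cannot be recovered from the string alone.
other = ["script should see", "two args but this is tough for REXX"]
ambiguous = " ".join(other) == single
print(ambiguous)
```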

> >repeated in PCDOS/MSDOS.) I frown at the "GREAT RUNES" IBM mainframe
> >mentality of uppercase translation by ARG and PULL not to mention
>
> As I am sure you know, you can use parse arg and parse pull to
> eliminate this problem.

True, but the point is that the behavior of the short versions is poorly chosen. Instead, ARG and PULL should not do case translation, and PARSE UPPER ARG and PARSE UPPER PULL should be used if you want to smash case.

>
> >default values for undefined variables. I don't like the interface to
> >external environments. Although sending unrecognized expressions to the
> >external environment may have seemed like a great idea for a scripting
> >language, I find it confusing and error prone. It also necessitates the
> >odd CALL instruction. Therefore
>
> My suggestion would be to use one or the other consistently. If you like
> communication with the OS, though, use the x=oscmd() form.

Unfortunately REXX doesn't have an x=oscmd() form. This may be supported in OS/2, but it's not a part of standard REXX and it won't work in MVS and CMS. In fact, Cowlishaw had envisioned that external commands would write their output to the external data queue and the invoking REXX program would use PULL to get at it. This is a poor, CMS-specific mechanism, and in many cases requires the external commands to be rewritten to work with REXX. The IBM MVS REXX implementation uses OUTTRAP to capture output from *some* external commands. It's clumsy, not portable and does not work in all cases.

>
> >REXX does not provide a good means of communication with commands
> >submitted to external environments. Although it is suggested to use the
> >data queue for this purpose, the data queue itself is a strange idea
> >that limits you to a single process environment. Unix backticks are
> >infinitely superior in this regard.
>
> Why not just separate the commands into separate steps, as I am sure
> unix shells do, anyway?
>
> /* contrived example below: */
> x=ls("s*illy")
> y=cat(x)
> /* someone correct my syntax here, it's been a while...*/

Unfortunately even if this works in OS/2, it isn't a standard part of REXX and it doesn't work in MVS. The Unix counterparts

chomp(@x = `ls s*illy`);
@y = `cat @x`;

work great in perl.
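
For comparison, the capture-the-output idea behind backticks can be sketched in Python with the standard subprocess module. The helper name backticks is made up, and the child command is a stand-in (the Python interpreter itself) so that the sketch does not depend on ls or cat being installed.

```python
import subprocess
import sys


def backticks(argv):
    """Run a command and return its standard output as a list of lines."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Stand-in child process that prints two lines, as ls might print two names.
lines = backticks([sys.executable, "-c", "print('silly'); print('stilly')"])
print(lines)
```

The output of a separate process comes back directly as a value in the calling program, with no shared queue in between.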

