DrDialog, or: How I learned to stop worrying and love REXX - Part 9

From EDM2

By Thomas Klein

Sorry for the lack of last month's issue, but I was busy renovating our bathroom. Itching fingers and extreme fatigue in the evening hours kept me from typing. First, let me mention that there was a serious error in the previous issue concerning the say command:

And - by using a final example - using the command

say "The meaning of life is:"        mol

(with 7 spaces between the literal and the variable name) will result in an output of

The meaning of life is:        42

The number of spaces used in the say command is exactly the same that is used in the output formatting, which is not typical for programming languages in general AFAIK. Usually, the command is parsed into a keyword and parameters part only and the number of spaces between them is ignored.

This is complete rubbish I'm afraid. ;)

As the last paragraph shows, I had my doubts, and they proved to be justified: REXX treats everything after the keyword as an expression, and within an expression any number of blanks between two terms counts as a single blank. So the correct output in the above example should read...

The meaning of life is: 42

I could have stated this to be some sort of test <g>, but being honest I have to admit it was just an error... ;) Nevertheless, the other examples were right.

Now, don't ask me how I came to write such stuff or why I seemed to get such output from my own testing. I simply don't know. Obviously something went wrong that day. The proofreaders either didn't notice the glitch or were too afraid to mention it. We're all just humans after all and we make mistakes. Sorry for this one.
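To make the corrected behaviour concrete, here is a minimal sketch. The variable names collapsed and padded are made up for illustration; the only assumption is standard REXX behaviour: blanks between two terms act as one blank, while blanks inside a quoted string are kept exactly as typed.

```rexx
mol = 42

/* Several blanks between two terms count as one single blank. */
collapsed = "The meaning of life is:"        mol

/* To keep extra spacing, put the blanks inside the literal string */
/* and join the parts with the || operator (no blank is added).    */
padded = "The meaning of life is:       " || mol

say collapsed
say padded
```

So whenever exact spacing matters, the blanks have to be part of a string rather than part of the source code layout.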

Program flow control

Today, we'll start to discuss the basic means of controlling the program sequence. Basically, there are three types of "flow":

  • Sequence (aka "batch")
    This is what we know already: One command after another
  • Condition (aka "branch")
    The next command (or block of commands) to be executed is decided upon a certain condition
  • Iteration (aka "loop")
    A command sequence (consisting of 1 or more commands) that is executed repetitively until/while a certain condition applies (or not)

Now let's have some examples of what this does mean:

A Sequence is what we already know from previous examples. The program flow is - graphically speaking - proceeding from top to bottom, one statement (or command) is executed after another, like in

say "please enter your name (and press the return key)..."
parse pull name
say "Hello" name
letters = length(strip(name))
say "Wow - your name consists of" letters "letters."

Okay, that's a somewhat stupid program - but don't mind it. It's not mandatory to understand what length(strip(name)) does - just notice that this is a typical batch process made of various commands (inputs, outputs, functions). The flow of execution begins with the first statement and ends with the last one. Then, the program is terminated.
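For the curious, here is a small sketch of what length(strip(name)) does, step by step. The sample name with its surrounding blanks and the variable trimmed are assumptions made for the illustration; strip() and length() are standard REXX built-in functions.

```rexx
name = "   Thomas  "          /* input with stray blanks around it    */
trimmed = strip(name)         /* removes leading and trailing blanks  */
letters = length(trimmed)     /* counts the characters that are left  */
say "[" || name || "] becomes [" || trimmed || "] with" letters "letters."
```

Nesting the two calls, as the program above does, simply skips the intermediate variable.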

In terms of program sequence, a conditional expression is pretty much like walking down a hallway that ends in front of two or more doors: Depending on a certain decision, you'll proceed to walk through only one of them. You can't walk through more than one at the same time. To give you an impression of what this might look like:

say "please enter your name (and press the return key)..."
parse pull name
say "Hello" name
letters = length(strip(name))
say "Wow - your name consists of" letters "letters."
if letters < 5 then
   say "...which is quite short in fact."
else
   say "...and that's rather long."

This means that the branches are mutually exclusive.

In brief, the basic syntax of the IF command reads

IF <conditional expression is true> THEN <command>

The important convention is that the command keyword "IF", the conditional expression and the "THEN" keyword appear together on the same line. REXX would also accept THEN at the start of the following line, but keeping the three together is the form used throughout this series and the easiest one to read. The command to be executed can be written on the same line or the next non-empty line. That's to say

IF age < 20 THEN say "you're quite young."

and

IF age < 20 THEN
 say "you're quite young."

and

IF      age < 20 THEN
      SAY "you're quite young"

...are all valid notations with the same result. In addition, the case of the keywords doesn't matter. Within that scope, you're even free to space the single words as much as you want, like:

IF     age  <     20 then
  say     "you're quite young."

(Note: the "then" was typed in lower case)
Now let's extend the syntax discussion by adding the ELSE stuff:
ELSE is used to denote a command to be executed, if the condition does not apply. Thus, ELSE can be thought of as a "separator between the branches". As ELSE is an optional part of the IF command, you don't need to code it if you don't want to do something special with those cases that won't fit into the other branch. If there is no ELSE, there is nothing done about them.

IF <conditional expression is true> THEN <command> ELSE <command2>

But hold it. If the whole thing is written on a single line, the parser might not get the point, like the following example might show:

IF age < 20 then say "you're quite young." ELSE say "you're not THAT young any more."

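If the variable "age" is set to contain a value less than 20, the output would be:

you're quite young. ELSE SAY you're not THAT young any more.

This is because everything that follows SAY on the same line is taken as the expression to be printed. ELSE and say are not recognized as keywords in that position; they are treated as variables without a value, and such a variable simply yields its own name in upper case. To clarify, you either need to add a semicolon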

IF age < 20 then say "you're quite young." ; ELSE say "you're not THAT young any more."

or (the "better" way) you insert a line wrap after the first "command":

IF age < 20 then say "you're quite young."
  ELSE say "you're not THAT young any more."
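The single-line pitfall described above can be reproduced without any IF at all. In this sketch the same text is assigned to a variable (text is a name chosen for the illustration), assuming that neither ELSE nor say has been given a value as a variable before.

```rexx
/* ELSE and say do not start a clause here, so REXX reads them as   */
/* variable names. Neither has a value, so each one yields its own  */
/* name in upper case.                                               */
text = "you're quite young." ELSE say "you're not THAT young any more."
say text
```

Note that the second "say" shows up as SAY in the result, a good hint that it was evaluated as a variable rather than executed as a command.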

Talking about multiple lines - the ELSE keyword and its corresponding command can be written on the same line or separated by one or more empty lines (again, with as many spaces involved as you want):

IF age < 20 then say "you're quite young."
  ELSE
    say "you're not THAT young any more."

To complete the examples in this stage here's the way that you should prefer in order to achieve better readability:

IF age < 20 then
  say "you're quite young."
ELSE
  say "you're not THAT young any more."

This shows the two basic rules that apply:

  • The "IF" keyword, the conditional expression ("age < 20") and the "THEN" keyword are on the same line
  • The "ELSE" keyword is on a different line than the first command

Be careful when trying multiple-line statements from within REXXTRY, as REXXTRY passes each single line directly to the REXX parser for execution as soon as it is entered. This means that the ELSE part is not seen "in conjunction" with the previous stuff and thus will result in a syntax error.

Let's summarize what we've learned about IF so far in matters of syntax:

IF <conditional expression is true> THEN
  <command>
ELSE
  <command2>

Okay, let's move on: There's one more issue with the IF in REXX that you might not be aware of - especially if you have already done programming in another high-level language such as Basic or COBOL. REXX only accepts one single command within each of its branches, unless you explicitly mark a block of commands by enclosing it in DO and END commands. Thus, the extension to the above syntax is that <command> or <command2> can be replaced by a structure of

do
 command-1
 command-2
 ...
 command-n
end

DO and END simply denote multiple commands to appear as a single block of statements to be included... to "make the parser understand".
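A minimal sketch of why this matters, using two made-up counters (plain and grouped) and an age that does not satisfy the condition. Without DO/END, only the first command after THEN belongs to the IF, no matter how the lines are indented.

```rexx
age = 30

/* Without DO/END only the first command belongs to THEN. */
plain = 0
if age < 20 then
  plain = plain + 1
  plain = plain + 10      /* not part of the IF: always executed */

/* With DO/END both commands form one block. */
grouped = 0
if age < 20 then
  do
    grouped = grouped + 1
    grouped = grouped + 10
  end

say "plain =" plain "grouped =" grouped
```

Since the condition is false, plain still ends up as 10 while grouped stays at 0.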

As a DO...END block is treated equally to a single command (from the parser's point of view), you're free to even "mix it to your needs":

IF <conditional expression is true> THEN
  DO
     <command-1>
     <command-2>
  END
ELSE
  <command-3>

or "the full monty":

IF <conditional expression is true> THEN
  DO
     <command-1>
     <command-2>
  END
ELSE
  DO
     <command-3>
     <command-4>
  END

To complete the variety of notations, there is one thing to note when using the "DO" keyword on the same line as the first command of the block (which you're free to do). You need to separate them by a semicolon. Okay, let's keep it short - the following code parts all have the same effect and meaning:

  1. IF <conditional expression is true> THEN
       DO
         <command-1>
         ....
       END

  2. IF <conditional expression is true> THEN DO
         <command-1>
         ....
       END

  3. IF <conditional expression is true> THEN DO; <command-1>
         ....
       END

  4. IF <conditional expression is true> THEN
       DO; <command-1>
         ....
       END

Phew! I know it's getting hard... are you still with me? ;)

The best way of getting through this is to keep in mind the first of the samples above: "IF...THEN, DO and each <command> on a line of its own". It will both look and read better!

Now to completely confuse you let me mention that a block of commands enclosed by DO/END can of course contain additional IF...ELSE... structures :)
Okay, this might become puzzling at first sight, but hey: Basically this works just like an onion: Each IF/THEN/ELSE on its own is just like a single layer. And if you look at them on a per-layer view, they are all quite simple. No magic.
...

Just imagine that hallway which ends in front of two doors. Of course, after walking through one of them, there could be more doors again - this is what we call a "nested IF structure" - but basically, as I said, it's made up of the same parts again and again. Example:

say "please enter your name (and press the return key)..."
parse pull name
say "Hello" name
letters = length(strip(name))
say "Wow - your name consists of" letters "letters."
if letters < 5 then
   if letters < 4 then
      do
         call beep 500,100
         say "...which is really short, man!"
      end
   else
      say "...which is quite short."
else
   say "...and that's rather long."

Well okay - from the "meaning" point of view, that's a rather stupid program, but just to make you see: These are two IF statements "nested". The first one (the "outer one") has two branches (as always), but one branch contains another IF (an "inner" one). And this one uses a DO/END block within its own first branch by the way...

Now let's see what I meant when saying "two or more doors":

The IF statement consists of an expression which is checked to be true. To make sure you get the point: The expression (regardless of how complex it might be) is either true or false. So an IF statement will provide you with "two doors" only. Take a look at the example above. If we wanted to distinguish between "quite long" and "really long" names, we would have to add another IF statement to the second branch of the "outer" IF... and sooner or later the whole thing becomes difficult to read and more and more error-prone as it grows and grows.

Of course this is not a "wrong" way of programming, but maybe not the "best". We might need to check if there aren't other means of doing what we like to do. The case is that we don't want to check if something is true or not, but rather react to different states that might occur. In our (silly) example, we actually want to distinguish between several lengths of possible names. This can be accomplished by either creating a complex structure of nested IF statements (like we said above) or by using REXX's select statement:

say "please enter your name (and press the return key)..."
parse pull name
say "Hello" name
letters = length(strip(name))
say "Wow - your name consists of" letters "letters."
select
 when letters < 4 then
       say "...which is really short, man!"
 when letters < 6 then
       say "...which is quite short."
 when letters < 9 then
       say "...which is quite long."
 otherwise
       say "...which is really long, man!"
end

This looks like - yeah! - what we've been searching for and actually is the solution to our problem, but it needs to be handled with care. What select does is check the conditions from top to bottom and branch into the first one (in that order) that applies.

This means that you need to keep in mind what it does when writing your cases... the following statement looks okay at first sight, but there's a minor logical glitch:

select
when letters > 3 then
     say "...which is quite short, man!"
when letters > 5 then
     say "...which is rather long."
when letters > 7 then
     say "...which is really long."
otherwise
     say "...which is really short, man!"
end

The problem is that if the variable letters contains a value of 6, the first case (> 3) applies too... thus, the first branch will be executed and that's it. The second condition (> 5) is true as well, but program flow won't come to this point. The otherwise keyword by the way defines a branch that is to be executed if none of the previous conditions applies - some sort of "else" in the select structure if you want.
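One possible way to repair the glitch is to test the most restrictive condition first, so that each name length falls into the first branch that really fits. This is only a sketch: the variable verdict and the sample value 6 are chosen for illustration, and the message of the first branch is adjusted to match its new meaning.

```rexx
letters = 6

select
  when letters > 7 then
    verdict = "...which is really long."
  when letters > 5 then
    verdict = "...which is rather long."
  when letters > 3 then
    verdict = "...which is quite short."
  otherwise
    verdict = "...which is really short, man!"
end

say verdict
```

With 6 letters, the first condition fails and the second one applies, so the "> 3" branch is never reached.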

There is another special thing to mention when talking about REXX's select statement. You might know similar constructs from other programming languages, like the EVALUATE statement of COBOL or "Select Case..." from Basic. These are quite similar in fact, but they only refer to different states of the same conditional expression. That's to say, the conditional expression itself is defined only once (at the top of the statement) and then the branches refer to different states. Example from COBOL:

EVALUATE LETTERS
  WHEN 0 THRU 3
    DISPLAY "...which is really short."
  WHEN 4 THRU 5
    DISPLAY "...which is quite short."
 ...
END-EVALUATE.

Nifty COBOL-programmers would have rather created a set of suited 88-levels below the "letters" variable in order to evaluate the truth of them, but that's another story... ;)

The difference with REXX's select statement is that the condition to be tested is defined in each branch. Thus, each branch can be used to test something completely different, whether this makes sense or not. Usually, you might want to distinguish between different states of the same expression (like different values of the same variable) but not test different things that have no "common ground" in matters of logic. There might be cases where it could be useful, but it's difficult to both read and understand the meaning behind it at first sight, like

select
when age > 55 then
      say "As a software developer you should remember punch cards, do you?"
when name = "Peter" then
      say "Hey, that's you again?"
when date('S') = "20040101" then
      say "Happy new year!"
end

This is some classic "argh!" construct that'll raise quite a few questions about what the heck it should be used for. Combinations like this one are usually nonsense, because more than one of the conditions could be true at the same time. But as you remember, only one door in your hallway can be passed through at a time...

So what if "Peter" was entered, his age is 57 and this program is run on 01-Jan-2004?

Each condition applies, but only the first branch will be executed... To create programs with (at least) quite "predictable behaviour" you should try to avoid such stuff. In these cases, a different coding is preferable - like separate, consecutive IF statements. To conclude: Internally, the SELECT statement is handled just like a series of single IF statements. That's why the conditional expression is declared for each branch again. Take care when using it - otherwise you'll easily end up implementing logical errors...
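As a sketch of the "separate, consecutive IF statements" alternative: each condition is tested on its own, so every matching greeting is shown. The fixed values below (including today as a stand-in for date('S')) and the counter greetings are assumptions made to keep the example reproducible.

```rexx
age   = 57
name  = "Peter"
today = "20040101"        /* stand-in for date('S') */
greetings = 0

if age > 55 then
  do
    say "As a software developer you should remember punch cards, do you?"
    greetings = greetings + 1
  end

if name = "Peter" then
  do
    say "Hey, that's you again?"
    greetings = greetings + 1
  end

if today = "20040101" then
  do
    say "Happy new year!"
    greetings = greetings + 1
  end

say greetings "greeting(s) shown."
```

Unlike the SELECT version, all three branches are executed here because the conditions are independent of each other.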

And finally, here's for the syntactical issues of the SELECT command (instead of some cute diagram again):

  1. The SELECT command needs to be terminated with an "END"
  2. The SELECT keyword itself can be on a line of its own or, followed by a semicolon, on the same line as the first complete "WHEN/condition/THEN"-sentence
  3. The WHEN keyword and the conditional expression and the THEN keyword should appear on the same line (just like IF/.../THEN)
  4. The <command> can appear on the same line as WHEN/.../THEN
  5. The <command> can equally be a DO...END block (just like with the IF command)
  6. There is no ELSE within a branch
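  7. The OTHERWISE keyword (together with its corresponding <command> or DO...END block) is optional as far as syntax goes, but if none of the WHEN conditions applies and there is no OTHERWISE, REXX stops with an error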
  8. As always, the number of spaces between the "words" doesn't matter, neither does the case (upper/lower/mixed) of the "words"
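One consequence of these rules is worth trying out: a SELECT without OTHERWISE is syntactically fine, but REXX raises a SYNTAX condition (error 7) at run time when none of the WHEN conditions applies. The sketch below traps that condition instead of letting the program abort; the variable caught and the sample value 12 are chosen for illustration.

```rexx
signal on syntax          /* trap the error instead of aborting */
letters = 12

select
  when letters < 4 then
    say "...which is really short, man!"
  when letters < 9 then
    say "...which is quite short."
end
say "This line is not reached, because no WHEN applied."

syntax:
  caught = rc             /* number of the error that was trapped */
  say "REXX error" caught "-" errortext(caught)
```

Adding an OTHERWISE branch, even an empty one, avoids the error altogether.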

One more thing about that DO/.../END block: If used, the DO keyword can be written alone on a single line or together with the first command. In the latter case, you'll need a semicolon again to separate DO from the first <command>. This also applies to specifying them on the same line as the WHEN/.../THEN stuff. Thus, the following parts of code are all valid notations and work the same way:

a.)

SELECT
 WHEN age > 55 then
   DO
     call beep 100,500
     say "you're older than me!"
   END
 WHEN...

b.)

SELECT
 WHEN age > 55 then DO
     call beep 100,500
     say "you're older than me!"
   END
 WHEN ...

c.)

SELECT; WHEN age > 55 then DO; call beep 100,500
     say "you're older than me!"
   END
 WHEN ...

d.)

SELECT; WHEN age > 55 then
   DO; call beep 100,500
     say "you're older than me!"
   END
 WHEN ...

Again, the first one (a.) is the one you should prefer to use: No hassle with semicolons, better readability, well... it depends on your "taste" in matters of self-explanatory source code. The parser only checks for rules and doesn't care for stuff like indentation - that's up to humans only... ;)

NOP

To avoid ending this month's part with nothing useful, let me present you with a "useful nothing" instead ;) - the NOP command. You might know that instruction from assembler and yes - the REXX counterpart does exactly the same: Nothing. Now isn't that funny? A special command used to "do nothing"... now "what the heck should that command be used for?" you might ask yourself... well, that's easy: Nothing. ;) It is not used to "do" anything but to "fill gaps" in visibility of logical structures.

NOP comes in handy whenever your nested, conditional expressions might end up somewhere with a logical branch that actually isn't needed, because you just want to take care of the "other" branch.

As we stated above, the ELSE branch of an IF statement is not mandatory, so you actually are free to simply not code it... on the other hand - especially when dealing with complex nested conditions, the source code simply might "look clearer" if each IF has its ELSE. Even if that ELSE simply does nothing. To give you an example:

if age > 37 then 
  say "You're older than me!"

By using NOP this would look like this:

if age > 37 then
  say "You're older than me"
else
  NOP

Now, if someone else is required to work with your code (or you are taking a look at it, months after you initially wrote the program) there is no need any more to figure out if the ELSE was omitted by error or simply neglected... it's quite clear: ELSE NOP will show that explicitly NOTHING should happen if the condition does not apply. Of course, NOP can be used with SELECT as well. Like in...

Select
   when a > b then
        say "There is a difference of" a - b
   when a < b then
        say "There is a difference of" b - a
   otherwise
        nop
end

This code is almost self-explanatory; no question arises from what is coded. Some programming languages will force you to think twice about how to construct your conditional expressions in order to achieve your goal, but wherever you'll have a NOP at hand, you simply don't need to care that much. ;)
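With nested IF statements, NOP can even change what the program does, because an ELSE always pairs with the nearest IF that has none yet. The following sketch compares both variants; the variables a, b, with_nop and without_nop are made up for the illustration.

```rexx
a = 1
b = 0

/* ELSE NOP completes the inner IF, so the last ELSE */
/* belongs to the outer IF.                          */
with_nop = "untouched"
if a = 1 then
  if b = 1 then
    with_nop = "a and b are 1"
  else
    nop
else
  with_nop = "a is not 1"

/* Without it, the only ELSE pairs with the inner IF, */
/* whatever the indentation suggests.                 */
without_nop = "untouched"
if a = 1 then
  if b = 1 then
    without_nop = "a and b are 1"
else
  without_nop = "a is not 1"

say "with NOP:   " with_nop
say "without NOP:" without_nop
```

Here the first variant leaves its variable untouched, while the second one wrongly reports that a is not 1.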

Okay, that's enough code work for today! If you did manage to arrive down here without your head spinning - well, Sir, I'm proud of you. ;)

Next month, we'll talk about the third kind of flow type - iterations (or "loops" if you prefer) - and how they are done in REXX.

References:
GuiObjectREXX Yahoo! group: http://groups.yahoo.com/group/GuiObjectREXX/
News group for GUI programming with REXX: news://news.consultron.ca/jakesplace.warp.visualrexx
Download from Hobbes: DrDialog_3-27.zip
IBM Redbook - "OS/2 REXX: Bark to Byte"
Code sample - http://www.os2voice.org/VNL/past_issues/VNL0303H/cntsamp2_en.zip