How to Do It in OS/2 REXX

From EDM2

By Charles Daney

Many questions about OS/2 REXX that occur in customer support calls and online REXX forum discussions actually relate to features and capabilities of the operating system rather than of REXX itself. The following represents some of the questions that arise most often.

Make a string all upper case or lower case

Uppercasing a string is easy. In fact, it can be done in a couple of different ways:

string = 'the owl and the pussycat'
up_string = translate(string)
parse upper var string up_string

The TRANSLATE function is actually capable of performing many interesting substitutions on a string, but the default when it is used with only one argument is to convert all alphabetic characters to upper case.
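As a small illustration of those substitutions, here is a sketch that passes TRANSLATE explicit output and input tables. The sample strings are invented for the example.

```rexx
/* Sketch: TRANSLATE with explicit tables (output table second, input table third) */
line = 'red,green,blue'
say translate(line, ' ', ',')           /* commas become blanks: red green blue */
say translate('abcdef', '12', 'ec')     /* e -> 1, c -> 2: ab2d1f */
```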

Lowercasing a string is a little harder. There is no analogous LOWER modifier on the PARSE instruction. In order to use TRANSLATE you must specify the upper case and lower case characters explicitly:

upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
lower = 'abcdefghijklmnopqrstuvwxyz'
low_string = translate(string, lower, upper)

Note that the characters to be replaced appear as the third argument, and the characters they are replaced with appear as the second argument.
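If lowercasing is needed in several places, the technique can be wrapped in a small internal function. The following is only a sketch; the function name lowercase is an illustrative choice, not something REXX provides.

```rexx
/* Sketch: a reusable lowercasing helper (the name "lowercase" is hypothetical) */
say lowercase('The Owl AND the Pussycat')    /* the owl and the pussycat */
exit

lowercase: procedure
   parse arg text
   upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
   lower = 'abcdefghijklmnopqrstuvwxyz'
   /* second argument: replacement characters; third argument: characters to replace */
   return translate(text, lower, upper)
```

Because the tables are spelled out explicitly, the helper does not depend on the letters being contiguous in the character set.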

A slight simplification can be achieved by using the XRANGE function to create strings of upper and lower case letters:

upper = xrange('A', 'Z')
lower = xrange('a', 'z')
low_string = translate(string, lower, upper)

However, this is not recommended because it is not portable. Even though it works with ASCII data, since the internal codes for upper case and lower case letters are contiguous, it does not work with other encodings such as EBCDIC.

Trap ctrl-break and ctrl-c

Normally the execution of a REXX program can be halted by pressing ctrl-break or ctrl-c, just as for most other non-PM programs. This can be a problem, however, if the program has a need to finish what it is doing or otherwise clean up after itself before it terminates.

It is very easy to protect a REXX program against inadvertent or premature termination. In REXX terminology, the action of pressing ctrl-break is said to raise a condition called HALT. The default behavior of a REXX program when HALT is raised is to terminate, but it is possible to alter this default by providing a handler for the HALT condition and enabling it with a SIGNAL ON or CALL ON instruction.

In the simplest case, you may just want to perform some normal clean-up activities, such as erasing work files:

signal on halt
/* normal program logic */
exit

halt:
/* halt condition handler */
call sysfiledelete tempfile1
call sysfiledelete tempfile2
say 'Bye-bye!'
exit

But in other cases, you may want to prevent accidental (or intentional) termination of the program. In this case you would probably enable the HALT condition handler with CALL ON instead of SIGNAL ON. The reason is that this allows the condition handler to return to the exact point in the program at which the condition was raised. The handler might do nothing else besides this, or it might perhaps give a useful message about the current state of the program:

call on halt
do case_number = 1 to case_total
   /* process one case */
end
exit
halt:
/* halt condition handler */
say 'Program is not interruptible at this point.'
say 'Now processing case' case_number 'of' case_total'.'
return

One problem you should be aware of is that there is a bug in CMD.EXE which prevents REXX from being notified if ctrl-break or ctrl-c is used during the execution of a system command. CMD.EXE will immediately terminate the whole REXX script if it detects a ctrl-break while processing a system command.

In the case of .EXE files run from a REXX program, what happens after pressing ctrl-break depends on how the .EXE file itself handles the signal. It is possible for the .EXE file to handle such interrupts itself, in which case REXX may never know about it and the REXX program will continue to run. Or, the .EXE file may let CMD.EXE handle the signal, in which case both the .EXE and the REXX script will be ended.

Pass an array to a subroutine

Passing an array (or a general compound variable) to a subroutine or function is something one frequently wants to do in order to operate on the array as a whole, e.g. to sort it, or to process all elements of a large collection easily. Unfortunately, it is not as easy to pass an array to a subroutine in REXX as it is in other languages.

What you can do is pass the "name" of the array, that is, the stem of the compound variable, to the subprocedure. The first problem that arises is how to refer to the compound variable within the subprocedure. It is a problem because REXX does not perform substitution for the stem part of a compound variable name. Therefore, the following doesn't work:

call mysub 'data', count
exit
mysub:
parse arg array, n
do i = 1 to n
    if array.i < 0 then
        array.i = 0
end
return

Here REXX will refer to variables named ARRAY.1, ARRAY.2, etc. instead of DATA.1, DATA.2, etc. as intended. What you need to do is use the VALUE function to access the arrays. VALUE can be used both to read and write elements of the array:

call mysub 'data', count
exit
mysub:
parse arg array, n
do i = 1 to n
   if value(array'.i') < 0 then
       call value array'.i', 0
end
return

Another problem arises if you begin the subroutine with the PROCEDURE instruction, which is normally good programming practice. This ensures that the subroutine can't inadvertently modify variables it isn't supposed to. Unfortunately, PROCEDURE also protects the data in the array which the subroutine is trying to access. This compound variable should be exposed - but since the stem name is itself a parameter to the routine, there isn't any direct way to do this.

The only thing that can be done here is a kludge, based on the fact that you can expose a list of variables in a PROCEDURE statement by keeping the list in another variable. If you know that there are only a limited number of array names that will be used by the subroutine, you can put them all in one variable:

names = 'x1. x2. x3. x4. x5. x6. x7. x8. x9. x10.'
call mysub 'x3', count
exit
mysub: procedure expose (names)
/* subroutine body */
return

Here the stem names have all been listed in the NAMES variable (each ending with a period to indicate it is a stem). The NAMES variable and everything listed within it is exposed by PROCEDURE EXPOSE, since it is enclosed in parentheses.
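Putting the pieces together, the following sketch combines the VALUE technique with the exposed name list. The routine name clamp and the sample data are made up for illustration.

```rexx
/* Sketch: set negative elements to 0 in a stem whose name is passed as a string */
names = 'x1. x2. x3.'            /* every stem the subroutine may touch */
x3.1 = 5
x3.2 = -7
x3.3 = -1
call clamp 'x3', 3
say x3.1 x3.2 x3.3               /* 5 0 0 */
exit

clamp: procedure expose (names)  /* exposes NAMES and the stems it lists */
   parse arg array, n
   do i = 1 to n
      /* VALUE resolves the name X3.I, substituting the local i for the tail */
      if value(array'.i') < 0 then
         call value array'.i', 0
   end
   return
```

Only the stems are exposed; the loop counter and the arguments remain local to the subroutine.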

If you don't know in advance which array elements (or tails of a general compound variable) might be used, there isn't much choice but to pass each stem name in a list contained in the value of another variable which is reserved for the purpose:

stemnames = 'data. list. things.'
call mysub stemname, count     /* stemname holds one of the names, e.g. 'data' */
exit
mysub: procedure expose (stemnames)
/* subroutine body */
return

Even this roundabout method breaks down if you need to pass an array to an external procedure (in a separate file). In that case, it just isn't possible to do it, since external procedures implicitly begin with a PROCEDURE statement in which nothing is exposed, and it is not legal to use an explicit PROCEDURE EXPOSE. In this case, you must adopt some other technique, such as passing the data in a file, through the external data queue, in a list contained in a single variable, etc.

As long as you are working with arrays that are represented as compound variables, each of these latter techniques requires you to access each element of the compound variable and make a copy of it, which means really abandoning the natural usage of REXX compound variables. Further, this means you can only work with compound variables that are integrally subscripted (as opposed to arbitrary string "subscripts") - unless you go to even more trouble.

For instance, you might work with a list of things contained in a separate string variable, delimited by blanks, e.g.:

names = 'Allison Becky Colleen'
likes = 'chocolate flowers poetry'

"Arrays" of this sort are, of course, easy to pass to a subprocedure. And there is the additional advantage that there are a number of REXX built-in functions that are specialized to work with such blank-delimited lists of words (WORD, WORDPOS, SUBWORD, etc.). Such lists can be quite long, since there is no particular upper limit to the length of a REXX string, but long lists can be very inefficient (i.e. slow) to work with. Also, of course, the individual items in the list can't contain embedded blanks.
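As a sketch of how such word lists can be passed around and searched, the routine below looks up a name in one list and returns the word at the same position in the other. The function name favorite is hypothetical.

```rexx
/* Sketch: parallel blank-delimited lists passed to a subroutine as plain strings */
names = 'Allison Becky Colleen'
likes = 'chocolate flowers poetry'
say favorite('Becky', names, likes)     /* flowers */
exit

favorite: procedure
   parse arg who, names, likes
   n = wordpos(who, names)              /* 0 if the name is not in the list */
   if n = 0 then return ''
   return word(likes, n)                /* word at the same position */
```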

Return an array from a subroutine

The problem of returning an array from a subroutine is really just the same as the problem of passing one in. Indeed, if you pass the name of a stem to the subroutine and expose it as described in the previous section on passing an array to a subroutine, then whatever changes you make to the compound variable are still in effect after the subroutine returns. You might choose to pass the name of a compound variable which will be used only for output in order to get the effect of returning a collection of values.

Another technique you can use, especially when an external procedure is involved, so that you can't expose variables, is to pass data through the external data queue. You can even adapt this technique to return data for a general compound variable (i.e. one that doesn't have a simple positive integral tail).

Suppose that one variable called TAILS. holds the list of tails of some other compound variable, call it VAR. Suppose also that TAILS.0 contains the number of such tails. The following code might be used at the end of a subroutine to place values on the external data queue:

do i = tails.0 to 1 by -1
   temp = tails.i
   push temp var.temp
end
push tails.0

In this example, we have intentionally placed things on the stack in reverse order ("LIFO") with PUSH so that if the stack already contains data, the existing data will not be disturbed. The last thing placed on the stack is the number of data elements just placed on the stack. Each data item on the stack contains both the tail and the corresponding value of the compound variable.

After the subroutine returns, this data can be retrieved from the stack with something like this:

pull count
do i = 1 to count
   parse pull tail value.tail
   taillist.i = tail
end
taillist.0 = count

Each item on the stack contains both the tail name and the corresponding data. These can be retrieved simultaneously with the one PARSE PULL statement - provided no tails contain embedded blanks. We have also assigned the tail values to a new array (TAILLIST.) in order to be able to keep track of them and (perhaps) iterate through them later.
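The two halves can be combined into one small script. This is only a sketch: the subroutine name getcolors, the stem names and the sample data are all invented for the example.

```rexx
/* Sketch: return a compound variable with string tails through the data queue */
call getcolors
parse pull count                 /* number of items the subroutine queued */
do i = 1 to count
   parse pull tail color.tail    /* tail is assigned first, then used in color.tail */
   taillist.i = tail
end
taillist.0 = count
do i = 1 to taillist.0
   t = taillist.i
   say t 'likes' color.t
end
exit

getcolors: procedure
   tails.1 = 'Allison'; t = tails.1; var.t = 'cyan'
   tails.2 = 'Belinda'; t = tails.2; var.t = 'magenta'
   tails.0 = 2
   do i = tails.0 to 1 by -1     /* push in reverse so items come off in order */
      temp = tails.i
      push temp var.temp
   end
   push tails.0                  /* the count ends up on top of the stack */
   return
```

The script displays "Allison likes cyan" followed by "Belinda likes magenta". Since nothing is exposed, the same pattern should carry over to an external procedure.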

Format numbers neatly for output

REXX makes it easy to display output without need for concern about data types and the complex formatting statements of languages like FORTRAN and C. However, when one wants to display data in a structured form like a table, there isn't much alternative to supplying some formatting information.

If all you want to do is display data in a table, the RIGHT and LEFT built-in functions can be used, according to whether you want data right or left justified within a column. It doesn't matter whether the data is all numeric, all character strings, or some combination. For instance, if you want to produce a table of countries, capital cities, and populations, you might use code like this:

say left('Country', 15)||left('Capital', 15)||left('Population', 10)
do i = 1 to n
    say left(country.i, 15)||left(capital.i, 15)||right(population.i, 10)
end

The output might look like this:

Country        Capital        Population
Austria        Vienna            7500000
Denmark        Copenhagen        5118000
Switzerland    Bern              6289000
United Kingdom London           55883100

In this example we used an explicit concatenation operator. If you use implicit concatenation, don't forget to take into account the single blank that is inserted.

When you are producing formatted output involving numbers, the number of decimal places is frequently of concern. In addition, you may want control over whether or not exponential notation is used.

The TRUNC built-in function is the simplest way to specify how many decimal places should be displayed. The first argument of TRUNC is a number, and the second is the number of decimal places. Using a value of 0 for decimal places (the default) is also the easiest way to obtain the integral part of any number. As the name implies, TRUNC simply truncates a number. It does not perform rounding. If you request more decimal places than are present in the number, the remainder are set to 0.
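A few calls make the behavior concrete. This is a minimal sketch with arbitrary sample values; the last line uses FORMAT only to contrast truncation with rounding.

```rexx
/* Sketch: TRUNC truncates and pads; it never rounds */
say trunc(127.4689, 2)        /* 127.46  (truncated, not rounded) */
say trunc(127.4689)           /* 127     (integral part) */
say trunc(12.3, 3)            /* 12.300  (padded with zeros) */
say format(127.4689, , 2)     /* 127.47  (FORMAT rounds instead) */
```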

If you want even more control over the display format of numbers, you can use the FORMAT built-in function. This allows you to specify the number of digits before, as well as after, the decimal point. For instance, a table of powers might be done like this:

x = -3.162
do i = 1 to 8
   say format(x**i, 5, 4)
end

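Each result occupies five character positions before the decimal point (including any minus sign) and exactly four decimal places, so the decimal points line up in a column.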

If you are working with numbers in exponential form, there are additional optional arguments of FORMAT that allow you to specify the number of digits to be used in the exponential part, and the number of total places required in the number in order to trigger its representation in the exponential format.
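The fourth and fifth arguments of FORMAT control the exponential part. The sketch below uses a sample value chosen for illustration and assumes the default NUMERIC DIGITS setting.

```rexx
/* Sketch: FORMAT(number, before, after, expp, expt) */
say format(12345.73, , , 2, 2)   /* 1.234573E+04  (2 exponent digits; more than 2 integer places triggers it) */
say format(12345.73, , 3, , 0)   /* 1.235E+4      (trigger of 0 always uses exponential form) */
say format(12345.73, , , 3, 6)   /* 12345.73      (fits in 6 integer places, so no exponent) */
```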

Write a line without a carriage return

The REXX SAY instruction always adds a carriage return and a linefeed at the end of any output you specify. (This is equivalent to the use of the LINEOUT built-in function when the first argument is omitted.) The effect is to move the cursor to the beginning of a new line on the screen.

If you want to avoid this new line effect, simply use the CHAROUT built-in function. You might want to do this to display several items using different REXX instructions, but have the output appear on the same line. Or you might want to display a prompt and have the typed input be on the same line. This could be done with:

call charout , 'Enter the number of your selection: '
parse pull select
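The other use mentioned above, building up one output line from several instructions, might look like this sketch. The message text is arbitrary.

```rexx
/* Sketch: several CHAROUT calls write to the same screen line */
call charout , 'Working'
do 3
   call charout , '.'      /* no line end is added */
end
say ' done'                /* SAY finally ends the line: Working... done */
```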

Store data in a program

Most programming languages provide explicit mechanisms for including large amounts of data within a program - either character strings or numbers. This is typically done by special syntax for initializing arrays. REXX does not have anything comparable to this.

The normal way of handling this is simply to use a number of assignment statements:

msg.1 = 'Try again.'
msg.2 = 'Sorry, wrong answer.'
msg.3 = 'Input not understood.'

This isn't necessarily as bad as it looks, since there is little overhead in performing assignments like this. You can put all such statements in a subroutine at the end of the source file to get them out of the way, and just call this subroutine when the program starts.
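A sketch of that arrangement follows. The subroutine name init_messages and the use of msg.0 as a count are illustrative conventions, not requirements.

```rexx
/* Sketch: keep the data assignments in a subroutine at the end of the file */
call init_messages
do i = 1 to msg.0
   say msg.i
end
exit

init_messages:               /* no PROCEDURE, so MSG. stays visible to the caller */
   msg.1 = 'Try again.'
   msg.2 = 'Sorry, wrong answer.'
   msg.3 = 'Input not understood.'
   msg.0 = 3                 /* element count, by convention */
   return
```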

There is somewhat more of a problem when you want to use compound variables with non-numeric subscripts, since you can't use literals to specify a compound variable tail. What you wind up with is something like this:

x = 'Allison'
color.x = 'cyan'
x = 'Belinda'
color.x = 'magenta'
x = 'Cyndie'
color.x = 'yellow'
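One possible way to shorten this, offered only as a sketch and not taken from the original text, is to keep the tail/value pairs in a single string and peel them off in a loop. It assumes that neither the tails nor the values contain blanks.

```rexx
/* Sketch: load string-tailed elements from one list of "tail value" pairs */
pairs = 'Allison cyan Belinda magenta Cyndie yellow'
do while pairs \= ''
   parse var pairs x c pairs     /* first two words, remainder back into pairs */
   color.x = c
end
x = 'Belinda'
say color.x                      /* magenta */
```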

There is one trick you can use to avoid using a series of assignments when you have to incorporate a large amount of textual data - for instance program help text. Simply enclose the text in a comment, and use the SOURCELINE built-in function to process it or copy it into an array. You must begin the comment at some known line of the program, or else you can determine the current line number with the SIGNAL instruction.

signal around /*
  This example shows you how to initialize an array
  with several lines of text read from the program itself.
  You might use this to display help information.
*/
around:
j = 0
do i = sigl + 1
   line = sourceline(i)
   if line = '*/' then
      leave
   j = j + 1
   array.j = line
end

There is one serious potential problem with this technique: SOURCELINE may not work if the program is tokenized and the source is discarded. This can happen if the program is saved in a macro space or you have used Personal REXX to process it and stripped off the source. It would also fail when the program is processed by a true compiler so that the original source code is not available.

Testing for keyboard input

There is a function called SysGetKey in IBM's REXXUTIL function package that allows for reading a single key at a time. However, if no key has been pressed, the function will wait indefinitely. If you need to be able to test whether a key has actually been pressed you can use the standard CHARS function to test whether a key is waiting to be read.

Here is how you might use this to wait up to 10 seconds for a reply before proceeding:

say 'Press any key to continue'
do 10
   if chars() \= 0 then do     /* no argument means standard input */
       key = sysgetkey()
       leave
   end
   call delay 1                /* pause 1 second */
end