Answers to Frequently Asked REXX Questions
In this month's column, I am going to cover some of the topics that result in the most frequently asked questions by new REXX programmers. These subjects include: the PARSE instruction and the use of templates, function use via the CALL instruction versus functions used as expressions, trap processing, and registering REXXUTIL in your startup.cmd file.
The PARSE Instruction
The PARSE instruction is one of the most powerful instructions in the REXX language. Two references that fully explain the PARSE instruction and all of the variations of template patterns that can be used with it are The Rexx Language and IBM's Procedure Language/2 REXX Reference.
The PARSE instruction allows strings to be separated (split) and the resulting segments to be assigned to variables, left to right. How the string is parsed is determined by either a user-supplied template or an implied default template. The form of the template is an area that leads to some confusion among new REXX users.
There are 7 different forms of the PARSE instruction; however, templates determine how the source string is split in every case. The following definitions of the PARSE keyword instruction and template symbols are copied from the REXX Reference Summary Handbook:
Table 1 - PARSE Formats
- PARSE [UPPER] ARG [template]
- Parses the arguments according to template from a function or subroutine call, optionally first translating them to uppercase.
- PARSE [UPPER] LINEIN [template]
- Parses the input from the default character input stream according to template, optionally first translating it to uppercase.
- PARSE [UPPER] PULL [template]
- Parses the next line in the REXX data queue according to template, optionally first translating it to uppercase. If the queue is empty, lines will be read from the standard input stream (normally the keyboard).
- PARSE [UPPER] SOURCE [template]
- Parses the program's source information (3 tokens) according to template, optionally first translating it to uppercase.
Example:
OS/2 COMMAND C:\OS2\REXXTRY.CMD
OS/2 SUBROUTINE D:\OS2\rexxtry.CMD
Note: If issued within a subroutine, the information reflects the parent.
- PARSE [UPPER] VALUE [expression] WITH [template]
- Parses the value of expression according to template, optionally first translating it to uppercase.
- PARSE [UPPER] VAR name [template]
- Parses the value of name according to template, optionally first translating it to uppercase.
- PARSE [UPPER] VERSION [template]
- Parses the information describing the language processor and level followed by its date, according to template, optionally first translating it to uppercase.
Example:
REXXSAA 4.00 08 Jul 1992
REXXSAA 4.00 10 Feb 1994 (V3)
REXXSAA 4.00 24 Aug 1996 (V4)
OBJREXX 6.00 12 Jul 1996 (OBJ)
REXX/Personal 4.00 12 Oct 1994
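As a small illustration of the SOURCE and VERSION forms listed above, the following sketch splits each information string into its blank-delimited tokens. The variable names are chosen for this example only, and the values printed depend on the interpreter and on how the program was started.

```rexx
/* Sketch: inspect how and where this program is running */
parse source  opsys invocation progname     /* 3 tokens, as in Table 1 */
parse version interp level reldate          /* reldate takes the remainder */

say 'System:     ' opsys
say 'Invoked as: ' invocation
say 'Program:    ' progname
say 'Interpreter:' interp 'level' level 'dated' reldate
```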
Table 2 - PARSE templates
A list of symbols separated by blanks or patterns which include:
- variable name
- the name of a variable to be assigned a value
- literal
- used to match within the input string
- (variable name)
- variable whose value is used to match the input string
- . (period)
- a placeholder that receives part of the input string, except that no assignment is actually performed
- integer
- absolute character position in the input string
- =integer
- same as preceding
- +integer
- relative position in the input string
- -integer
- same as preceding
- =(variable name)
- variable whose value specifies an absolute character position
- +(variable name)
- variable whose value specifies a relative character position
- -(variable name)
- same as preceding
In addition, a comma can be used in the template for PARSE ARG to indicate that the next argument becomes the input string for the following portion of the template.
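The comma rule for PARSE ARG can be sketched as follows. The function name describe and its arguments are hypothetical; the point is that the template before the comma works on the first argument and the template after it works on the second.

```rexx
/* Sketch: a comma in a PARSE ARG template moves on to the next argument */
say describe('alpha beta', 10)     /* first=alpha second=beta limit=10 */
exit

describe: procedure
   /* the first argument is split into two words; the second is taken whole */
   parse arg first second, limit
   return 'first='first 'second='second 'limit='limit
```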
In its simplest form, the PARSE instruction is used to separate a string into words. A word is a string of non-blank characters delimited by one or more spaces. When used to separate words, the PARSE instruction processes multiple spaces as if they were a single space. Each successive word, from left to right, is assigned to the next variable name in the template, with special consideration given to the last variable. Every variable name included in the template is assigned a new value. If there are more words than variable names, everything following the word assigned to the next-to-last variable, including the remaining spaces, is assigned to the last variable. Here is an example:
PARSE VALUE 'The quick red fox' WITH a b c
This will result in "The" being assigned to the variable a, "quick" being assigned to the variable b, and "red fox" being assigned to the variable c.
If four variables were specified in the template in this example rather than three, each of the first three words would be assigned to its variable with the surrounding spaces stripped prior to the assignment. The last variable is treated differently: it receives whatever remains of the string, so it keeps any extra leading or trailing spaces present in the source. These spaces can be removed with the STRIP() built-in function in another assignment statement.
If more variables are specified in the template than there are words in the source string, the extra variables are assigned a null string value ('').
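The word-parsing rules above can be tried with a short sketch. The sample string deliberately contains runs of blanks, and all variable names are chosen for this example only.

```rexx
/* Sketch: word parsing with runs of blanks, placeholders, spare variables */
text = 'The  quick   red fox'            /* note the extra blanks */

parse var text w1 w2 rest                /* last variable takes the remainder */
say '['w1'] ['w2'] ['rest']'             /* [The] [quick] [  red fox] */

rest = strip(rest)                       /* remove the leftover blanks */
say '['rest']'                           /* [red fox] */

parse var text first . third .           /* periods discard unwanted words */
say first third                          /* The red */

parse var text v1 v2 v3 v4 v5            /* more variables than words */
say '['v4'] ['v5']'                      /* [fox] [] */
```

The trailing period in the third PARSE absorbs the remainder of the string, which is why third holds exactly one word.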
A literal string may be used to specify the delimiter that the PARSE instruction uses rather than the implied delimiter of a space. Any number of literal strings and variable names may be specified. For example:
PARSE VALUE 'Thurs., Sept. 29, 1994' WITH day_abbrev ',' month_day_year
would result in "Thurs." being assigned to the variable day_abbrev and " Sept. 29, 1994" being assigned to the variable month_day_year (note the leading space). However,
PARSE VALUE 'Thurs., Sept. 29, 1994' WITH day_abbrev ',' month day year
will result in the value "Thurs." being assigned to the variable day_abbrev and the values "Sept.", "29,", and "1994" being assigned to the variables month, day, and year respectively without any extraneous spaces.
Any combination of delimiters may be used in a single parse instruction. If the delimiter is assigned to a variable, the variable name must be enclosed within parentheses. The above example could be written as:
comma = ','
PARSE VALUE 'Thurs., Sept. 29, 1994' WITH day_abbrev (comma) month day year
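A pattern held in a variable is convenient when the same delimiter is applied repeatedly. The following sketch walks a delimited list one item at a time; the list contents and the variable names are made up for the illustration.

```rexx
/* Sketch: walk a delimited list using a variable pattern */
delim = ';'
list  = 'C:\OS2;C:\OS2\SYSTEM;C:\MMOS2'   /* hypothetical sample data */
count = 0

do while list \== ''
   /* item gets the text before the delimiter, list keeps what follows it */
   parse var list item (delim) list
   count = count + 1
   say count':' item
end
```

When the delimiter is no longer found, item receives the whole remaining string and list becomes the null string, which ends the loop.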
The next variation of the parsing template is the numeric positional pattern. This is similar to the literal string shown above except that the numeric positional pattern indicates the position at which the next token from the source string begins. In the following example, all spaces are single spaces:
PARSE VALUE 'Thurs., Sept. 29, 1994' WITH day_abbrev 7 . 9 month 15 day 19 year
results in "Thurs." being assigned to the variable day_abbrev. The period placeholder causes the comma and space to be bypassed, and the values "Sept. ", "29, ", and "1994" are assigned to the variables month, day, and year, respectively. Both month and day would contain a trailing space.
The template may contain a pattern of relative positional values rather than absolute values. Each relative number is counted from the position established by the previous pattern; therefore:
PARSE VALUE 'Thurs., Sept. 29, 1994' WITH day_abbrev +6 . +2 month +6 day +4 year
would yield the same result as the previous example.
As with literal string patterns, positional patterns can be specified as a variable by placing the variable name, enclosed in parentheses, in the template in place of the number. An absolute position is indicated by an equal sign (=) and a relative position by a plus or minus sign. The indicator (+, -, or =) must precede the left parenthesis. Therefore, the above example could also be written as:
abbrev_lgth = 7
month_pos = abbrev_lgth + 2
day_pos = month_pos + 6
day_lgth = 4
PARSE VALUE 'Thurs., Sept. 29, 1994' WITH,
   day_abbrev =(abbrev_lgth),
   . =(month_pos),
   month =(day_pos),
   day +(day_lgth),
   year
Function Usage
Functions play a very important role in REXX since they provide all of the facilities beyond those features provided by the limited keyword instruction set designed into REXX.
The REXX functions fall into two categories: built-in and external. Figure 1 contains a list of the names of all of the built-in functions in OS/2 REXX. All of these built-in functions (except for those marked with *) are defined in the REXX language specification and therefore should be available on all platforms that provide the REXX programming language.
All other functions are external functions and are defined within either another REXX program or a DLL (dynamic link library) module. The REXXUTIL DLL is a part of OS/2 REXX and contains the functions shown in Figure 2.
These functions provide access to, and control of, OS/2-related functions and data. Other, third-party external function modules are available as freeware from various OS/2-related BBSs, as shareware, and from commercial vendors.
All REXX functions may be either referenced with the CALL keyword instruction or used any place in a REXX clause where an expression is allowed.
When a function is explicitly called, each parameter passed to it, except the last, is followed by a comma. For example, CALL DATE 'N' returns the system date in the format dd Mon yyyy (where dd is the current day, Mon is the abbreviation of the current month, and yyyy is the four-digit year). The result is stored in the special REXX variable RESULT, and the program may then assign the contents of RESULT to another variable. A subsequent CALL to any other function that returns a value replaces the value left by the call to DATE().
The above example could be written using the function as an expression. For example, todays_date = DATE('N'). When any function is used as an expression, the function name must be immediately followed by a left parenthesis (no intervening space is allowed). Any number of spaces may surround the fields that occur within the parentheses.
It is clearly more meaningful to use the DATE() function as an expression; however, other functions, particularly those whose return value is not significant, read better when used with the CALL keyword instruction. Remember that the RESULT special variable is assigned a value only when a function is invoked with CALL; a function used in an expression leaves RESULT untouched.
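The two styles of invocation can be compared side by side in a short sketch. The variable names from_call and from_expr are chosen for this example only.

```rexx
/* Sketch: the same built-in function used both ways */
call date 'N'                   /* CALL form: the value arrives in RESULT */
from_call = result              /* copy it before another CALL replaces it */

from_expr = date('N')           /* expression form: the value is used directly */

say 'Via CALL:      ' from_call
say 'Via expression:' from_expr

call length 'REXX'              /* a later CALL overwrites RESULT */
say 'RESULT is now  ' result    /* 4 */
```

Copying RESULT into a variable of your own straight after the CALL is the safe habit, since any later CALL that returns a value will replace it.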
Trap Processing
REXX provides the user with the ability to "trap" unusual occurrences. To trap means to alter the program flow when a specified event occurs and to provide the program with the information necessary to identify the event. There are six events that may be monitored: ERROR, FAILURE, HALT, NOTREADY, NOVALUE, and SYNTAX. Each of these conditions is briefly described in the rexx.inf file on your system and can be found by searching on the word "signal".
While a full explanation of all of these six conditions goes beyond the scope of this column, I do want to describe the use of the NOVALUE condition and how it should be used.
Since REXX is an "untyped data structure" language (meaning that variables need not be defined prior to their use), many programmers feel uncomfortable taking advantage of this convenience. Still others decry this coding style as a negative characteristic of the language.
Much of this criticism can be overcome by trapping the NOVALUE condition with the SIGNAL ON instruction. (NOVALUE, like SYNTAX, can be trapped only with SIGNAL ON; CALL ON is limited to ERROR, FAILURE, HALT, and NOTREADY.) When the trap is enabled, program flow is interrupted and redirected to the label named in the SIGNAL clause whenever an uninitialized variable is used for any purpose other than assigning a value to it.
Figure 3 contains both the source statements and the output of the TRACE instruction which result from the presence of an uninitialized variable along with the reference to the variable being trapped.
Figure 3
/* 9401FIG3.CMD - Show NOVALUE */            /* 0001 */
trace r                                      /* 0002 */
signal on NOVALUE name NOVALUE_TRAP          /* 0003 */
a = 1                                        /* 0004 */
c = a + b                                    /* 0005 */
exit                                         /* 0006 */
                                             /* 0007 */
NOVALUE_TRAP:                                /* 0008 */
say 'NOVALUE trap on source line' SIGL       /* 0009 */
exit                                         /* 0010 */

     3 *-* Signal On NOVALUE Name NOVALUE_TRAP;
     4 *-* a = 1;
       >>>   "1"
     5 *-* c = a + b;
     8 *-* NOVALUE_TRAP:
     9 *-* Say 'NOVALUE trap on source line' sigl;
       >>>   "NOVALUE trap on source line 5"
    10 *-* Exit;
When reference is made to variable b, control transfers to the label NOVALUE_TRAP since b is an uninitialized variable (it has not had a value assigned to it).
In summary, condition traps set with the CALL ON and SIGNAL ON instructions should be used in REXX programs of any length to catch the unintended use of uninitialized variables as well as other exception conditions that can occur within a REXX program.
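A trap routine can report more than the line number. The following sketch, which uses hypothetical variable names, relies on the CONDITION() built-in function with the 'D' (description) option, which for NOVALUE returns the name of the offending variable.

```rexx
/* Sketch: report which variable raised NOVALUE */
signal on novalue name novalue_trap

total = 10
total = total + bonus            /* bonus was never assigned a value */
say 'not reached'
exit

novalue_trap:
   culprit = condition('D')      /* name of the uninitialized variable */
   say 'NOVALUE raised on source line' sigl
   say 'Uninitialized variable:' culprit
```

Because SIGNAL transfers control without returning, the statement after the failing assignment is never executed.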
Registering External Modules in startup.cmd
The RxFuncAdd() function makes a DLL known to the REXX interpreter in any given session. However, the entry points within the DLL are not really registered until one of two conditions occurs: either the individual function entry points are called or referenced in an expression, or an entry point designed into the DLL to register all of its other entry points is referenced.
Once a function name, or entry point, is registered, it is available to all sessions. REXXUTIL, which is part of the REXX distributed with OS/2, is a DLL that provides an entry point that allows all of the individual functions within the DLL to be registered. That entry point name is SysLoadFuncs.
Almost every REXX program you use will want access to the functions within REXXUTIL, and a single CALL to SysLoadFuncs() makes all of the functions in the DLL available to all other OS/2 sessions. If you place the registration of REXXUTIL and the call to its self-registering entry point in the startup.cmd file in the root directory of your OS/2 drive, it will not be necessary to register REXXUTIL in any of your other REXX programs.
If you already have a startup.cmd file, you probably already have the following included in it. If you do not have a startup.cmd file, simply create one with your favorite ASCII editor, using the instructions in Figure 4. The REXXUTIL functions will then be available in all other sessions.
Figure 4
/* Register REXXUTIL in STARTUP.CMD */
if RxFuncQuery( 'SysLoadFuncs' ) = 0 then
   return                              /* function registered */
/* CALL form so that the return code is left in RESULT */
call RxFuncAdd 'SysLoadFuncs', 'REXXUTIL', 'SysLoadFuncs'
if RESULT = 0 then
   do
      call SysLoadFuncs
      if RESULT <> '' then
         do
            Say 'SysLoadFuncs ' ||,
                'returned ' ||,
                RESULT
            exit
         end
   end
else
   do
      Say 'RxFuncAdd returned ' ||,
          RESULT ||,
          ' registering REXXUTIL'
      exit
   end