Object Orientation using Object REXX

From EDM2

By Derek Clarkson

Object Orientation is the domain of the Guru programmer: people who talk in strange languages and even stranger abbreviations, where such things as inheritance and polymorphism are as important as the state of the morning traffic on the south eastern. Yet it is a world as unreachable as the centre of the earth to those of us on the street. At least... until now.

In issue #4 I looked at object orientation and told you what it is and how it works. Unfortunately at the time I had no way of illustrating it to you. But now a Beta of a new language has been released which will allow me to do this without having to give a course in C++ at the same time. A language which many of you already know, the language: Object REXX.

So what I intend to do here is to continue from where that article left off, providing examples of object orientation rather than just talking about it, and using a language with which we are all familiar.

Object REXX

So just how did they arrive at Object REXX? The heart of any object orientated language is the object. Seems obvious, but remember that the basis behind object orientation is that everything is an object. This includes the language itself and, most expressly, the data types that the language offers. Look at it this way: REXX's only data type is the string. Everything is stored as strings, even numbers. REXX just does the translations for you when you perform mathematical operations.
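As a small illustration of "everything is a string", the following sketch uses only classic REXX functions and operators. The variable name count is just an example.

```rexx
/* REXX */
count = '12'            /* two characters: '1' and '2'             */
say count + 1           /* arithmetic translates it for you: 13    */
say count || 1          /* concatenation treats it as text: 121    */
say length(count)       /* and it still has a string length: 2     */
```

The same variable behaves as a number or as text depending purely on the operation applied to it.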

So the first thing the Object REXX development team did was to change the basic string type into a string class (I'll look at classes later). Once this was done it was just a matter of adding the other concepts in object orientation to the language, i.e. encapsulation, polymorphism, classes, inheritance, etc. Of course this was by no means easy, but the end result was that for the first time in programming history we have a basic, easy to use and accessible object orientated language.

Objects

Just to recap here. Structured programming as we know it follows a basic concept. The program you write is composed of a series of steps which perform various actions on a set of data. No matter what program you look at, reduce it to its basic concepts and that is what you will find. Now whilst this is a very simple way to write programs, it does have some inherent problems.

First off there is the issue of code stability. Because code and data are separate in structured programming, it is very easy to change one without looking at the other. For example, changing the way that you store the data means that the code that accesses the data will no longer work. Worse, this code may be scattered all over a system and hidden in many different places. This makes the concept of changing the structure of data very dangerous, especially in larger programs. Even worse is the 'ripple' effect, where a change in one piece of code affects other sections of code in ways unforeseen by the programmers.

Another issue is the scope of data. Good programmers make sure that all variables within a particular routine cannot affect things outside that routine. However, because the data that a routine is playing with may be required by other routines, the chances of bugs caused by data corruption are increased.

Finally there is the issue of code reusability. Within structured systems it is very difficult to create code that can be reused in other parts of the same program or in different programs. Again, good programmers will quickly assemble libraries of common routines; however, there is still a problem when you need a routine to do something slightly different from the way your established code already works.

All of these problems have been addressed in some way by Object Orientation. But just what is it?

Object orientation is a new way to look at code. Basically, object programmers look at programs as a series of objects which interact with each other rather than just a series of instructions which operate on data. In object orientation, data and functions are kept together, making it easier to locate, maintain and change them with minimal impact on the program as a whole. If you think about it, this is the way the real world operates, so it is not that alien. However, object based programs can do things that the real world cannot.

It also allows programmers to reuse objects in other programs and to change part or all of the code without having to recode the entire object. Because code and data are together in one place, each object looks after itself, thus reducing the probability of bugs caused through corruption. Further, objects keep internal workings secluded from the outside world, which means you can change the way an object works without having to rewrite the rest of the program. All of these things and more are why object orientation is the way that more and more programs are being written.

Talking to Objects

Probably the major difference between structured languages such as REXX and object languages such as O-REXX is the way we talk to objects. To illustrate this, let's look at the humble REXX string. Remember that I mentioned above how the only datatype in REXX is a string, no matter what you are storing in the variable, and that to create an object version of REXX it had to be changed into a string object.

Normally to do something with a string you would perform an action on it. I.e. if your string has the value 'hello' then to get the first three letters from it you would have to code something like this:

x = left(var1,3)

This works very well, however it is a classic example of structured programming. Basically we are taking a routine (left), passing a string to it (var1) and a parameter (3) telling the routine what function to perform on the string. The code of the operation and the string are totally separate items. An object orientated version of the same code looks like this:

x = var1~left(3)

The new thing here is the '~' (often referred to as the Tilde) character. This is a new feature of Object REXX which tells the interpreter that we are sending an instruction to an object, often referred to as sending a 'message'. In this case we are sending a message to the string object (var1), telling it that we want it to perform a 'left' function on itself and that we are passing that function a parameter of 3.

Now this may not sound like much or indeed that different, but it does mean several things that are quite important. Firstly it means that we are talking directly to the object (in this case the string) rather than just passing data around. If you think about it, it's quite logical that a string will know more about itself than any function you might send data to. It also means that in order for the functions to be available for you to use, they must be attached to the object. Functions attached to an object like this are referred to as 'Methods'. So when writing a definition of an object you must also code the methods it can perform.
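To make the comparison concrete, here is a minimal sketch that runs both forms side by side. It assumes an Object REXX interpreter; the variable names by_function and by_message are only illustrative.

```rexx
/* REXX - the Object version */
var1 = 'hello'

by_function = left(var1, 3)   /* a routine is handed the string as data   */
by_message  = var1~left(3)    /* the string object is sent a 'left'       */
                              /* message and works on itself              */
say by_function               /* hel */
say by_message                /* hel */

/* other familiar functions are reachable the same way, as methods */
say var1~length var1~reverse  /* 5 olleh */
```

Both forms give the same answer; the difference is only in who is asked to do the work.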

All of the functions that we are used to using in standard REXX have been associated with the string object class as methods and can be accessed in this way. You can even stack them up. For example:

x = copies(left(var1,3),2)

Could also be written like this:

x = var1~left(3)~copies(2)

Which, reading left to right, is actually a little easier to understand than the normal way of handling these two functions. Simply put, it says: send a 'left(3)' then a 'copies(2)' message to the var1 object.
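The chain can also be unpicked one message at a time, which may help when tracing what each step returns. This is only a sketch; step1 and step2 are names made up for the illustration.

```rexx
/* REXX - the Object version */
var1 = 'hello'

nested  = copies(left(var1, 3), 2)   /* read from the inside out */
chained = var1~left(3)~copies(2)     /* read from left to right  */
say nested                           /* helhel */
say chained                          /* helhel */

/* the same chain, one message at a time */
step1 = var1~left(3)                 /* the result is itself a string object */
step2 = step1~copies(2)              /* so it can be sent the next message   */
say step2                            /* helhel */
```

Each message is sent to whatever object the previous message returned, which is why the calls can be stacked.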

Classes

This leads into the issue of classes. I had mentioned these further up but not said what they are. The 'Class' of an object is what defines its behaviour and how it looks to the rest of the program. For example, the basic class used throughout Object REXX is the 'String' class. Every variable that you define belongs to this class, and therefore will act according to the rules that define the class.

Confused? Look at it this way. In nature there is an 'Animal' class and some of the properties (variables) of the animal class are age, height, colour, number of legs, etc. Some of the methods (subroutines) this class has are eat, sleep, walk, run, jump, etc. I.e. the 'Doing' stuff. The important thing here is that all the animals on the planet use this class to define their properties and behaviour. From a programmer's point of view, you have to define the class first before you can run around creating objects. After all, you cannot create animals without first knowing how they will behave and look.

This means that you have to code the definition of a class before you can use it to create objects. Luckily in Object REXX, the string class and many other new ones such as arrays and collections are built-in. But what if you want to change the way a class works? How do you do it?

As an example, let's look at hexadecimal numbers. Normally in REXX you cannot add two hexadecimal numbers together because REXX sees them as strings. Try it! You will have to use X2D to convert the numbers to decimal, do the mathematics on them, then use D2X to convert the answer back to hexadecimal. If you only do this once or twice it's no great hassle, but if you want to do it all over your code, you need another way. You could use a subroutine, but again you would have to remember where to do the subroutine calls. What is really needed is the ability to get the normal maths functions to recognise hexadecimal numbers and do any conversions themselves. Effectively, a new string class with modified maths functions.
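The manual conversion described above looks something like the following sketch. It uses only the classic X2D and D2X functions; the variable names are illustrative.

```rexx
/* REXX */
a = '1A'
b = '2'

/* convert both to decimal, add, then convert the answer back */
sum = d2x(x2d(a) + x2d(b))
say a '+' b '=' sum          /* 1A + 2 = 1C */
```

Every place in a program that adds hexadecimal values needs this same wrapping, which is the repetition the new class is meant to remove.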

The first thing we would need to do is somehow tell O-REXX that we are creating a new class. We do this through this statement:

/* REXX - the Object version */
/* define the hex class */
::CLASS hex

The '::CLASS' statement is a new type of command in REXX known as a directive. I.e. it directs the interpreter to do something. In this case, it is an instruction telling REXX that all following commands are to be used as a definition of the 'hex' class. Note that in this article I am putting the relevant keywords in uppercase so that they are easier to spot, not for any other reason. There are also a number of other directives which tell the interpreter how to set up the classes that you are going to use. Also note that when the interpreter encounters a directive in your code, it takes this as an indication that the executable part of the program is finished and that anything from now on is part of the definitions for the objects you are using.

This is why the executable part of an O-REXX program MUST come first, followed by the definitions of the objects it will use. With some other languages a MAIN() function must be defined which is the main executable part. This allows for these other languages to have object definitions and executable code in any order.
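The layout rule can be seen in a tiny program such as the sketch below. The greeter class and its hello method are invented for this illustration, and the ::METHOD directive it uses is covered later in this article.

```rexx
/* REXX - the Object version */

/* executable part: runs from the top down to the first directive */
helper = .greeter~new
say helper~hello('world')            /* Hello, world */
say 'End of the executable part'

/* definitions: everything from the first directive onwards */
::CLASS greeter

::METHOD hello
    parse arg name                   /* PARSE ARG keeps the case as typed */
    return 'Hello,' name
```

Execution stops when it reaches the first directive, so nothing placed after the class definition would run as part of the main program.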

As I said, '::class' tells O-REXX that we are about to define a new class and that its name will be 'hex'. However this doesn't do anything but define the class's name. It has no methods available to it because we haven't defined any. More importantly, we only want to modify some of the behaviour of the string class, not all of it. So how do we establish this without having to repeat the code for every function known to REXX?

The answer is quite simple: we tell it to 'Inherit' all the properties and methods from the string class. Like this:

/* REXX - the Object version */
/* define the hex class */
::CLASS hex SUBCLASS string PUBLIC

What we have done here is to add some new parameters. Firstly we have added the 'subclass' keyword. This tells O-REXX that this new class is a 'subclass' of another class and therefore inherits all of its properties and methods, thus making such functions as Copies, Strip, Word, etc. available in the new class without having to redefine any of them. Remember that in the string class +,-,*,/, etc. are also methods and therefore we inherit these mathematical functions as well.

The class we are subclassing from comes next. In this case telling the interpreter that we are inheriting the properties and methods of the string class. Finally, we have the 'PUBLIC' keyword. This keyword is not required, however for this class of objects it is a good idea. Basically in O-REXX a particular class definition is only available within the current program. So the hex class we are creating is unavailable for other programs to use. But by adding the Public keyword, we tell O-REXX that we may wish to use this class in other programs and therefore it should be 'Publicly' available.

So now we have the new class available for use. At this point we can actually use this class and any variables that we define using it will act exactly as though they had been defined using the string class (the default if you don't specify). Now how do we change the mathematical methods?
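Before anything is overridden, the inherited behaviour can be checked with a sketch like the one below. It assumes that a subclass of string can be created with the 'new' message, as the complete program later in this article does.

```rexx
/* REXX - the Object version */
hexvar = .hex~new('1a')

/* nothing has been defined in hex yet, so these are all */
/* methods inherited from the string class               */
say hexvar~copies(2)      /* 1a1a */
say hexvar~length         /* 2    */
say hexvar~x2d            /* 26   */

/* define the hex class */
::CLASS hex SUBCLASS string
```

At this stage a hex object is indistinguishable from an ordinary string, which is the starting point for changing just the methods we care about.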

Adding Methods

Changing current methods or even adding new ones is an amazingly simple process. Simply put, all we do is define the name of the method we wish to change or create by using the METHOD directive, and then enter the code that will be executed. Like this:

/* override the '+' method */
::METHOD "+"
	arg to_add
	return_value = (SELF~x2d+to_add~x2d)~d2x
	return return_value 

The '::method' is a directive which lets the interpreter know that we are defining a method. In this case we are overriding the original definition of the '+' method with our own definition. There are several things to note about this code. Firstly you will notice that the plus symbol is in quotes. Normally method names are not quoted, similar to when we defined the class. However for methods whose names are symbols like this, quotes are required so that the interpreter knows that it is the method's name.

Next look at the ARG statement. This works in exactly the same way as ARG normally works. In this case we are collecting the parameters attached to the message being sent to the object. To clarify this, if we were defining a new method called dice_it_up and wanted to be able to pass a parameter like this:

say var1~dice_it_up("hello there")

We would require an arg line to collect the string "hello there" once we had entered the method.
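A sketch of how such a method might collect its parameter follows. The article does not say what dice_it_up actually does, so the body here (replacing blanks with bars) and the class name dicer are purely hypothetical.

```rexx
/* REXX - the Object version */
var1 = .dicer~new
say var1~dice_it_up("hello there")   /* HELLO|THERE */

/* hypothetical class, used only to host the example method */
::CLASS dicer

::METHOD dice_it_up
    arg text                         /* collects "hello there"; ARG also  */
                                     /* uppercases it, as in normal REXX  */
    return text~translate('|', ' ')  /* swap each blank for a bar         */
```

Using PARSE ARG instead of ARG would keep the parameter in its original case, again exactly as in a normal subroutine.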

The next thing to note is the mathematical statement that performs the function. In this case it converts both the current value of the object and the parameter to decimal, adds them and then converts the answer back to hexadecimal. I have highlighted the SELF word in this function because this is another new function in O-REXX.

Basically it is a way to refer to an object from within a method. We can't just use the name of the variable object we are going to create because we are defining the class here and the object does not exist at this stage. Further, it may not be the only object we create from this class. So the self keyword has been made available to reference an object from within a method without having to know the name of the object or have it presently in existence. Another point is the conversion methods being used. Notice that if there are no parameters for a message, you don't need to add brackets.
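The point about SELF can be seen when two objects share one method, as in this sketch. The doubled method is invented for the illustration and is not part of the article's hex class.

```rexx
/* REXX - the Object version */
first  = .hex~new('0f')
second = .hex~new('ff')

say first~doubled      /* SELF is the object first:  1E  */
say second~doubled     /* SELF is the object second: 1FE */

/* define the hex class */
::CLASS hex SUBCLASS string

/* hypothetical method: return twice the hexadecimal value */
::METHOD doubled
    /* x2d and d2x take no parameters here, so no brackets are needed */
    return (self~x2d * 2)~d2x
```

The method is written once, without knowing the names first or second, and SELF stands for whichever object received the message.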

The last thing to note is the return statement. Just like normal REXX this returns from the method and passes back a string which in this case is the returned value after adding the parameter to the variable.

So let's put it all together like this:

/* REXX - the Object version */
/* create an instance of the hex class holding the value 1a */
hexvar = .hex~new('1a')
say "Initial value =" hexvar
/* send the '+' message explicitly, then use the normal operator form */
say "Message based:" hexvar "+ 2 =" hexvar~"+"(2)
say "Normal:" hexvar "+ 2 =" hexvar + 2
/* define the hex class */
::class hex subclass string public
/* override the '+' method */
::method "+"
	arg to_add
	return_value = (self~x2d+to_add~x2d)~d2x
	return return_value

The result of running it would look something like this:

[F:\orexx]orx review\test1.cmd
Initial value = 1a
Message based: 1a + 2 = 1C
Normal: 1a + 2 = 1C

Now let's summarize what we have done. Looking down the code, we see the program itself first, followed by the definition of the new class we are creating. This class definition is a subclass of the string class which means that it inherits all the properties and methods of the string class. Then we have overridden the definition of the "+" method to do hexadecimal arithmetic rather than decimal arithmetic.

The program above the class definition first creates an object called hexvar. This is also known as creating an 'Instance' of a class. It does this by sending a 'new' message to the object's class definition with the value we want to initialise the object with. Notice the full stop in front of the string 'hex'. This indicates to the interpreter that the string is the name of a class and that the following message is intended for that class. If we had no way of doing this, 'hex' would just be interpreted as the name of a variable which it isn't.

After this we send the object a "+" message along with the parameter '2' and print out the results. Just for fun we do it both via message sending and normally, just to show that in O-REXX you can still write stuff like this the same old way and the interpreter will sort it out.

Getting a bit more complex

Now that we have this new class we can do a few more things than we could before. However there are still some problems to be solved. For example, this class always assumes that we are going to pass it a hexadecimal number to add. What if we want to add a decimal based number to the object? Or even a binary formatted one? Or even add a hexadecimal number to a decimal object?

We could solve these problems by creating unique classes for each type of number, but what would be better is to create a more generic type of number object that can react better to having different types of numbers thrown at it. Take a look at this code:

/***************************************/
/* REXX - the Object version           */
/* Using classes to create new objects */
/***************************************/

hexvar = .numbers~new(9x)
say "hexvar =" hexvar
say "Displaying hexvar + 2:" hexvar + 2
say "Displaying hexvar + 4:" hexvar + 4
say "Displaying hexvar + 5:" hexvar + 5
say "adding 2 to hexvar:"
hexvar = hexvar + 2
say "Result:" hexvar
say "Result as decimal" hexvar~x2d

/* define the numbers class */
::class numbers

/* create an INIT method to setup the object */
::method init
	/* expose the internal data areas */
	expose val type

	/* collect the parameters */
	arg val
	/* assign the internal variables */
	type = val~right(1)
	if type~datatype = 'NUM' then type = "D"
	else val = val~substr(1,val~length - 1)

/* override the x2d method to return the value unaltered */
/* if the object is not a hexadecimal */
::method x2d
	/* expose the internal data areas */
	expose val type
	if type = 'X' then return self~class~new(val~x2d)
	else return self~class~new(val)

/* create the method to return the current value */
::method string
	expose val type
	if type = 'D' then return val
	else return val||type

/* override the '+' method */
::method "+"

	/* expose the object's value */
	expose val type

	/* convert the object to decimal */
	select
		when type = "X" then return_val = val~x2d
		otherwise return_val = val
	end  /* select */

	/* collect the parameters */
	arg add_value

	/* test for a datatype qualifier and sort it out */
	type_char = add_value~right(1)
	if type_char~datatype = 'NUM' then type_char = "D"
	else add_value = add_value~substr(1,add_value~length - 1)
	select
		when type_char = "X" then ,
			add_value = add_value~x2d
		otherwise nop
	end  /* select */

	/* perform the addition. Remember that we */
	/* are not storing the new value          */
	return_val = return_val + add_value

	/* reconvert if required */
	select
		when type = 'X' then return_val = return_val~d2x
		otherwise nop
	end  /* select */

	/* return a new numbers object, keeping the type suffix */
	if type = 'X' then ,
		return self~class~new(return_val||type)
	else return self~class~new(return_val)

When run, it produces the following display:

[F:\orexx]orx review\test1.cmd
hexvar = 9X
Displaying hexvar + 2: BX
Displaying hexvar + 4: DX
Displaying hexvar + 5: EX
adding 2 to hexvar:
Result: BX
Result as decimal 11

Ignoring the program itself at the top, let's look down through it, seeing what is going on. Firstly and most importantly it defines a new class called numbers. Note that when the class is defined, I have not subclassed it from the string class. This is because I decided I didn't want it to inherit any of the methods that strings normally have, so I'm defining this one from the ground up, so to speak.

The theory behind this class is quite simple. As I said earlier in this article, REXX variables are typeless, which makes it rather difficult to deal with different types of numeric data. So what I am creating here is a form of a numeric object where the 'type' of the numeric is also stored. I am doing this by having two variables inside the object. One for the current value of the number and one for what type it is. I.e. decimal, hexadecimal, binary, character, etc.

A number of methods are defined for this class. The first one is a new one called INIT. Every object in O-REXX has an INIT method associated with it which is called whenever a new instance of an object is created. As the name suggests, its job is to initialise the object, providing the data it needs. By default, O-REXX calls INIT whenever it encounters the NEW message. For example, in the first line of the actual program code you can see '.numbers~new(9x)' which executes the NEW method for the numbers class, creating an instance of the class and sending the INIT message to this instance. Hence, if you want to change the way something is initialised, this is the method to override.

The first line of the INIT method's code is also interesting. It's a normal EXPOSE line, exposing variables for use. However we are dealing with objects and classes here, so it works a little differently even though the syntax is the same. Expose, when used like this, opens up the object's data area for use in storing information. This is often referred to as the 'State' data because it is where you store the current 'state' of the object. Any other variables that you create and use within a method are forgotten when that method ends, exactly the same way as if you coded a subroutine with 'procedure expose'.

After this the method becomes quite simple. It collects the value the object is to have and stores it directly into the state variable. It then analyses it to see if any suffix has been added to identify what type it is. If so, it chops the extra character off and stores it in the type state variable, otherwise it assumes the number is decimal based. Obviously in future this could get more involved to validate the data being stored.
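The interplay of NEW, INIT and EXPOSE can be tried out in isolation with a sketch such as this one, which borrows the Animal analogy from earlier in the article. The animal class, its methods and the variable names are all invented for the illustration.

```rexx
/* REXX - the Object version */
rex = .animal~new('Rex', 4)      /* NEW creates the object, then sends INIT */
say rex~describe                 /* Rex has 4 legs and is 0 years old */
rex~birthday
rex~birthday
say rex~describe                 /* Rex has 4 legs and is 2 years old */

::CLASS animal

/* INIT receives the parameters that were given to NEW */
::METHOD init
    expose name legs age         /* state data kept inside the object */
    parse arg name, legs
    age = 0

::METHOD birthday
    expose age
    previous = age               /* not exposed: forgotten when the method ends */
    age = previous + 1           /* exposed: still there on the next call       */

::METHOD describe
    expose name legs age
    return name 'has' legs 'legs and is' age 'years old'
```

Only the variables named on EXPOSE survive between messages, which is why age keeps counting up while previous starts afresh each time.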

The next method is also a simple one. We are basically overriding the X2D function to do one of two things. Either perform the function as normal and return the converted value, or to ignore it and return an unaltered value. This is also where we do something quite different which is worth noting.

At the end of this method, instead of doing a normal 'return val' we do a 'return self~class~new(val)'. Why? It makes sense if you think about it, although it was not obvious to me until after I had several rounds of tracing through the code to see what was going on.

When you do a simple 'return val' O-REXX returns the string you want it to, as you would expect. However, note that I said string, because what is being returned is an object of the string class rather than an object of the numbers class. Thus if we then wish to do something with this new object, the methods we have overridden will not be present. Look at the last lines of the '+' method further down. If I change these lines back to a simple return statement and then run the code, I will get an error on the last 'say' statement where it does hexvar~x2d.

The reason is quite simple: two lines above this, hexvar is made equal to hexvar + 2. What this actually does is to send a '+ 2' message to hexvar. Having just a normal 'return return_val' at the end of hexvar's '+' method will return a string object to the calling routine. Therefore, back up in the program, hexvar is now assigned a string class object rather than the numbers class object it was. Thus when we get to the last line, we are trying to execute an x2d on a normal string which in this case holds the value '1ax'. Hence the error.

Changing these return lines to say 'return self~class~new(val)' tells the interpreter to return a value which, if you follow the line through, says: take yourself (self = the object hexvar), get the class which it belongs to (~class), and create a new instance of it (~new) with the value of 'val'. Thus the main program still gets back the value it requires, except that now it is a numbers object rather than a string object, and the last x2d will work, using our overridden method.
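The same idea can be shown on the smaller hex class from earlier in the article. This is a sketch only: the '+' method here is a variation that returns a new hex object rather than a plain string, so that additions can be chained.

```rexx
/* REXX - the Object version */
hexvar = .hex~new('1a')

/* the second '+' is sent to the result of the first, so that */
/* result must itself be a hex object for this to work        */
total = hexvar + 2 + 2
say total                        /* 1E */

/* define the hex class */
::CLASS hex SUBCLASS string

/* override the '+' method, returning an object of our own class */
::METHOD "+"
    arg to_add
    return self~class~new((self~x2d + to_add~x2d)~d2x)
```

With a plain 'return' of the calculated string, the first addition would hand back an ordinary string holding 1C, and the second addition would then be ordinary decimal arithmetic on something that is not a decimal number.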

The next method is a small but important one. Called 'String', it is the default method called to display the contents of an object's state variables. In this case I have told it to return the contents of val, suffixed with the type if it is not a decimal.
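A stripped-down sketch of the same technique is shown below. The temperature class is hypothetical, and the sketch assumes, as the numbers program above does, that SAY obtains the text to display by way of the object's 'String' method.

```rexx
/* REXX - the Object version */
temp = .temperature~new(21)
say temp                  /* displayed by way of our String method */
say 'Reading:' temp       /* the same applies inside an expression */

/* hypothetical class with a single state variable */
::CLASS temperature

::METHOD init
    expose degrees
    arg degrees

/* decide how the object presents itself as text */
::METHOD string
    expose degrees
    return degrees 'C'
```

Without the String method the object would still display something, but it would be a default description of the object rather than the value stored inside it.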

The next method (+) is where all the action is. Again like the previous three it first exposes the state variables 'val' and 'type' so that it can gain access to them. Next it checks their contents and converts the value to decimal if required. Then it reads in the argument and converts it to decimal as well. It then performs the addition and finally creates a new instance of the class using the results as I did for the x2d method.

And that's it. This might seem like a long-winded way of doing things. However, in the long run it is very beneficial as it does result in smaller and less complex code, especially as classes like this numbers class, once fully mapped with all mathematical methods present, can be used in any program without having to be copied or redefined.

To be Continued ...

Huh, What's this? Yes I know we don't normally do this in OS/2 Zone, but unfortunately there just was not enough space in this issue to do the whole of this article. So I will be continuing it next issue. There's plenty more to cover, especially as I have only touched the surface of what this new language can do for you.

IBM's Object REXX

Just a note here: the Object REXX I am using for this article can be obtained on DEVCON #6. DEVCON is short for the Developers Connection and is a quarterly product composed of CDs and a magazine for developers. The CDs contain product updates, fixes, betas, white papers, red books, third party shareware demos and anything else DAP can find that relates to developing under OS/2. It costs about $200 a year and is well worth the money. Contact DAP at IBM for further information.

With regard to Object REXX itself, I have found this beta release to be very good. The documentation is excellent and gives you a good picture of just what it can do. Object REXX gives you practically full access to the workplace shell as well as access to SOM and DSOM objects. Further, it enhances REXX with a range of classes, from ordinary string classes through to classes for arrays and containers which are capable of boolean operations on collections of objects. And much, much more.

The beta seemed reasonably stable, although I had some exceptions and crashes. It is also recommended not to install it as your primary REXX because it has performance problems which the labs are working to solve. Other than that, I really like what I have seen. It's a superb combination of the simplicity of REXX with Object Orientation, yet it packs a range of functionality and power programming way beyond what I was expecting. Well done IBM.

Object REXX has no scheduled release date at the present time, but much pressure is being put on IBM by programmers (like myself!) to put it out as soon as the final few bugs are ironed out.