Grinding Java - Distributed Java

From EDM2

Written by Shai Almog

Introduction

When I first started writing for EDM I wrote to promote the usage of Java. I am a big Java advocate (duh), and back then I felt that many people did not learn Java simply because there were very few "hands on" real-world Java examples out there. Today the situation is quite different: JavaSoft's documentation is superb, and even their tutorial started dealing with the new stuff (such as Swing) before I got to it. So where do I fit in? JavaWorld, JavaUniverse and JavaSoft supply pretty much all that is needed in examples and documentation. PC Week, Wired and others supply way too many opinion columns which seek sensation rather than journalism. I have decided to change the theme of these articles: I will stop posting large amounts of code and write about the development itself. I am currently working on several Java-based projects which I believe are quite interesting, and (starting next month) I will write about their development from the angle of the lessons learned. I will also keep studying upcoming Java APIs and post overviews of their performance, along with resources for those APIs. You will already notice how little code I produced this month; this is because I have been spending most of my time working on other Java projects. Shai.

A quick introduction to distributed computing

You may be wondering what took me so long in writing an article on distributed computing. It is my second favourite subject. I have been waiting for Sun to support CORBA in RMI, and now that it has, it is time to start supporting RMI against the common foe, DCOM.

What are distributed objects?

You probably know of RPC (remote procedure call), which allows a process to call a procedure which may physically execute on another computer. Distributed objects invoke methods on objects which may exist on other computers.

In what way are distributed objects better than RPC?

In the same way in which OO programming languages are better than procedural languages. The mileage may vary with the type of distributed object technology (DCOM is quite similar to RPC).

I found that the best way to explain distributed computing is by explaining the technologies which implement the concepts and their differences. Currently 3 major technologies are in the distributed computing battle ground:

CORBA "Common Object Request Broker Architecture" - (My favourite) CORBA supports the creation of objects on the basis of interfaces written in a language called IDL (Interface Definition Language). A CORBA object can be written in any language (object oriented or not) and it will expose an object oriented interface. We can write an object in Java that will make use of legacy code written in C++, Smalltalk, C or even Cobol. A CORBA object supports multiple inheritance, so our Java code can inherit code from every supported language. CORBA has 16 defined services which allow CORBA programmers to add functionality to their objects using inheritance and standard interfaces.

RMI "Remote Method Invocation" - RMI is very similar to CORBA in some aspects. In fact RMI can be considered as a CORBA ORB. RMI allows a Java VM to invoke a method on another Java VM and it supplies a registry mechanism and a simple to use interface. RMI also supports passing objects by value using the serialization mechanism.

ActiveX - The Microsoft "standard" relies on an architecture quite similar to C++ vtables. ActiveX redefines the meaning of the word object to not include instances.

In order to understand fully the issues at hand I will compare each of the important aspects of the technologies.

History

CORBA: In May 1989 a group called OMG (Object Management Group) was formed by 8 companies (Sun, 3Com, American Airlines, Canon, Data General, HP, Philips and Unisys) whose goal was to create a standard upon which object technology could blossom. Today OMG has more than 700 members (which is quite a lot considering that each member is usually a billion dollar corporation); it has completed the first two phases of its standard and is well into the third. OMG's technologies are standard and open to everyone: OMG membership, which is required for a vote on the standard, is quite expensive, but the resulting standards are free. There are more than 12 ORB (Object Request Broker) implementations covering almost every platform, and several Java ORBs (I know of 5). All CORBA 2.0 compliant ORBs can interact with one another seamlessly.

RMI: In JDK 1.1 JavaSoft introduced RMI, which is quite odd since they also introduced JOE, which is a CORBA ORB and thus supplies very similar functionality to RMI. RMI was widely criticised by the CORBA community since it did not comply with the CORBA standard. Recently Sun announced that RMI will be implemented on top of IIOP (Internet Inter-ORB Protocol), which is the CORBA standard for ORB interaction. That announcement means that RMI will become CORBA compliant and RMI programs will work with CORBA programs as if the objects were one and the same.

ActiveX: Began its life as DCOM and Network OLE. OLE1 was useless when Windows 3 first came out; one of the biggest reasons was its reliance on DDE as the underlying mechanism. Microsoft decided to construct a new mechanism, which it called LRPC (Lightweight RPC), for COM to sit upon. OLE2 was then constructed upon COM. Although COM existed as a product for years, DCOM had a delayed release since MS at the time did not consider the market ready. DCOM's interfaces are described in ODL (Object Definition Language), a language very similar to CORBA's IDL since both are derived from DCE.

Concepts

CORBA: When programming in CORBA you must map your legacy code to IDL interfaces, which will look very familiar to the C++ programmers out there. An IDL interface is very similar to a class declaration in C++ or an interface in Java. CORBA treats all objects as references: if an object is passed to another computer through a function call, what arrives is not the actual object but a reference to the remote object. To many first-time CORBA programmers this sounds like a fault, but it is not. Since CORBA supports all platforms, binary code from one platform should not be transmitted to another, and transmitting binary objects over a network can form a breeding ground for security problems. The interface contains information about the methods that can be invoked, the classes that are inherited and the services that are supported. CORBA then defines 16 services which allow the programmer to offer truly "smart objects" which will act independently. Those services are:

Life Cycle
Everything that has to do with the object's life: moving, deleting, copying, etc.
Persistence
A CORBA object never dies. If the object you requested is not running it will be invisibly loaded from persistent storage. Object persistence is invisible and supports both ODBMS and RDBMS.
Naming
Locate an object by its name.
Event
The event service is quite similar to Java's event paradigm.
Transaction
Enables commit/rollback style transactions on objects. The Java transaction API is in fact based on this service.
Concurrency
Obtain locks on an object for a transaction.
Relationship
Create connections (associations) between objects.
Externalization
Very much like the streaming API.
Query
Perform queries on objects, using ODMG's (Object Database Management Group) OQL and SQL2 spec.
Licensing
Will enable an object supplier to charge for the usage of their object.
Properties
Associate properties with an instance.
Time
Synchronise time between distributed objects.
Security
Authentication, access control, etc.
Trader
Allows you to search for an object by criteria and not name. So you can look for an object which performs a job you need.
Collection
Represents a collection. This allows you easy standard access to data structures.
Startup
Startup requests on ORB initiation.

CORBA is now in the third and by far the largest part of its development, the CorbaFacilities. CorbaFacilities are collections of interfaces which map the common logic of a market or reality. Using predefined, globally common interfaces, with adaptations to particular markets developed by SIGs (Special Interest Groups), objects will truly be able to interact. CorbaFacilities are being developed for many fields such as design and the oil and gas industries. There are also horizontal facilities targeted at general-purpose needs such as document-centred programming. IBM, Netscape and Oracle have introduced a plan to integrate JavaBeans and CORBA; this plan may prove unnecessary now that Sun has accepted the RMI on top of IIOP proposal, but only time will tell.

RMI: CORBA has a steep learning curve. CORBA's complexity stems from 2 major qualities:

  • IDL while very powerful is yet another syntax to learn, and the mapping between IDL and your programming language of choice may be quite confusing.
  • CORBA has many strong capabilities (Multiple inheritance, CorbaServices, etc.) which may turn out to be a burden of knowledge for simpler distributed systems.

RMI is very much like a regular ORB, the major differences are:

  • In JDK 1.1 RMI used a proprietary protocol to perform the method invocation. RMI will use IIOP in JDK 1.2.
  • RMI allows passing objects by value. It does not have a security problem with this feature since RMI uses the Java SecurityManager to protect the machine.
  • RMI is integrated very well into the Java API and is quite easy to use, although there are several CORBA based products which make it just as easy.
  • RMI (arguably) does not scale as well as CORBA from small applications to large corporate intranet/internet solutions. RMI is targeted at smaller projects. RMI's scalability is a topic of heated debate, since RMI is now being used in quite large projects (such as IBM's San Francisco project).
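
To make the "passing objects by value" point above concrete, here is a minimal sketch of what serialization does to an object. It does not use RMI itself; it only serializes an object to bytes and reads it back, which is roughly what I understand happens to a Serializable argument on its way to a remote method. The Account class and the helper name are hypothetical, made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class PassByValueSketch {
    // Writes the object to a byte array and reads it back, producing an
    // independent copy (a stand-in for shipping it to another VM).
    static Account copyBySerialization(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Account local = new Account("demo", 100);
        Account received = copyBySerialization(local);
        received.balance = 0; // the "remote" side modifies its own copy
        System.out.println(local.balance);    // the caller's object is untouched
        System.out.println(received.balance);
    }
}

// Hypothetical value type; any Serializable class could play this role.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    String owner;
    int balance;

    Account(String owner, int balance) {
        this.owner = owner;
        this.balance = balance;
    }
}
```

The receiver works on a copy, so changes made on one side are not seen on the other. That is the opposite of CORBA's reference semantics described earlier.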

RMI is quite a good solution for most Java developers. It's simple to use, it's standard Java and it will connect quite well to CORBA. If you are writing small, simple distributed applications in pure Java don't think twice, RMI is for you.
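
To give a feel for how little ceremony RMI needs, here is a minimal sketch under some assumptions: server and client run in the same VM, the registry listens on the conventional port 1099, and the loopback address is forced so the stub does not depend on host name lookup. The Greeter interface, its implementation and the "greeter" name are hypothetical. On modern JDKs the stub is generated at run time, so no separate stub compiler step is shown.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class RmiSketch {
    // A remote interface: extends Remote, every method throws RemoteException.
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    // The server-side object that actually does the work.
    public static class GreeterImpl implements Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Starts a registry, exports and binds a Greeter, then looks it up the
    // way a client would and invokes it through the stub.
    static String greetThroughRegistry(int port, String name) throws Exception {
        // Assumption: stay on loopback for this single-machine sketch.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Registry registry = LocateRegistry.createRegistry(port);
        try {
            GreeterImpl impl = new GreeterImpl();
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
            try {
                registry.rebind("greeter", stub);

                // Client side: find the registry, look up the name, call.
                Registry clientView = LocateRegistry.getRegistry("127.0.0.1", port);
                Greeter remote = (Greeter) clientView.lookup("greeter");
                return remote.greet(name);
            } finally {
                UnicastRemoteObject.unexportObject(impl, true);
            }
        } finally {
            // Unexport so the VM can exit once the demo is done.
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(greetThroughRegistry(1099, "EDM/2"));
    }
}
```

In a real deployment the server and the client would be separate programs on separate machines, and only the Greeter interface would be shared between them.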

Note: To get my bias out of the way, I believe ActiveX is the root of all evil, mostly because the information I write here about ActiveX is true. I am biased, but ActiveX is still a horrible product.

ActiveX: DCOM was based on COM, which was based on an idea derived from C++: the vtable. In C++, every class which contains a virtual function (a function that may be overridden by a subclass) has a vtable with an entry for every virtual function in the class. The vtable contains pointers to the functions in the correct subclass, so when a method is invoked the vtable sends the call to the correct method. ActiveX does not support inheritance; it only supports interfaces. The most important interface in ActiveX is IUnknown, which has methods to discover the other interfaces supported by the COM object. EVERY ActiveX component must implement IUnknown. This is usually invisible to the programmer, who simply uses Visual C++ to generate the code automatically, but the object will be bloated anyway. COM objects will always be very large since they are always derived from a large class library.
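
For Java programmers, the interface discovery idea can be sketched by analogy. This is not the COM API: the real IUnknown identifies interfaces by ID and also handles reference counting, neither of which is shown. The Unknown, Checker, Printer and SpellChecker names are hypothetical.

```java
class InterfaceDiscoverySketch {
    public static void main(String[] args) {
        Unknown component = new SpellChecker();

        // Ask the component whether it supports Checker, then call through it.
        // The call is dispatched to SpellChecker.check at run time.
        Checker checker = component.queryInterface(Checker.class);
        System.out.println(checker != null && checker.check("Java")); // prints "true"

        // This component does not implement Printer, so discovery fails.
        System.out.println(component.queryInterface(Printer.class) == null); // prints "true"
    }
}

// Hypothetical stand-in for IUnknown: the one interface every component has.
interface Unknown {
    // Returns this object viewed as the requested interface, or null if the
    // component does not support it.
    default <T> T queryInterface(Class<T> type) {
        return type.isInstance(this) ? type.cast(this) : null;
    }
}

interface Checker extends Unknown {
    boolean check(String word);
}

interface Printer extends Unknown {
    void print(String text);
}

class SpellChecker implements Checker {
    public boolean check(String word) {
        return !word.isEmpty();
    }
}
```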

COM does not support instances at the object level. A COM object has a unique ID for the vtable implementation, yet it has no ID for the object itself. Unique objects in COM are usually achieved by mapping references to the object directly to the C++ object instance. COM's speed potential is quite good since it has a simple (thus fast) vtable, yet many CORBA implementations are reported to be faster. COM has no services but it does declare many interfaces; OLE 2 is in fact a set of COM interfaces for application interaction.

ActiveX supports passing by value which means:

  • When running an ActiveX component an executable is downloaded to your computer! This is not a security risk; a security risk implies the existence of security!
  • Microsoft promised to port ActiveX to other platforms. This will be impossible without limiting this feature or forcing programmers to rewrite the component for every supported platform.
  • ActiveX components are quite huge, and people complain about the small Java class files.

Supporters

CORBA: all of the founding members, and

IBM - SOM is IBM's implementation of CORBA. IBM is writing a new CORBA implementation titled Component Broker, which will support all 16 services and will sit on top of SOM 4. IBM is an active member of OMG and has contributed to services such as persistence and transaction.

Oracle - Oracle is writing an ORB which they intend to use to migrate their current database users to object technology. CORBA's persistence services allow them to use a relational database while using true OO programming.

Netscape - Navigator 4 supports IIOP and is in fact bundled with Visigenic ORB for Java. Netscape is very committed to CORBA and their new visual JavaScript tool supports CORBA. Netscape server software is bundled with Visigenic ORB too.

CORBA success stories may be found at http://www.corba.org/csstory.htm and OMG is at http://www.omg.org/

RMI: RMI is growing rapidly and will probably reach a larger audience now that CORBA is no longer an enemy. RMI's biggest supporter is JavaSoft, and I haven't heard of many large organizations embracing it. The biggest commitment I have seen for RMI is in IBM's San Francisco frameworks, which will use RMI.

ActiveX: ActiveX is supported by Microsoft, who created the "open group" whose purpose is to turn ActiveX into an open standard (good luck); they are at http://www.activex.org. The ActiveX supporters fall into 2 groups: those with a heavy Windows stake, and those who go both ways (ActiveX and CORBA).

The first group: Adobe Systems, Computer Associates International, Microsoft Corporation, Visio.

The other group: DEC and Hewlett Packard have CORBA ORBs too, and Powersoft has CORBA development tools.

Compatibility

CORBA: Can be used in any programming language which has appropriate bindings. CORBA was designed to be language independent; e.g. CORBA supports multiple inheritance, which Java does not, yet you can still use multiple inheritance in CORBA-Java objects. CORBA even works with C, which is not even object oriented. CORBA has standard bindings that allow it to interact with ActiveX components, so CORBA objects may be manipulated from visual development environments which support only ActiveX (such as Delphi and VB).
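
To show how multiple inheritance can surface in Java at all, here is a minimal sketch. A Java class cannot extend several classes, but a Java interface can extend several interfaces, and as far as I understand that is how inherited IDL interfaces appear in the Java bindings. The IDL in the comment and all the names are hypothetical, and real generated bindings contain considerably more than this.

```java
class MultipleInheritanceSketch {
    public static void main(String[] args) {
        Document doc = new Report();
        // One object, usable through both inherited interfaces.
        System.out.println(doc.render() + " / " + doc.save()); // prints "rendered / saved"
    }
}

// Hypothetical IDL this sketch imagines, shown only for orientation:
//   interface Printable { string render(); };
//   interface Storable  { string save(); };
//   interface Document : Printable, Storable { };
interface Printable {
    String render();
}

interface Storable {
    String save();
}

// Multiple interface inheritance is legal Java.
interface Document extends Printable, Storable {
}

class Report implements Document {
    public String render() {
        return "rendered";
    }

    public String save() {
        return "saved";
    }
}
```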

RMI: RMI can only run on top of Java (which is also the source of its strongest advantage, simplicity). If RMI's IIOP support is implemented well, RMI will be able to connect to CORBA ORBs and through them to ActiveX components.

ActiveX: Was designed in C++ and has strong bonds to that language. ActiveX components can potentially be written in other languages (notably Java and VB) but they don't blend well into the language. As a side note: an ActiveX component written in Java will still not be portable or small. ActiveX may interact with a CORBA ORB if that ORB supports the interaction; Microsoft refuses to support IIOP.

Summary

CORBA is a very strong solution even if your project uses only Java. CORBA should not be ignored as a solution, yet RMI may be a simple, straightforward shortcut and could possibly scale to much larger projects.

I feel ActiveX is the devil incarnate; hopefully you will feel the same once you study the facts about that technology.

I am attaching an example of a very simple RMI program. I wanted to present a CORBA example too, but I don't have time to get into the Java CORBA bindings.