Encapsulation, and the EPIC Nature of Dogs

By Roger Sessions

If we know that Lassie is a dog, then we know something about Lassie. If we know what dogs in general do and don't do, then we know what Lassie will and won't do. If we know that dogs bark when we ring the doorbell, then we know that Lassie will bark when we ring the doorbell. If we know dogs can't fly, then we know Lassie can't fly.

But knowing that Lassie is a dog doesn't tell us everything about Lassie. It tells us what she does and under what circumstances, but not how. Unless we are experts on the implementation of dogs, we have no idea how Lassie actually goes about the business of barking. We don't know how the millions of neurons in her brain fire in the correct sequence, what muscles are involved, or how she makes her vocal cords vibrate. All we know is that given an appropriate stimulus, Lassie goes "Woof Woof."

In fact, we can't even prove that Lassie is really doing the barking. For all we know, Lassie is using a tape recorder. Or Lassie might be nothing more than an elaborate puppet, with her bark coming from a good ventriloquist. Or maybe Lassie calls in a barking specialist whenever she hears the doorbell, and Lassie acts as nothing more than a noise broker.

It is said that the truth shall set us free. In this case at least, it is our ignorance that sets us free. Or more precisely, sets Lassie free. The less we know about Lassie, the more freedom Lassie has in her implementation. As long as she can figure out some way to make a whole bunch of noise when the doorbell rings, we will be happy, feed her, and not ask silly questions.

In object-oriented programming, we call this general principle encapsulation. Encapsulation says that we ask objects only what they do, not how they do it. An object, say Lassie, is encapsulated if its interactions with its client are determined only by its interface and not by its implementation.

Comparison of three languages

Let's look at the concept of interface from the perspective of three different programming languages, all supported on OS/2. These languages are C++, SOM's IDL (Interface Definition Language), and good old reliable C.

We describe objects by their interfaces. Loosely speaking, we can think of an interface as describing the behaviors or the methods that a class supports.

Our dog class will support two methods: a setBark method, which is used to tell the dog what its bark is, and a bark method, which is used to tell the dog that the time has come to bark. Since Lassie is a dog, we know she will support these methods.

In C++, using for example IBM's VisualAge C++ on OS/2, we could describe our dog interface as:

 class dog {
   public:
     void setBark(char *newBark);
     void bark();
   private:
     char *myBark;
 };

C++ breaks a class definition into two sections, a public and a private section. The public one is the class's interface. The private section is intended to be of interest only to the programmer implementing the interface and can change without warning.

IDL (Interface Definition Language) is defined by the Object Management Group (OMG). It's unique in that the choice of the language used in the implementation of the class is not constrained by the language used to define the interface. In contrast to IDL, if we define our interface in C++, we can use only C++ for implementing our class. If we define our interface in IDL, we can use C++, C, COBOL, or any other language that supports IDL.

IBM considers IDL strategic and is supporting it, or planning on supporting it, with all of its languages on all of its platforms. The IBM implementation of IDL is called SOM, for the System Object Model. SOM has been available on OS/2 for a number of years now.

In SOM IDL we can describe our dog interface like this:

#include <somobj.idl>
interface dog : SOMObject {
  void setBark(in char *newBark);
  void bark();
  implementation {
    callstyle = "oidl";
  };
};

This dog interface contains a great deal of information, more than one might expect by looking at this deceptively simple definition. However, we are only interested in encapsulation, and from this perspective the IDL dog interface defines the same two methods as the C++ definition, setBark and bark.

This IDL definition, similar to its C++ counterpart, tells us that the setBark method takes a single string parameter and that the bark method takes no parameters and returns nothing. Like C++, it doesn't tell us how the dog stores that string. Extending the idea of encapsulation beyond C++, the IDL definition doesn't tell us in what language the code implementing setBark and bark is written. Also, IDL does not support the concept of public and private sections. If it's not part of the public interface, it shouldn't be part of the IDL. At least, it shouldn't be part of the IDL you allow clients to see.

A client that understands the contract implied by the dog definition can write code against that contract. One example of C++ client code using this dog looks like the following:

#include "dog.hpp"
int main()
{
  dog *Lassie;
  Lassie = new dog;
  Lassie->setBark("Woof Woof");
  Lassie->bark();
}

On OS/2, this C++ source code is compatible with a dog defined in IDL and implemented in either C++ or C. It is also compatible with a dog both defined and implemented in C++.

A C client can also use this dog if it is defined in IDL. This client looks like:

#include <dog.h>

int main()
{
  dog Lassie;
  Lassie = dogNew();
  _setBark(Lassie, "Woof Woof");
  _bark(Lassie);
  return(0);
}

This C code is compatible with a dog defined in IDL and implemented in either C or C++. It is not compatible with a dog defined and implemented in C++, because C++ does not support the use of its classes by other languages.

We can even define and implement our encapsulated dog completely in C. An equivalent C definition for a dog is:

struct dogType {
  char *myBark;
};

typedef struct dogType *dog;

void _setBark(dog thisDog, char *newBark);
void _bark(dog thisDog);
dog dogNew(void);

and a C implementation of this dog is:

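#include "dog.h"
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

/* Keep a private copy of the bark, releasing any earlier one. */
void _setBark(dog thisDog, char *newBark)
{
  free(thisDog->myBark);
  thisDog->myBark = (char *)malloc(strlen(newBark) + 1);
  if (thisDog->myBark != NULL)
    strcpy(thisDog->myBark, newBark);
}

/* Print the stored bark, if one has been set. */
void _bark(dog thisDog)
{
  if (thisDog->myBark != NULL)
    printf("%s\n", thisDog->myBark);
}

/* Create a dog that has no bark yet. */
dog dogNew(void)
{
  /* Allocate the structure itself, not just a pointer to it. */
  dog newDog = (dog)malloc(sizeof(struct dogType));
  if (newDog != NULL)
    newDog->myBark = NULL;
  return newDog;
}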

This C implementation is compatible with the C client that we looked at earlier, even though that client was written to use an IDL implementation.

We say that these objects are all encapsulated, because the client code using these objects has no dependencies on the objects' implementations. All of the client interactions with these objects are defined by the objects' interfaces.

Writing objects that are nonencapsulated

Just as we saw that we can write encapsulated objects in nonobject-oriented languages (such as C), we can also write poorly encapsulated objects in state-of-the-art object-oriented languages.

For example, we know our dog has to store its bark string. A nonencapsulated implementation could store the bark string in global memory allocated by the client. In order to use this dog, the client must allocate a buffer, declare a global variable with a particular name, and set that variable to the previously allocated buffer.

Because this implementation of dog requires the client to know quite a bit about how the dog manages its string storage (information that is not part of the interface), we say this dog implementation is not encapsulated. And we can write this sorry code in SOM, C++, or C.

EPIC objects

Encapsulated objects have what I call EPIC characteristics. EPIC stands for Exchangeable, Protectable, Isolatable, and Confidential. These characteristics are so important that even products like Microsoft's OLE, which rejects the other key ideas of object-oriented programming, accept the importance of encapsulation.

Let's consider each of these in turn.

Exchangeable

The E in EPIC stands for Exchangeable. Different implementations of well-encapsulated objects can be exchanged for each other without impacting their clients.

Let's consider two possible SOM implementations of our IDL dogs, both in C. The first stores the bark string in an internal character buffer as shown in Listing 1.

(LISTING 1)

The second implementation stores the bark string in a file as shown in Listing 2.

(LISTING 2)

Which is the right implementation? Both. Either works fine for our immediate needs. Because the dog is a well-encapsulated object, these two implementations can be exchanged for each other without requiring source code changes to our client. In fact, by using SOM on OS/2, these changes can even be made at run time by a simple DLL replacement.

Our nonencapsulated dog, the one using the client-allocated global buffer, is not exchangeable with these two encapsulated versions. Without the client changing its source to allocate that buffer, that dog will not bark.

Exchangeability gives the object implementor considerable flexibility. For example, it's a common practice to prototype interfaces with a simple implementation and then add more robust, better performing, or less limited code later in the development cycle.

Exchangeability also gives flexibility to the client, who can write code with one implementation of dog and then find, write, or purchase a better implementation at any time.