OpenDoc and Human-Computer Interaction

From EDM2

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation

By Ralph M. Pipitone


This article describes how OpenDoc and advancements in human-computer interaction (HCI) can work together to enhance the usability of computer systems. It also presents an example of using a system that is integrated with OpenDoc and these HCI features.

OpenDoc compound documents will be the vehicles traversing the information highway, carrying us into the next century. This article provides a view of what those vehicles may be carrying.

OpenDoc offers tremendous new opportunities to apply HCI technologies. OpenDoc components can use the many new HCI modes of information input and output. They can also participate in the exciting new world of agents.

OpenDoc mechanisms allow components to coexist, communicate, and interact in an environment of unanticipated diversity. OpenDoc allows the collection of information, knowledge, and intelligence in a consistent, platform-independent, industry-standard fashion. For the first time, computers will be able to share wisdom, not just data. Here, wisdom means the computer's ability to learn and apply knowledge.

OpenDoc brings us one large step closer to achieving the dream of a computer easy enough for anyone to use. Speech and handwriting recognition let you interact with computers in more natural ways. OpenDoc lets you achieve your objectives more naturally, too. The world of components created by OpenDoc allows you to mix and match off-the-shelf products to produce exactly the results you need. And that's not all: Dissemination of the information produced will be more direct, easier, and more widespread.


OpenDoc is a package of technologies, available from Component Integration Laboratories (CI Labs), that defines an architecture for compound documents.

CI Labs is an industry consortium whose key members are Apple Computer, IBM, and Novell. Each has contributed significant aspects to the combined OpenDoc package. Apple contributed OpenDoc, the component collaboration architecture; Bento, the file-format architecture for persistent object storage; and Open Scripting Architecture (OSA), the scripting architecture that supports application-independent scripting on multiple platforms. IBM contributed System Object Model (SOM) and distributed SOM (DSOM), the object model that complies with the Object Management Group's (OMG's) Common Object Request Broker Architecture (CORBA) and provides language- and location-independent object access. And Novell contributed Open Linking and Embedding of Objects (OLEO), providing seamless interoperability between OpenDoc and Object Linking and Embedding (OLE) components.


A simple document contains one kind of content. The content might be text, a graphic, or a movie. A compound document, then, is a document containing more than one kind of content. For example, a document that contains text and graphics is a compound document. However, OpenDoc does not limit the notion of a document to something that can be put on paper. Therefore, a compound document can include any combination of content--text, audio, video, and images.


The different pieces of a compound document are called parts. There are text parts, image parts, chart parts, and so on. Each part contains the data for its type. Each type of part has a part handler that allows you to view and edit the part's contents.

Since each part is produced separately, there must be rules for how the parts can interact or collaborate. The OpenDoc component collaboration architecture from Apple defines these rules. The rules determine how the visible parts share the screen, how parts know what you are doing, and how parts contain other parts.

Data that represents the contents of a part can be depicted in two ways: 1) a copy of the data can be included in the part (replicated), or 2) a reference to the data can be included (a link). The differences are similar to those in any shared data environment. If a part has its own copy of the data, then changes to the data affect only that part. If a part has a link to some data, then changes to the data are seen by everyone having a link to the data.


Replicating the data into a part is the most natural way to depict the contents of a part. It is the appropriate choice for the original copy of the information. It is also used when all you require is the current value of the data. However, if the data is likely to change and you need to be aware of the changes, replication leads to the classic data-consistency problem, in which various reports of the same data contain different information.


Linking is a convenient way to provide consistent data between sharers of data. Another advantage of linking is location transparency. Thanks to DSOM, links to data are not restricted in scope; that is, a link to data can be within the context of the same document, different documents on the same system, or different documents on different systems.
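
The difference between replicated and linked data can be sketched in a few lines of Python. This is only an illustration of the sharing semantics described above, not OpenDoc's actual interface: the class names SharedData, ReplicatedPart, and LinkedPart are hypothetical, and a real link would be resolved through DSOM rather than through an in-process reference.

```python
import copy


class SharedData:
    """Hypothetical piece of source data that parts may copy or link to."""

    def __init__(self, value):
        self.value = value


class ReplicatedPart:
    """Hypothetical part that holds its own private copy of the data."""

    def __init__(self, source):
        # Copy at creation time; later changes to the source are not seen.
        self._value = copy.deepcopy(source.value)

    def content(self):
        return self._value


class LinkedPart:
    """Hypothetical part that holds only a reference to the data."""

    def __init__(self, source):
        self._source = source

    def content(self):
        # Always read through the link, so every linked part agrees.
        return self._source.value


sales = SharedData([100, 120])
snapshot = ReplicatedPart(sales)
live_a = LinkedPart(sales)
live_b = LinkedPart(sales)

sales.value.append(150)  # the source data changes after the parts exist

print(snapshot.content())  # [100, 120]
print(live_a.content())    # [100, 120, 150]
print(live_b.content())    # [100, 120, 150]
```

The replicated part keeps reporting the old figures, which is exactly the data-consistency problem mentioned above, while both linked parts see the change.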


Another key feature of OpenDoc parts is that they can be scripted. Scripting allows customization beyond what is provided by the developer of the part. OpenDoc scripting via OSA allows a finer granularity of scripting than has been possible with other scripting architectures. Administrators, installers, or end users can fine-tune a particular part in a particular document to behave exactly as desired.

Each OpenDoc part has a set of semantic events that it understands. When a part is added to a document, it tells OpenDoc which events it understands. When one of these events occurs, OpenDoc notifies the appropriate part.

OSA defines a platform-independent way of expressing that an event has occurred. Scripts may be attached to a part and set to run when certain events occur (for instance, selecting a button).
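
The following Python sketch shows one way to picture this arrangement: a part declares the events it understands, the document records them, and attached scripts run when an event is raised. The names Part, Document, attach_script, and raise_event are assumptions made for illustration; they are not OSA or OpenDoc calls.

```python
from collections import defaultdict


class Part:
    """Hypothetical part that declares the semantic events it understands."""

    def __init__(self, name, understood_events):
        self.name = name
        self.understood_events = set(understood_events)
        self._scripts = defaultdict(list)  # event name -> attached scripts

    def attach_script(self, event, script):
        # A script can only be attached to an event the part understands.
        if event not in self.understood_events:
            raise ValueError(f"{self.name} does not understand {event!r}")
        self._scripts[event].append(script)

    def handle(self, event):
        # Run every script attached to this event, in attachment order.
        return [script(self) for script in self._scripts[event]]


class Document:
    """Hypothetical stand-in for the bookkeeping OpenDoc does per document."""

    def __init__(self):
        self._registry = defaultdict(list)  # event name -> interested parts

    def add_part(self, part):
        # When a part is added, record which events it understands.
        for event in part.understood_events:
            self._registry[event].append(part)

    def raise_event(self, event):
        # Notify only the parts that registered for this event.
        results = []
        for part in self._registry.get(event, []):
            results.extend(part.handle(event))
        return results


button = Part("button", ["select"])
button.attach_script("select", lambda part: f"{part.name} was selected")

doc = Document()
doc.add_part(button)

print(doc.raise_event("select"))  # ['button was selected']
print(doc.raise_event("close"))   # []
```

Because the script is attached to one part in one document, a different copy of the same button part elsewhere can behave differently, which is the fine granularity described above.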


As with all documents, OpenDoc documents need to be distributed. Because a Bento container is stored as an ordinary file in the underlying file system, distributing a document is essentially a file transfer between the document's source and destination.

The nature of an OpenDoc compound document is so general that many OpenDoc documents do not resemble traditional documents at all. Some OpenDoc documents appear to be more like application programs (e.g., a clock part). Others seem more traditional, with text in a text part and graphs in chart parts. Most documents are combinations of traditional document parts and newer "live" parts featuring audio, video, animation, etc.

When documents are distributed, only the data (or links to the data, depending on how the data is depicted in the document) travels with the document. The receiving system must provide the OpenDoc part handlers to view the document. If the receiving system does not have the exact part handler that was used to create a part, OpenDoc tries to find a part capable of handling the part's data type. OpenDoc even tries to translate the data into a type for which the system has a part handler.
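
The three-step fallback described above (exact handler, then any handler for the same data type, then translation) can be sketched as follows. This is a simplified model under stated assumptions: the function find_handler, the handler names, and the data-type strings are all hypothetical, and the real lookup performed by OpenDoc is more involved.

```python
def find_handler(data_type, preferred, installed, translators):
    """Pick a handler for a part on the receiving system.

    installed:   dict mapping handler name -> set of data types it handles
    translators: dict mapping a data type -> list of types it can become
    Returns (handler_name, data_type_used), or None if nothing fits.
    All names and structures here are illustrative assumptions.
    """
    # 1. The exact handler that created the part, if it is installed.
    if preferred in installed and data_type in installed[preferred]:
        return preferred, data_type

    # 2. Any other installed handler that understands the same data type.
    for name, types in installed.items():
        if data_type in types:
            return name, data_type

    # 3. Translate the data into a type some installed handler understands.
    for target in translators.get(data_type, []):
        for name, types in installed.items():
            if target in types:
                return name, target

    return None


installed = {
    "SimpleText": {"text/plain"},
    "PaintBox": {"image/bitmap"},
}
translators = {"text/rich": ["text/plain"]}

print(find_handler("image/bitmap", "PhotoPro", installed, translators))
# ('PaintBox', 'image/bitmap')
print(find_handler("text/rich", "FancyWriter", installed, translators))
# ('SimpleText', 'text/plain')
print(find_handler("audio/wave", "Recorder", installed, translators))
# None
```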

Documents with attached scripts carry both data and the intelligence for using that data. Together, these form the basis for distributing wisdom and autonomous agents.

When a graphical user interface (GUI) delivers an event to a window containing an OpenDoc compound document, OpenDoc supplements the GUI's event handling with two abstractions: a dispatcher and an arbitrator. The dispatcher is an object that routes the event to the correct part handler within the document. The arbitrator is an object that lets parts tell the dispatcher who owns a shared resource, such as the keyboard stream or the menu bar.

You can extend both the dispatcher and the arbitrator. At any point, you can add a new focus, called a focus module, to the arbitrator. Similarly, you can add dispatch modules to the dispatcher to handle new kinds of events and behavior.
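
A minimal Python sketch of the dispatcher and arbitrator working together is shown here. It is a deliberately simplified model, not the OpenDoc interface: the Arbitrator and Dispatcher classes and their method names are hypothetical, parts are represented as plain callables, and a dispatch module is reduced to a mapping from an event kind to the focus whose owner should receive it.

```python
class Arbitrator:
    """Hypothetical arbitrator: tracks which part owns each shared resource."""

    def __init__(self):
        self._owners = {}  # focus name -> owning part (or None)

    def add_focus(self, focus):
        # Extending the arbitrator: register a new kind of shared resource.
        self._owners.setdefault(focus, None)

    def request_focus(self, focus, part):
        if focus not in self._owners:
            raise KeyError(f"unknown focus: {focus}")
        self._owners[focus] = part

    def owner(self, focus):
        return self._owners.get(focus)


class Dispatcher:
    """Hypothetical dispatcher: routes each event to the part that owns it."""

    def __init__(self, arbitrator):
        self._arbitrator = arbitrator
        self._modules = {}  # event kind -> focus that decides the target

    def add_dispatch_module(self, event_kind, focus):
        # Extending the dispatcher: teach it a new kind of event.
        self._modules[event_kind] = focus

    def dispatch(self, event_kind, payload):
        focus = self._modules.get(event_kind)
        part = self._arbitrator.owner(focus) if focus is not None else None
        if part is None:
            return None  # nobody understands or owns this event
        return part(payload)


arbitrator = Arbitrator()
arbitrator.add_focus("keyboard")
dispatcher = Dispatcher(arbitrator)
dispatcher.add_dispatch_module("key_down", "keyboard")


def text_part(payload):
    return f"text part received {payload!r}"


arbitrator.request_focus("keyboard", text_part)
print(dispatcher.dispatch("key_down", "a"))  # text part received 'a'

# Adding a new input mode later needs only a new focus and dispatch module.
arbitrator.add_focus("speech")
dispatcher.add_dispatch_module("utterance", "speech")
print(dispatcher.dispatch("utterance", "font Helvetica"))  # None (no owner yet)
```

Keeping ownership in the arbitrator and routing in the dispatcher means a new input mode, such as speech, can be added without changing the parts that already handle the keyboard.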

Human-Computer Interaction

Together, the components described above give you a whole new experience. Today you may find yourself stuck with an application that doesn't provide exactly what you want because you cannot afford (either economically or educationally) to buy or use a completely different application.

OpenDoc components are more specialized than today's applications. One component does not do everything. If you don't have the necessary component or are unhappy with the one you already own, you can obtain an OpenDoc component with the appropriate characteristics for a smaller financial and educational investment than for a whole new application.

With the use of advanced interface technologies such as speech synthesis, handwriting and speech recognition, and some standard connection capabilities such as the Internet, you will be able to interact with the computer to produce the results you want in more natural ways than ever before. Take, for example, the production of a report. Using Web searching facilities, you can quickly obtain many references about your topic. Using speech synthesis, new HCI Web browsers can read text references to you so that you can determine their appropriateness. Once you get an adequate amount of pertinent material, you can begin producing your report. Your finished report can be composed of text, numeric data, audio clips, images, or videos--all of these data types are available on the Internet now.

Today, Internet data is not in the form of OpenDoc parts. You can, however, make OpenDoc parts from this data, as the following example shows.

Making OpenDoc Parts

The process of making OpenDoc parts from Internet data is simple enough. OpenDoc comes with standard parts for all the data types mentioned above. Therefore, for each reference that you want to include in your report, you simply copy the Internet data into a part of the appropriate type.

You can dictate the accompanying text in the desired context. You can then further edit and customize the data using combinations of pointing and spoken directives. For instance, you can change a font by selecting a passage of text with a pen swipe and saying "font Helvetica bold 12." You can draw graphs, shorten audio tracks, enhance images, and highlight colors to suit your taste. To perform all these activities, you can speak, gesture with a pen, or select from a menu with a mouse or other pointing device. Combined into a single document and arranged to present the desired message, the data takes on a polished, professional look in very little time.

What will this scenario be like as OpenDoc becomes more pervasive? Even the most sophisticated Web searcher contains little information about the requester--you. But it is not hard to imagine an OpenDoc Web searcher document that includes not only search criteria but a profile of you (perhaps obtained from observing your other uses of the system), providing a more detailed insight into your interests and intent.

You will be able to easily send such a document with a script attached out on the Web, where it can perform intelligent searches and put links back in a response document for you. The search can continue until it exhausts all possibilities, exceeds a time limit, acquires a certain amount of information, or spends a certain amount of money. With the capability of sending both data and script, the potential seems limitless. Let your mind run free--a whole new world of computing seems possible!
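
The four stopping conditions for such a search script can be sketched as a simple loop. This is purely an illustration of the idea, under stated assumptions: the function run_search is hypothetical, and each source is modeled as a made-up tuple of the links it yields, the seconds it takes, and what it costs to query.

```python
def run_search(sources, max_seconds, max_results, max_cost):
    """Collect links until one of the four stopping conditions is met.

    sources: iterable of (links, seconds_taken, cost) tuples (hypothetical).
    Returns (collected_links, reason_stopped).
    """
    links, elapsed, spent = [], 0.0, 0.0
    for found, seconds, cost in sources:
        # Stop before visiting a source that would break a limit.
        if elapsed + seconds > max_seconds:
            return links, "time limit"
        if spent + cost > max_cost:
            return links, "budget"
        elapsed += seconds
        spent += cost
        links.extend(found)
        if len(links) >= max_results:
            return links[:max_results], "enough information"
    return links, "exhausted"


sources = [
    (["a", "b"], 2, 1.0),
    (["c"], 3, 0.5),
    (["d", "e"], 4, 2.0),
]

print(run_search(sources, max_seconds=60, max_results=10, max_cost=10))
# (['a', 'b', 'c', 'd', 'e'], 'exhausted')
print(run_search(sources, max_seconds=60, max_results=3, max_cost=10))
# (['a', 'b', 'c'], 'enough information')
print(run_search(sources, max_seconds=4, max_results=10, max_cost=10))
# (['a', 'b'], 'time limit')
print(run_search(sources, max_seconds=60, max_results=10, max_cost=2.0))
# (['a', 'b', 'c'], 'budget')
```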

OpenDoc: Part of the HCI Future

This article has described some of OpenDoc's features that enhance its usefulness in the world of human-computer interaction. These features, when merged with HCI technologies, can enhance and simplify the human experience with computers. OpenDoc with HCI technologies will be the basis of many future user interface enhancements.

OpenDoc is a new, highly flexible programming model as well. New user interface metaphors are easily modeled within an OpenDoc document. The addition of new HCI technologies will fuel the search for easier and more natural interfaces between humans and computers. OpenDoc will be a part of that future.