32-Bit OS/2 Exception Management

by Monte Copeland

Under 16-bit OS/2 architecture, a process cannot handle access violations and certain other exceptions; the system invariably terminates the process. The only choice a program has is to register an exit-list function using the DosExitList API. Then, at process-termination time, OS/2 calls each of the registered exit-list functions, and they perform cleanup before the termination of the process. This approach is process-granular. It allows for cleanup, but not recovery.

Under the 32-bit OS/2 environment, the approach is thread-granular. OS/2 keeps a chain of exception handler functions for every thread. When a thread causes an exception, OS/2 walks the chain and calls each of the functions until one reports "handled". If no function handles the exception, the system takes default action. For many exceptions, the default action is process termination.

The exception management APIs are new in the 32-bit OS/2 operating system. They are available to 32-bit executables and dynamic link libraries (DLLs). OS/2 designers intend for 32-bit exception management to be hardware-independent, to be a superset of traditional 16-bit exit-list processing, to encompass 16-bit signals, and to provide thread-granular recovery of exceptions.



''Figure 1. Chain of Exception Registration Records. A pointer to the first record in the chain is stored in the thread information block (TIB) structure.''

This article describes the following exception handler scenarios:
 * A function recovers from the error and reports "handled" by returning XCPT_CONTINUE_EXECUTION. The function continues to execute.
 * A function does not handle the exception and reports "not handled" by returning XCPT_CONTINUE_SEARCH. Other handlers in the chain get a chance to handle the exception.
 * A function fails gracefully: it abandons the operation and returns a sensible error code to its caller. This approach is well suited to worker functions in EXEs and DLLs that must remain robust in spite of bad parameters or killed threads.

Adding a Handler to the Chain
Use the API DosSetExceptionHandler to insert an exception handler for the calling thread. This API performs an insert-at-head operation; therefore, the last handler inserted is the first one called at exception time. It is quite possible for one handler to serve numerous threads, but each thread must call DosSetExceptionHandler.

The OS/2 Developer's Toolkit defines an exception registration record structure called EXCEPTIONREGISTRATIONRECORD, but you can define your own. See Figure 1. (More later on why that is a good thing to do.) The absolute minimum exception registration record is a structure that contains two 32-bit pointers: a pointer to the next exception registration record in the chain and a pointer to the handler function.

''Figure 1. REGREC definition and handler function prototype''

Assign the pointer regrec.pfnHandler, then call the DosSetExceptionHandler API. The system assigns regrec.pNext. See Figure 2.

''Figure 2. Code fragment shows REGREC declaration and use.''
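The chain mechanics described so far (insert at head, then walk until a handler reports "handled") can be sketched in portable C. This is a simulation under stated assumptions, not the toolkit figures and not OS/2 code: SIMREGREC, sim_set_handler, sim_unset_handler, sim_raise, and the two sample handlers are hypothetical names, the handler signature is simplified to an integer exception number, and the return-code values are placeholders rather than the real XCPT_ constants.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for this simulation only; they are not the
   OS/2 toolkit definitions of the XCPT_ return codes. */
enum { SIM_CONTINUE_SEARCH = 0, SIM_CONTINUE_EXECUTION = 1 };

/* Minimal registration record: a next pointer and a handler pointer. */
typedef struct sim_regrec {
    struct sim_regrec *pNext;                                  /* set by sim_set_handler */
    int (*pfnHandler)(int exception, struct sim_regrec *rec);  /* set by the caller */
} SIMREGREC;

/* Stands in for the per-thread chain head that OS/2 keeps in the TIB. */
static SIMREGREC *sim_chain_head = NULL;

/* Insert at head: the last handler registered is the first one called. */
void sim_set_handler(SIMREGREC *rec)
{
    assert(rec != NULL && rec->pfnHandler != NULL);
    rec->pNext = sim_chain_head;
    sim_chain_head = rec;
}

/* Remove the record that is currently at the head of the chain. */
void sim_unset_handler(SIMREGREC *rec)
{
    assert(sim_chain_head == rec);
    sim_chain_head = rec->pNext;
}

/* Walk the chain until a handler reports "handled".
   Returns 1 if handled, 0 if the default action would apply. */
int sim_raise(int exception)
{
    SIMREGREC *rec;
    for (rec = sim_chain_head; rec != NULL; rec = rec->pNext) {
        if (rec->pfnHandler(exception, rec) == SIM_CONTINUE_EXECUTION)
            return 1;
    }
    return 0;
}

/* Trace buffer so the calling order can be inspected. */
char sim_trace[8];
int  sim_trace_len = 0;

/* A handler that never handles anything. */
int sim_handler_pass(int exception, SIMREGREC *rec)
{
    (void)exception; (void)rec;
    if (sim_trace_len < 8) sim_trace[sim_trace_len++] = 'P';
    return SIM_CONTINUE_SEARCH;
}

/* A handler that recovers only from the made-up exception number 42. */
int sim_handler_fix(int exception, SIMREGREC *rec)
{
    (void)rec;
    if (sim_trace_len < 8) sim_trace[sim_trace_len++] = 'F';
    return exception == 42 ? SIM_CONTINUE_EXECUTION : SIM_CONTINUE_SEARCH;
}
```

Registering sim_handler_fix and then sim_handler_pass puts the pass-through handler at the head of the chain, so raising exception 42 calls it first and the fixing handler second.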

Recoverable Exceptions
When an exception handler returns "handled", it has recovered from the exception, and execution resumes at the point of the exception.

One scenario involving recoverable exceptions is NPX (80387) emulation. For example, compile a program with hardware floating-point instructions, and run it on a system without a floating-point coprocessor. Executing the floating-point instruction causes OS/2 to raise a coprocessor-not-available exception.

An exception handler emulates the floating-point instruction in software. In fact, this scenario describes one of OS/2's default exception handlers. Code compiled with floating-point instructions runs under OS/2 on systems without a math coprocessor.

Another scenario involves sparse allocation of memory. In 32-bit OS/2, DosAllocMem allocates memory in a collection of 4K pages. (The size of every DosAllocMem allocation is always rounded up to the next higher multiple of 4K.) The pages within a memory allocation can have different attributes: notable ones are committed and invalid. The DosSetMem API lets you commit individual pages within a memory allocation.
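The rounding rule is simple arithmetic. As a small illustration (the helper names round_up_to_page and pages_needed and the constant SIM_PAGE_SIZE are made up for this sketch and are not part of any OS/2 header):

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PAGE_SIZE 4096u   /* 4K page, as described in the text */

/* Round a requested size up to the next multiple of the page size. */
uint32_t round_up_to_page(uint32_t size)
{
    /* Guard against wrap-around in the addition below. */
    assert(size <= UINT32_MAX - (SIM_PAGE_SIZE - 1u));
    return (size + SIM_PAGE_SIZE - 1u) & ~(SIM_PAGE_SIZE - 1u);
}

/* Number of 4K pages that a request of the given size occupies. */
uint32_t pages_needed(uint32_t size)
{
    return round_up_to_page(size) / SIM_PAGE_SIZE;
}
```

For example, a request for 10000 bytes rounds up to 12288 bytes, which is three pages, and each of those pages can be committed separately.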

Sample Program 1 uses the DosSetMem API in an exception handler to commit memory as it is referenced. The sample program allocates a memory object such that no pages are committed. Then, it writes to the memory. This causes a page fault, and the system delivers an exception to the handler. The handler commits the memory, returns handled, and the system restarts the instruction.

Sample Program 1: sparse.c
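The following is not sparse.c. It is a portable simulation of the same commit-and-restart pattern, offered as a rough sketch: a per-page "committed" flag stands in for page attributes, a failed check stands in for the page fault, and setting the flag stands in for the DosSetMem call that a real handler would make. SIMOBJECT, sim_alloc_uncommitted, sim_commit_on_fault, and sim_write are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE  4096u
#define SIM_PAGES 8u

/* A simulated memory object with a committed flag for each page. */
typedef struct {
    unsigned char bytes[SIM_PAGE * SIM_PAGES];
    unsigned char committed[SIM_PAGES];
    unsigned      faults;   /* how many simulated page faults occurred */
} SIMOBJECT;

/* "Allocate" the object with no pages committed. */
void sim_alloc_uncommitted(SIMOBJECT *obj)
{
    assert(obj != NULL);
    memset(obj, 0, sizeof *obj);
}

/* Simulated handler: commit the page that contains the faulting offset.
   Returns 1 for "handled", 0 for "not handled". */
int sim_commit_on_fault(SIMOBJECT *obj, size_t offset)
{
    if (offset >= sizeof obj->bytes)
        return 0;                       /* not inside this object */
    obj->committed[offset / SIM_PAGE] = 1;
    return 1;
}

/* Write one byte. A touch on an uncommitted page "faults", the handler
   runs, and the write is restarted. Returns 0 on success, or -1 when
   the fault is not handled. */
int sim_write(SIMOBJECT *obj, size_t offset, unsigned char value)
{
    assert(obj != NULL);
    for (;;) {
        if (offset < sizeof obj->bytes && obj->committed[offset / SIM_PAGE]) {
            obj->bytes[offset] = value;
            return 0;
        }
        obj->faults++;                  /* simulated page fault */
        if (!sim_commit_on_fault(obj, offset))
            return -1;                  /* default action would apply */
        /* "handled": loop around and restart the write */
    }
}
```

Only the first write to a page takes the fault path. Later writes to the same page find it committed and proceed directly, which is the point of committing on demand.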

Graceful Failure - When Good Threads Go Bad
Some exceptions are not so easy to restart. Can an exception handler fix a bad pointer during a general protection fault? Probably not. Should an exception handler choose a new divisor after division by zero? No. The operation must fail - but gracefully.

Graceful failure is important to APIs. API worker functions must return sensible, failing result codes to the caller in error situations.

Worker functions use an exception handler like a safety net. If a thread goes bad while executing a function, the safety net is there to catch it. For the net to be in place, the worker function registers a handler at function entry and removes it at function exit. The overhead is small, and it is worth the robustness gained.

Getting There from Here
In Sample Program 1, OS/2 lifts the thread from the point of the exception, makes it call the exception handler, then drops it back on the faulting instruction. This is no good for graceful failure. Yes, it is desirable to jump back to the worker function, but not at the point of the exception!

Instead, the thread must jump from the exception handler function to a known point in the worker function. This is an interfunctional GOTO. Debates still rage about GOTO, but most programmers accept it when it comes to exception management.

Code interfunctional GOTOs in C using setjmp and longjmp. Use setjmp to record the state of the thread at the beginning of the worker function. Later, from the exception handler function, use longjmp to return the thread to the saved state. State information is stored in a variable of type jmp_buf.

The exception handler function must have addressability to the jmp_buf to use it on the call to longjmp. The stack frame of the worker function is the ideal place to hold the jmp_buf and the exception registration record. Also, a pointer to the exception registration record is one of the parameters to the exception handler function. Therefore, the way for an exception handler function to get the address of a jmp_buf is to put a jmp_buf at the end of the exception registration record. See Figure 3.

''Figure 3. Extended REGREC definition''
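The extended-record technique can be sketched in portable C. This is neither Figure 3 nor worker.c, only an illustration under stated assumptions. Standard C cannot trap a bad pointer, so the worker raises a simulated exception explicitly where OS/2 would deliver a hardware trap. SIMREGREC, SIMREGREC_EX, sim_raise, sim_worker_handler, sim_worker, and the SIM_ result codes are all hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal record: next pointer and handler pointer. */
typedef struct sim_regrec {
    struct sim_regrec *pNext;
    int (*pfnHandler)(int exception, struct sim_regrec *rec);
} SIMREGREC;

/* Extended record: the minimal record first, then the jmp_buf. */
typedef struct {
    SIMREGREC base;        /* must be the first member */
    jmp_buf   jmpWorker;   /* state saved by the worker function */
} SIMREGREC_EX;

enum { SIM_OK = 0, SIM_ERR_FAILED = 1 };

/* Stands in for the per-thread chain head. */
static SIMREGREC *sim_chain_head = NULL;

/* Simulated exception delivery. A handler either returns, meaning
   "not handled" in this sketch, or leaves through longjmp. */
static void sim_raise(int exception)
{
    SIMREGREC *rec;
    for (rec = sim_chain_head; rec != NULL; rec = rec->pNext)
        rec->pfnHandler(exception, rec);
    abort();   /* no handler took over: default action */
}

/* The handler receives the record pointer, recovers the extended
   record from it, and jumps back into the worker function. */
static int sim_worker_handler(int exception, SIMREGREC *rec)
{
    SIMREGREC_EX *ex = (SIMREGREC_EX *)rec;
    (void)exception;
    longjmp(ex->jmpWorker, 1);
    return 0;   /* not reached */
}

/* Worker: divides *numerator by divisor and always returns a result code. */
int sim_worker(const int *numerator, int divisor, int *result)
{
    SIMREGREC_EX rec;
    int rc;

    rec.base.pfnHandler = sim_worker_handler;   /* register at entry */
    rec.base.pNext = sim_chain_head;
    sim_chain_head = &rec.base;

    if (setjmp(rec.jmpWorker) == 0) {
        /* Explicit checks stand in for the traps the hardware would raise. */
        if (numerator == NULL || result == NULL || divisor == 0)
            sim_raise(1);
        *result = *numerator / divisor;
        rc = SIM_OK;
    } else {
        rc = SIM_ERR_FAILED;   /* arrived here through longjmp */
    }

    assert(sim_chain_head == &rec.base);
    sim_chain_head = rec.base.pNext;            /* remove at exit */
    return rc;
}
```

Because both the record and the jmp_buf live in the worker's stack frame, every call gets its own recovery point, and the handler needs no global state to find it. Whichever path is taken, the worker removes its handler before returning.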

Sample Program 2 consists of the main function, a worker function, and an exception handler function. It shows how the worker function always returns a sensible result code in spite of bad parameters.

Sample Program 2: worker.c

Notes about Sample Program 2:
 * The Killed Thread: The code in Sample Program 2 shows how to handle the killed thread case. Even though there are no killed threads in Sample Program 2, the technique is critical to exported worker functions in DLLs where the client process may use DosKillThread with abandon.
 * Nested Exceptions: A nested exception is one that occurs in an exception handler. To detect nested exceptions, OS/2 inserts a handler of its own at the head of the chain at exception time, before it invokes the other handlers on the chain. The IBM C Set/2 implementation of longjmp correctly unwinds the system's nested exception handler.
 * Sparse Allocations in OS/2: When there is no COMMIT option on the MEMMAN statement in CONFIG.SYS, OS/2 handles every memory allocation in a sparse manner similar to Sample Program 1. This technique is called lazy commit. When the COMMIT option is present on MEMMAN, commits are never deferred.

Future Considerations
Rest assured that this exception management strategy is portable to future versions of OS/2. It uses 32-bit APIs, ANSI C runtime routines, and no assembler code.