32-Bit OS/2 Exception Management

From EDM2

by Monte Copeland

Under 16-bit OS/2 architecture, a process cannot handle access violations and certain other exceptions; the system invariably terminates the process. The only choice a program has is to register an exit-list function using the DosExitList() API. Then, at process-termination time, OS/2 calls each of the registered exit-list functions, and they perform cleanup before the termination of the process. This approach is process-granular. It allows for cleanup, but not recovery.

Under the 32-bit OS/2 environment, the approach is thread-granular. OS/2 keeps a chain of exception handler functions for every thread. When a thread causes an exception, OS/2 walks the chain and calls each of the functions until one reports "handled". If no function handles the exception, the system takes default action. For many exceptions, the default action is process termination.

The exception management APIs are new in the 32-bit OS/2 operating system. They are available to 32-bit executables and dynamic link libraries (DLLs). OS/2 designers intend for 32-bit exception management to be hardware-independent, to be a superset of traditional 16-bit exit-list processing, to encompass 16-bit signals, and to provide thread-granular recovery of exceptions.

Figure 1. Chain of Exception Registration Records. A pointer to the first record in the chain is stored in the thread information block (TIB) structure.

This article describes the following exception handler scenarios:

  • A function recovers from the error and reports "handled" by returning XCPT_CONTINUE_EXECUTION. The function continues to execute.
  • A function does not handle the exception and reports "not handled" by returning XCPT_CONTINUE_SEARCH. Other handlers in the chain get a chance to handle the exception.
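  • A function fails gracefully: the handler returns control to a known point in the function, which cleans up and returns a failing result code to its caller. This approach is well suited to worker functions in EXEs and DLLs that must remain robust in spite of bad parameters or killed threads.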

Adding a Handler to the Chain

Use the API DosSetExceptionHandler() to insert an exception handler for the calling thread. This API performs an insert-at-head operation; therefore, the last handler inserted is the first one called at exception time. It is quite possible for one handler to serve numerous threads, but each thread must call DosSetExceptionHandler().
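The insert-at-head behavior and the chain walk can be modeled in portable C without any OS/2 calls. The following is only a simulation under stated assumptions: every name in it (sim_record, sim_set_handler, sim_unset_handler, sim_raise, and the SIM_ return codes) is hypothetical and is not part of the OS/2 API, and a single global chain head stands in for the per-thread pointer that OS/2 keeps in the TIB.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the handler return codes; not the OS/2 values. */
enum { SIM_CONTINUE_SEARCH = 0, SIM_CONTINUE_EXECUTION = 1 };

typedef struct sim_record sim_record;
typedef int (*sim_handler)(unsigned long exception, sim_record *self);

struct sim_record {
    sim_record *next;     /* filled in by sim_set_handler */
    sim_handler handler;  /* filled in by the caller */
    int         calls;    /* how many times this handler has run */
};

/* The real system keeps one chain per thread; this model keeps just one. */
static sim_record *sim_chain_head = NULL;

/* Insert-at-head: the newest record becomes the first one consulted. */
void sim_set_handler(sim_record *rec)
{
    assert(rec != NULL && rec->handler != NULL);
    rec->next = sim_chain_head;
    sim_chain_head = rec;
}

/* Remove a record; this sketch only supports removing the current head. */
void sim_unset_handler(sim_record *rec)
{
    assert(rec == sim_chain_head);
    sim_chain_head = rec->next;
}

/* Walk the chain until a handler reports "handled".
   Returns 1 if handled, 0 if the default action would apply. */
int sim_raise(unsigned long exception)
{
    sim_record *rec;
    for (rec = sim_chain_head; rec != NULL; rec = rec->next) {
        rec->calls++;
        if (rec->handler(exception, rec) == SIM_CONTINUE_EXECUTION)
            return 1;
    }
    return 0;
}

/* Sample handler that declines every exception. */
int sim_decline(unsigned long exception, sim_record *self)
{
    (void)exception;
    (void)self;
    return SIM_CONTINUE_SEARCH;
}

/* Sample handler that handles exception code 7 only. */
int sim_handle_seven(unsigned long exception, sim_record *self)
{
    (void)self;
    return exception == 7 ? SIM_CONTINUE_EXECUTION : SIM_CONTINUE_SEARCH;
}
```

If sim_handle_seven is registered first and sim_decline second, raising code 7 calls sim_decline first and then sim_handle_seven, which mirrors the last-in, first-called order described above.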

The OS/2 Developer's Toolkit defines an exception registration record structure called EXCEPTIONREGISTRATIONRECORD, but you can define your own. See Figure 1. (More later on why that is a good thing to do.) The absolute minimum exception registration record is a structure that contains two 32-bit pointers: a pointer to the next exception registration record in the chain and a pointer to the handler function.

// Bare-bones exception registration record
// See also \toolkt20\c\os2h\bsexcpt.h
typedef struct _regrec {
        PVOID   pNext;
        PFN     pfnHandler;
} REGREC;
typedef REGREC *PREGREC;

// A prototype for an exception handler function
ULONG _System HandlerFunction( PEXCEPTIONREPORTRECORD           p1,
                               PREGREC                          p2,
                               PCONTEXTRECORD                   p3,
                               PVOID                            p4 );

Figure 1. REGREC definition and handler function prototype

Assign the pointer regrec.pfnHandler, then call the DosSetExceptionHandler() API. The system assigns regrec.pNext. See Figure 2.

REGREC regrec;
...
regrec.pfnHandler = (PFN)HandlerFunction;
rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD)&regrec );
assert( 0 == rc );

Figure 2. Code fragment shows REGREC declaration and use.

Recoverable Exceptions

When an exception handler returns handled, the handler has recovered from the exception, and execution resumes at the point of the exception.

One scenario involving recoverable exceptions is NPX (80387) emulation. For example, compile a program with hardware floating-point instructions, and run it on a system without a floating-point coprocessor. Executing the floating-point instruction causes OS/2 to raise a coprocessor-not-available exception.

An exception handler emulates the floating-point instruction in software. In fact, this scenario describes one of OS/2's default exception handlers. Code compiled with floating-point instructions runs under OS/2 on systems without a math coprocessor.

Another scenario involves sparse allocation of memory. In 32-bit OS/2, DosAllocMem() allocates memory in a collection of 4K pages. (The size of every DosAllocMem allocation is always rounded up to the next higher multiple of 4K.) The pages within a memory allocation can have different attributes: notable ones are committed and invalid. The DosSetMem() API lets you commit individual pages within a memory allocation.
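The rounding rule is plain integer arithmetic. As a minimal sketch, assuming the 4K page size stated above, the helpers below compute the rounded size and the page count of an allocation. The names SKETCH_PAGE_SIZE, round_up_to_page, and pages_in_allocation are hypothetical and do not come from the toolkit headers; the code also ignores overflow for sizes within one page of the largest unsigned long.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL   /* 4K page size, as stated in the text */

/* Round a requested byte count up to the next multiple of the page size.
   The mask trick works because the page size is a power of two. */
unsigned long round_up_to_page(unsigned long cb)
{
    return (cb + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Number of pages that an allocation of cb bytes occupies. */
unsigned long pages_in_allocation(unsigned long cb)
{
    return round_up_to_page(cb) / SKETCH_PAGE_SIZE;
}
```

For example, a request for 4097 bytes rounds up to 8192 bytes (two pages), and a one-megabyte request of 1048576 bytes is already a multiple of 4K and spans 256 pages.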

Sample Program 1 uses the DosSetMem() API in an exception handler to commit memory as it is referenced. The sample program allocates a memory object such that no pages are committed. Then, it writes to the memory. This causes a page fault, and the system delivers an exception to the handler. The handler commits the memory, returns handled, and the system restarts the instruction.

/* SPARSE.C.  This program allocates a one MB memory object but commits no pages.  
   The program then writes to that memory which is invalid, and this causes a trap.  
   The handler commits the invalid page and resumes execution.
   Compile and link this program with:  icc /Ss sparse.c                           */

// os2 includes
#define INCL_DOS
#define INCL_ERRORS
#include <os2.h>
// c includes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

// Exception handler registration record
typedef struct _regrec {
        PVOID pNext;
        PFN   pfnHandler;
        } REGREC;
typedef REGREC *PREGREC;

// ----------------------------------------------------------------------
ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                       PREGREC p2,
                       PCONTEXTRECORD p3,
                       PVOID pv )
{
// Interested in access violation
 if( p1->ExceptionNum == XCPT_ACCESS_VIOLATION  ) {
   assert( p1->ExceptionInfo[0] == XCPT_WRITE_ACCESS );
   // Try to commit the referenced page
   if( 0 == DosSetMem( (PVOID)p1->ExceptionInfo[1], 1, PAG_COMMIT|PAG_WRITE )) {
     // Successful commit; resume execution
     return XCPT_CONTINUE_EXECUTION;
   }
 }
 // Not handled, let other handlers in the chain have the exception
 return XCPT_CONTINUE_SEARCH;
}

// ----------------------------------------------------------------------
int main ( void )
{
 APIRET      rc;
 PCHAR       pchar;
 PSZ         psz;
 PVOID       pvBase;
 REGREC      regrec;

 // Insert exception handler into the chain of handlers for this thread
 regrec.pfnHandler = (PFN)Handler;
 rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
 assert( rc == 0 );

 // Allocate a memory object without committing any of it;
 // Note lack of PAG_COMMIT flag
 rc = DosAllocMem(  &pvBase, 1048576, PAG_WRITE );
 assert( rc == 0 );

 // This causes an exception since the page is not committed
 pchar = (PCHAR)pvBase;
 *pchar = 'a';

 // This string copy causes two more exceptions
 psz = (PSZ)pvBase + (4096 + 4092);
 strcpy( psz, "This string crosses a 4K page boundary." );

 // Reference the memory
 printf( "%c\n", *pchar );
 printf( "%s\n", psz );

 // Free memory object
 rc = DosFreeMem( pvBase );
 assert( rc == 0 );

 // Unlink handler before returning
 rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
 assert( rc == 0 );

 return 0;
}

Sample Program 1: sparse.c
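The page arithmetic behind the three exceptions in Sample Program 1 can be checked with a small portable model. This is a sketch, not OS/2 code: it assumes that each first touch of an uncommitted page raises exactly one exception and that the handler then commits that page. All names (sim_committed, sim_reset, sim_write, sim_faults) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SIM_PAGE_SIZE   4096UL
#define SIM_PAGE_COUNT  256UL      /* 1 MB object divided into 4K pages */

static unsigned char sim_committed[SIM_PAGE_COUNT];  /* 1 = page committed */
static unsigned long sim_fault_count;

/* Reset the model: no page committed, no faults seen. */
void sim_reset(void)
{
    memset(sim_committed, 0, sizeof sim_committed);
    sim_fault_count = 0;
}

/* Model a write of cb bytes at byte offset off within the memory object.
   The first touch of an uncommitted page counts as one fault, after which
   the page is marked committed (the job the exception handler performs). */
void sim_write(unsigned long off, unsigned long cb)
{
    unsigned long first, last, page;

    assert(cb > 0 && off + cb <= SIM_PAGE_SIZE * SIM_PAGE_COUNT);
    first = off / SIM_PAGE_SIZE;
    last  = (off + cb - 1) / SIM_PAGE_SIZE;
    for (page = first; page <= last; page++) {
        if (!sim_committed[page]) {
            sim_fault_count++;
            sim_committed[page] = 1;
        }
    }
}

/* Number of faults recorded since the last reset. */
unsigned long sim_faults(void)
{
    return sim_fault_count;
}
```

In this model, the one-byte write at offset 0 touches the first page, and the 40-byte string copy (39 characters plus the terminating NUL) at offset 4096 + 4092 straddles the second and third pages, for three faults in total. Later reads of the same addresses cause none.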

Graceful Failure - When Good Threads Go Bad

Some exceptions are not so easy to restart. Can an exception handler fix a bad pointer during a general protection fault? Probably not. Should an exception handler choose a new divisor after division by zero? No. The operation must fail - but gracefully.

Graceful failure is important to APIs. API worker functions must return sensible, failing result codes to the caller in error situations.

Worker functions use an exception handler like a safety net. If a thread goes bad while executing a function, the safety net is there to catch it. For the net to be in place, the worker function registers a handler at function entry and removes it at function exit. The overhead is small, and it is worth the robustness gained.

Getting There from Here

In Sample Program 1, OS/2 lifts the thread from the point of the exception, makes it call the exception handler, then drops it back on the faulting instruction. This is no good for graceful failure. Yes, it is desirable to jump back to the worker function, but not at the point of the exception!

Instead, the thread must jump from the exception handler function to a known point in the worker function. This is an interfunctional GOTO. Debates still rage about GOTO, but most programmers accept it when it comes to exception management.

In C, code an interfunctional GOTO with setjmp() and longjmp(). Use setjmp() to record the state of the thread at the beginning of the worker function. Later, from the exception handler function, use longjmp() to return the thread to the saved state. State information is stored in a variable of type jmp_buf.

The exception handler function must have addressability to the jmp_buf to use it on the call to longjmp(). The stack frame of the worker function is the ideal place to hold the jmp_buf and the exception registration record. Also, a pointer to the exception registration record is one of the parameters to the exception handler function. Therefore, the way for an exception handler function to get the address of a jmp_buf is to put a jmp_buf at the end of the exception registration record. See Figure 3.

// User-extended exception registration record
typedef struct _regrec {
        PVOID           pNext;
        PFN             pfnHandler;
        jmp_buf         jmpWorker;
} REGREC;
typedef REGREC *PREGREC;

Figure 3. Extended REGREC definition
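The idea of reaching a jmp_buf through the registration record pointer can be tried in portable C. The sketch below makes several assumptions: the exception is simulated by an explicit call to a hypothetical deliver() function rather than raised by the system, the base fields are nested in a sub-structure instead of being declared flat as in Figure 3, and all names (base_rec, ext_rec, jump_handler, deliver, safe_divide) are illustrative only.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Base record: the only layout the dispatcher knows about. */
typedef struct base_rec {
    struct base_rec *next;
    void (*handler)(int code, struct base_rec *rec);
} base_rec;

/* Extended record: base fields first, private data after them. */
typedef struct ext_rec {
    base_rec base;
    jmp_buf  jmp_worker;
} ext_rec;

/* The handler receives only a base pointer. Because base is the first
   member of ext_rec, casting back is valid and exposes the jmp_buf. */
static void jump_handler(int code, base_rec *rec)
{
    ext_rec *ext = (ext_rec *)rec;
    longjmp(ext->jmp_worker, code);   /* code must be nonzero */
}

/* Stand-in for the system delivering an exception to a record. */
static void deliver(base_rec *rec, int code)
{
    rec->handler(code, rec);
}

/* Worker: returns 1 and stores the quotient on success, 0 on failure. */
int safe_divide(int num, int den, int *quotient)
{
    ext_rec rec;

    rec.base.next = NULL;
    rec.base.handler = jump_handler;

    if (setjmp(rec.jmp_worker) != 0) {
        /* Arrived here from jump_handler: fail gracefully. */
        return 0;
    }
    if (quotient == NULL)
        deliver(&rec.base, 1);        /* simulated access violation */
    if (den == 0)
        deliver(&rec.base, 2);        /* simulated divide by zero */

    *quotient = num / den;
    return 1;
}
```

With this sketch, safe_divide(10, 2, &q) returns 1 and stores 5 in q, while a zero divisor or a NULL output pointer makes it return 0 without touching the caller's data. No local variable is modified between setjmp() and longjmp(), which avoids the need for volatile qualifiers.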

Sample Program 2 consists of the main() function, a worker function, and an exception handler function. It shows how the worker function always returns a sensible result code in spite of bad parameters.

/* WORKER.C.  This program shows how a worker function can use an exception */
/* handler  like a safety net for calling threads. Compile and link this    */
/* program with:  icc /Ss worker.c                                          */

// os2 includes
#define INCL_DOS
#define INCL_ERRORS
#include <os2.h>

// c includes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <setjmp.h>
#include <assert.h>

// User-extended exception registration record
typedef struct _regrec {
 PVOID         pNext;
 PFN           pfnHandler;
 jmp_buf       jmpWorker;
} REGREC;
typedef REGREC *PREGREC;

// ----------------------------------------------------------------------
ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                       PREGREC p2,
                       PCONTEXTRECORD p3,
                       PVOID pv )
{
 switch( p1->ExceptionNum ) {
 case XCPT_ACCESS_VIOLATION:
 case XCPT_INTEGER_DIVIDE_BY_ZERO:
 case XCPT_INTEGER_OVERFLOW:
 case XCPT_PROCESS_TERMINATE:           // Killed thread case
 case XCPT_ASYNC_PROCESS_TERMINATE:     // Killed thread case
   // Interested in this one
   longjmp( p2->jmpWorker, p1->ExceptionNum );
 default:
   break;
 }
 // Not handled
 return XCPT_CONTINUE_SEARCH;
}


// ----------------------------------------------------------------------
// Returns TRUE for success, FALSE for failure
LONG _System WorkerFunction( PCHAR pch )
{
 LONG        rc;
 LONG        rcResult;
 ULONG       ulException;
 REGREC      regrec;

 // Set a handler
 regrec.pfnHandler = (PFN)Handler;
 rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
 assert( rc == 0 );

 // Store a known thread state
 ulException = setjmp( regrec.jmpWorker );

 if( ulException ) {

   // Clean up here: free memory allocations, release mutex sems, etc.

   // Get the handler off the chain
   rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
   assert( rc == 0 );

   // Check for the killed-thread case
   switch( ulException ) {
   case XCPT_PROCESS_TERMINATE:
   case XCPT_ASYNC_PROCESS_TERMINATE:
     // Clean up done above and thread really wants to die
     DosExit( EXIT_THREAD, 0 );
     break;
   }
   // Set a failing result code
   rcResult = FALSE;
   goto depart;
 }

 // Dereference the supplied pointer
 *pch = 'a';

 rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
 assert( rc == 0 );

 rcResult = TRUE;

depart:
 return rcResult;
}

// ----------------------------------------------------------------------
int main ( void )
{
 CHAR     szWork[ 16 ];
 LONG     rc;

 // Try worker function with a good pointer
 rc = WorkerFunction( szWork );
 printf( "Good pointer returns %ld\n", rc );

 // Try worker function with a bad pointer
 rc = WorkerFunction( NULL );
 printf( "Bad pointer returns %ld\n", rc );

 return 0;
}

Sample Program 2: worker.c

Notes about Sample Program 2:

  • The Killed Thread: The code in Sample Program 2 shows how to handle the killed thread case. Even though there are no killed threads in Sample Program 2, the technique is critical to exported worker functions in DLLs where the client process may use DosKillThread with abandon.
  • Nested Exceptions: At exception time, OS/2 inserts a handler at the head of the chain before it invokes the remaining handlers on the chain in order to detect nested exceptions. (A nested exception is one that occurs in an exception handler.) The IBM C Set/2 implementation of longjmp() correctly unwinds the system's nested exception handler.
  • Sparse Allocations in OS/2: When there is no COMMIT option on the MEMMAN statement in CONFIG.SYS, OS/2 handles every memory allocation in a sparse manner similar to Sample Program 1. This technique is called lazy commit. When the COMMIT option is present on MEMMAN, commits are never deferred.

Future Considerations

Rest assured that this exception management strategy is portable to future versions of OS/2. It uses 32-bit APIs, ANSI C runtime routines, and no assembler code.

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation