32-Bit OS/2 Exception Management
Under the 16-bit OS/2 architecture, a process cannot handle access violations and certain other exceptions; the system invariably terminates the process. The only choice a program has is to register an exit-list function using the DosExitList() API. Then, at process-termination time, OS/2 calls each of the registered exit-list functions, and they perform cleanup before the process terminates. This approach is process-granular. It allows for cleanup, but not recovery.
Under the 32-bit OS/2 environment, the approach is thread-granular. OS/2 keeps a chain of exception handler functions for every thread. When a thread causes an exception, OS/2 walks the chain and calls each of the functions until one reports "handled". If no function handles the exception, the system takes default action. For many exceptions, the default action is process termination.
The exception management APIs are new in the 32-bit OS/2 operating system. They are available to 32-bit executables and dynamic link libraries (DLLs). OS/2 designers intend for 32-bit exception management to be hardware-independent, to be a superset of traditional 16-bit exit-list processing, to encompass 16-bit signals, and to provide thread-granular recovery of exceptions.
Figure 1. Chain of Exception Registration Records. A pointer to the first record in the chain is stored in the thread information block (TIB) structure.
This article describes the following exception handler scenarios:
- A function recovers from the error and reports "handled" by returning XCPT_CONTINUE_EXECUTION. The function continues to execute.
- A function does not handle the exception and reports "not handled" by returning XCPT_CONTINUE_SEARCH. Other handlers in the chain get a chance to handle the exception.
- A function fails gracefully: the handler returns the thread to a known point in the function, which then returns a failing result code. This approach is well suited to worker functions in EXEs and DLLs that must remain robust in spite of bad parameters or killed threads.
Adding a Handler to the Chain
Use the API DosSetExceptionHandler() to insert an exception handler for the calling thread. This API performs an insert-at-head operation; therefore, the last handler inserted is the first one called at exception time. It is quite possible for one handler to serve numerous threads, but each thread must call DosSetExceptionHandler().
The OS/2 Developer's Toolkit defines an exception registration record structure called EXCEPTIONREGISTRATIONRECORD, but you can define your own. See Figure 1. (More later on why that is a good thing to do.) The absolute minimum exception registration record is a structure that contains two 32-bit pointers: a pointer to the next exception registration record in the chain and a pointer to the handler function.

    // Bare-bones exception registration record
    // See also \toolkt20\c\os2h\bsexcpt.h
    typedef struct _regrec {
      PVOID pNext;
      PFN   pfnHandler;
    } REGREC;
    typedef REGREC *PREGREC;

    // A prototype for an exception handler function
    ULONG _System HandlerFunction( PEXCEPTIONREPORTRECORD p1,
                                   PREGREC p2,
                                   PCONTEXTRECORD p3,
                                   PVOID p4 );

Figure 1. REGREC definition and handler function prototype
Assign the pointer regrec.pfnHandler, then call the DosSetExceptionHandler() API. The system assigns regrec.pNext. See Figure 2.

    REGREC regrec;
    ...
    regrec.pfnHandler = (PFN)HandlerFunction;
    rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD)&regrec );
    assert( 0 == rc );

Figure 2. Code fragment shows REGREC declaration and use.
Recoverable Exceptions
When an exception handler returns handled, the handler has recovered from the exception, and execution resumes at the point of the exception.
One scenario involving recoverable exceptions is NPX (80387) emulation. For example, compile a program with hardware floating-point instructions, and run it on a system without a floating-point coprocessor. Executing the floating-point instruction causes OS/2 to raise a coprocessor-not-available exception.
An exception handler emulates the floating-point instruction in software. In fact, this scenario describes one of OS/2's default exception handlers. Code compiled with floating-point instructions runs under OS/2 on systems without a math coprocessor.
Another scenario involves sparse allocation of memory. In 32-bit OS/2, DosAllocMem() allocates memory in a collection of 4K pages. (The size of every DosAllocMem allocation is always rounded up to the next higher multiple of 4K.) The pages within a memory allocation can have different attributes: notable ones are committed and invalid. The DosSetMem() API lets you commit individual pages within a memory allocation.
Sample Program 1 uses the DosSetMem() API in an exception handler to commit memory as it is referenced. The sample program allocates a memory object such that no pages are committed. Then, it writes to the memory. This causes a page fault, and the system delivers an exception to the handler. The handler commits the memory, returns handled, and the system restarts the instruction.
    /* SPARSE.C. This program allocates a one MB memory object but commits
       no pages. The program then writes to that memory which is invalid,
       and this causes a trap. The handler commits the invalid page and
       resumes execution.
       Compile and link this program with: icc /Ss sparse.c */

    // os2 includes
    #define INCL_DOS
    #define INCL_ERRORS
    #include <os2.h>

    // c includes
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <assert.h>

    // Exception handler registration record
    typedef struct _regrec {
      PVOID pNext;
      PFN   pfnHandler;
    } REGREC;
    typedef REGREC *PREGREC;

    // ----------------------------------------------------------------------
    ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                           PREGREC p2,
                           PCONTEXTRECORD p3,
                           PVOID pv )
    {
      // Interested in access violation
      if( p1->ExceptionNum == XCPT_ACCESS_VIOLATION ) {
        assert( p1->ExceptionInfo[0] == XCPT_WRITE_ACCESS );
        // Try to commit the referenced page
        if( 0 == DosSetMem( (PVOID)p1->ExceptionInfo[1], 1, PAG_COMMIT|PAG_WRITE )) {
          // Successful commit; resume execution
          return XCPT_CONTINUE_EXECUTION;
        }
      }
      // Not handled, let other handlers in the chain have the exception
      return XCPT_CONTINUE_SEARCH;
    }

    // ----------------------------------------------------------------------
    int main ( void )
    {
      APIRET rc;
      PCHAR  pchar;
      PSZ    psz;
      PVOID  pvBase;
      REGREC regrec;

      // Insert exception handler into the chain of handlers for this thread
      regrec.pfnHandler = (PFN)Handler;
      rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
      assert( rc == 0 );

      // Allocate a memory object without committing any of it;
      // Note lack of PAG_COMMIT flag
      rc = DosAllocMem( &pvBase, 1048576, PAG_WRITE );
      assert( rc == 0 );

      // This causes an exception since the page is not committed
      pchar = (PCHAR)pvBase;
      *pchar = 'a';

      // This string copy causes two more exceptions
      psz = (PSZ)pvBase + (4096 + 4092);
      strcpy( psz, "This string crosses a 4K page boundary." );

      // Reference the memory
      printf( "%c\n", *pchar );
      printf( "%s\n", psz );

      // Free memory object
      rc = DosFreeMem( pvBase );
      assert( rc == 0 );

      // Unlink handler before returning
      rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
      assert( rc == 0 );

      return 0;
    }

Sample Program 1: sparse.c
Graceful Failure - When Good Threads Go Bad
Some exceptions are not so easy to restart. Can an exception handler fix a bad pointer during a general protection fault? Probably not. Should an exception handler choose a new divisor after division by zero? No. The operation must fail - but gracefully.
Graceful failure is important to APIs. API worker functions must return sensible, failing result codes to the caller in error situations.
Worker functions use an exception handler like a safety net. If a thread goes bad while executing a function, the safety net is there to catch it. For the net to be in place, the worker function registers a handler at function entry and removes it at function exit. The overhead is small, and it is worth the robustness gained.
Getting There from Here
In Sample Program 1, OS/2 lifts the thread from the point of the exception, makes it call the exception handler, then drops it back on the faulting instruction. This is no good for graceful failure. Yes, it is desirable to jump back to the worker function, but not at the point of the exception!
Instead, the thread must jump from the exception handler function to a known point in the worker function. This is an interfunctional GOTO. Debates still rage about GOTO, but most programmers accept it when it comes to exception management.
Code an interfunctional GOTO in C using setjmp() and longjmp(). Use setjmp() to record the state of the thread at the beginning of the worker function. Later, from the exception handler function, use longjmp() to return the thread to the saved state. State information is stored in a variable of type jmp_buf.
The exception handler function must have addressability to the jmp_buf to use it on the call to longjmp(). The stack frame of the worker function is the ideal place to hold the jmp_buf and the exception registration record. Also, a pointer to the exception registration record is one of the parameters to the exception handler function. Therefore, the way for an exception handler function to get the address of a jmp_buf is to put a jmp_buf at the end of the exception registration record. See Figure 3.

    // User-extended exception registration record
    typedef struct _regrec {
      PVOID   pNext;
      PFN     pfnHandler;
      jmp_buf jmpWorker;
    } REGREC;
    typedef REGREC *PREGREC;

Figure 3. Extended REGREC definition
Sample Program 2 consists of the main() function, a worker function, and an exception handler function. It shows how the worker function always returns a sensible result code in spite of bad parameters.
    /* WORKER.C. This program shows how a worker function can use an exception */
    /* handler like a safety net for calling threads. Compile and link this    */
    /* program with: icc /Ss worker.c                                          */

    // os2 includes
    #define INCL_DOS
    #define INCL_ERRORS
    #include <os2.h>

    // c includes
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <setjmp.h>
    #include <assert.h>

    // User-extended exception registration record
    typedef struct _regrec {
      PVOID   pNext;
      PFN     pfnHandler;
      jmp_buf jmpWorker;
    } REGREC;
    typedef REGREC *PREGREC;

    // ----------------------------------------------------------------------
    ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                           PREGREC p2,
                           PCONTEXTRECORD p3,
                           PVOID pv )
    {
      switch( p1->ExceptionNum ) {
      case XCPT_ACCESS_VIOLATION:
      case XCPT_INTEGER_DIVIDE_BY_ZERO:
      case XCPT_INTEGER_OVERFLOW:
      case XCPT_PROCESS_TERMINATE:        // Killed thread case
      case XCPT_ASYNC_PROCESS_TERMINATE:  // Killed thread case
        // Interested in this one
        longjmp( p2->jmpWorker, p1->ExceptionNum );
      default:
        break;
      }
      // Not handled
      return XCPT_CONTINUE_SEARCH;
    }

    // ----------------------------------------------------------------------
    // Returns TRUE for success, FALSE for failure
    LONG _System WorkerFunction( PCHAR pch )
    {
      LONG   rc;
      LONG   rcResult;
      ULONG  ulException;
      REGREC regrec;

      // Set a handler
      regrec.pfnHandler = (PFN)Handler;
      rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
      assert( rc == 0 );

      // Store a known thread state
      ulException = setjmp( regrec.jmpWorker );

      if( ulException ) {
        // Clean up here: free memory allocations, release mutex sems, etc.

        // Get the handler off the chain
        rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
        assert( rc == 0 );

        // Check for the killed-thread case
        switch( ulException ) {
        case XCPT_PROCESS_TERMINATE:
        case XCPT_ASYNC_PROCESS_TERMINATE:
          // Clean up done above and thread really wants to die
          DosExit( EXIT_THREAD, 0 );
          break;
        }

        // Set a failing result code
        rcResult = FALSE;
        goto depart;
      }

      // Dereference the supplied pointer
      *pch = 'a';

      rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
      assert( rc == 0 );
      rcResult = TRUE;

    depart:
      return rcResult;
    }

    // ----------------------------------------------------------------------
    int main ( void )
    {
      CHAR szWork[ 16 ];
      LONG rc;

      // Try worker function with a good pointer
      rc = WorkerFunction( szWork );
      printf( "Good pointer returns %d\n", rc );

      // Try worker function with a bad pointer
      rc = WorkerFunction( NULL );
      printf( "Bad pointer returns %d\n", rc );

      return 0;
    }

Sample Program 2: worker.c
Notes about Sample Program 2:
- The Killed Thread: The code in Sample Program 2 shows how to handle the killed thread case. Even though there are no killed threads in Sample Program 2, the technique is critical to exported worker functions in DLLs where the client process may use DosKillThread with abandon.
- Nested Exceptions: At exception time, OS/2 inserts a handler at the head of the chain before it invokes the remaining handlers on the chain in order to detect nested exceptions. (A nested exception is one that occurs in an exception handler.) The IBM C Set/2 implementation of longjmp() correctly unwinds the system's nested exception handler.
- Sparse Allocations in OS/2: When there is no COMMIT option on the MEMMAN statement in CONFIG.SYS, OS/2 handles every memory allocation in a sparse manner similar to Sample Program 1. This technique is called lazy commit. When the COMMIT option is present on MEMMAN, commits are never deferred.
Future Considerations
Rest assured that this exception management strategy is portable to future versions of OS/2. It uses 32-bit APIs, ANSI C runtime routines, and no assembler code.
Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation