
32-Bit OS/2 Exception Management: Difference between revisions

From EDM2
by [[Monte Copeland]]


Memory management in OS/2 has many faces: 32-bit, 16-bit, and real mode. This article discusses programming the 32-bit memory model of OS/2. It covers memory addressing, allocating, and heap management, as well as memory leaks and how to debug them. The DevCon for OS/2 CD-ROMs contain sample C code to illustrate these memory management concepts.  
Under 16-bit OS/2 architecture, a process cannot handle access violations and certain other exceptions; the system invariably terminates the process. The only choice a program has is to register an exit-list function using the DosExitList() API. Then, at process-termination time, OS/2 calls each of the registered exit-list functions, and they perform cleanup before the termination of the process. This approach is process-granular. It allows for cleanup, but not recovery.


==Process Address Space==
Under the 32-bit OS/2 environment, the approach is thread-granular. OS/2 keeps a chain of exception handler functions for every thread. When a thread causes an exception, OS/2 walks the chain and calls each of the functions until one reports "handled". If no function handles the exception, the system takes default action. For many exceptions, the default action is process termination.


When a process starts, OS/2 prepares a virtualized array of RAM called the process address space. Within this space, OS/2 allocates memory for .EXE and .DLL code and data. A program accesses this space with a 32-bits-wide address. The smallest address is usually 64KB, which is the base load address for .EXEs. The largest address is 512MB, the limit of OS/2 virtual address space (see Figure 1).
The exception management APIs are new in the 32-bit OS/2 operating system. They are available to 32-bit executables and dynamic link libraries (DLLs). OS/2 designers intend for 32-bit exception management to be hardware-independent, to be a superset of traditional 16-bit exit-list processing, to encompass 16-bit signals, and to provide thread-granular recovery of exceptions.


[[Image:32bitMem-fig-1.gif]]
[[Image:32bits-Exp-Fig-1.gif]]


Figure 1. Diagram of a simple .EXE loaded into memory. In this example, .EXE code and data are loaded low. Shared .DLL code and data are loaded high.
Figure 1. Chain of Exception Registration Records. A pointer to the first record in the chain is stored in the thread information block (TIB) structure.


Private memory resides at low addresses. Only the owning process can access this memory. Private allocations start low and increase upwards. On the other hand, shared memory is allocated high and works downward.
This article describes the following exception handler scenarios:


OS/2 divides memory into pages that are 4KB in size. Each process has a set of page tables that maps its virtual memory to physical RAM. Each 4KB page has attributes including read/write, read-only, private, shared, committed, and guard.  
* A function recovers from the error and reports "handled" by returning XCPT_CONTINUE_EXECUTION. The function continues to execute.
* A function does not handle the exception and reports "not handled" by returning XCPT_CONTINUE_SEARCH. Other handlers in the chain get a chance to handle the exception.
* The third option is graceful failure. This approach is nicely suited for worker functions in EXEs and DLLs that must remain robust in spite of bad parameters or killed threads.


==Stack Memory==
==Adding a Handler to the Chain==


It's a good idea to create threads with at least a 32KB stack; OS/2 only uses what it needs to run the thread. Here's how:
Use the API DosSetExceptionHandler() to insert an exception handler for the calling thread. This API performs an insert-at-head operation; therefore, the last handler inserted is the first one called at exception time. It is quite possible for one handler to serve numerous threads, but each thread must call DosSetExceptionHandler().


When OS/2 allocates memory for a thread stack, it commits the top page and sets the guard attribute on the page below it. If stack usage exceeds 4KB, a guard-page exception occurs. OS/2 handles this exception: it commits the guard page and sets the next lower page to guard. Using this scheme, OS/2 commits only the pages a thread really needs.
The OS/2 Developer's Toolkit defines an exception registration record structure called EXCEPTIONREGISTRATIONRECORD, but you can define your own. See Figure 1. (More later on why that is a good thing to do.) The absolute minimum exception registration record is a structure that contains two 32-bit pointers: a pointer to the next exception registration record in the chain and a pointer to the handler function.
<pre>
// Bare-bones exception registration record
// See also \toolkt20\c\os2h\bsexcpt.h
typedef struct _regrec {
        PVOID  pNext;
        PFN    pfnHandler;
} REGREC;
typedef REGREC *PREGREC;


A trap can occur using automatic (stack) variables larger than 4KB. For example, assume that an 8KB array spans a guard page. If the program writes element zero first, the program will trap because it skipped guard page processing. Some compilers, including IBM's, generate code to touch each page in large automatic variables. For guard page processing to work, the code must touch pages starting at high addresses and work down.
// A prototype for an exception handler function
ULONG _System HandlerFunction( PEXCEPTIONREPORTRECORD          p1,
                              PREGREC                          p2,
                              PCONTEXTRECORD                   p3,
                              PVOID                            p4 );
</pre>
'''Figure 1.''' REGREC definition and handler function prototype


==VisualAge C++ Compiler Data Pragma==
Assign the pointer regrec.pfnHandler then call the DosSetExceptionHandler() API. The system assigns regrec.pNext. See Figure 2.
<pre>
REGREC regrec;
. . .
regrec.pfnHandler = (PFN)HandlerFunction;
rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD)&regrec );
assert( 0 == rc );
</pre>
'''Figure 2.''' Code fragment shows REGREC declaration and use.


DosAllocMem and DosAllocSharedMem are not the only ways to allocate memory. The compiler, linker, and loader do a great deal to allocate memory and initialize it. The creator of a .DLL must decide where data belongs: shared versus private memory. The VisualAge C++ compiler has a data_seg pragma that takes a single argument--the name of a memory segment defined in the module definitions (.DEF) file.
==Recoverable Exceptions==


In Sample Code 1, the compiler ensures that the char array szBuffer resides in a memory segment named PIECE_1.
When an exception handler returns handled, the handler has recovered from the exception, and execution resumes at the point of the exception.


#pragma data_seg( PIECE_1 )
One scenario involving recoverable exceptions is NPX (80387) emulation. For example, compile a program with hardware floating-point instructions, and run it on a system without a floating-point coprocessor. Executing the floating-point instruction causes OS/2 to raise a coprocessor-not-available exception.
char szBuffer[ 256 ];


Sample Code 1. Excerpt from a C program that uses the data_seg pragma to place the static variable szBuffer onto memory segment PIECE_1.
An exception handler emulates the floating-point instruction in software. In fact, this scenario describes one of OS/2's default exception handlers. Code compiled with floating-point instructions runs under OS/2 on systems without a math coprocessor.


In the .DEF file, the segment PIECE_1 is a shared segment as follows:
Another scenario involves sparse allocation of memory. In 32-bit OS/2, DosAllocMem() allocates memory in a collection of 4K pages. (The size of every DosAllocMem allocation is always rounded up to the next higher multiple of 4K.) The pages within a memory allocation can have different attributes: notable ones are committed and invalid. The DosSetMem() API lets you commit individual pages within a memory allocation.


  SEGMENTS
Sample Program 1 uses the DosSetMem() API in an exception handler to commit memory as it is referenced. The sample program allocates a memory object such that no pages are committed. Then, it writes to the memory. This causes a page fault, and the system delivers an exception to the handler. The handler commits the memory, returns handled, and the system restarts the instruction.
  PIECE_1 CLASS 'DATA' SHARED
<PRE>
  PIECE_2 CLASS 'DATA' NONSHARED
/* SPARSE.C. This program allocates a one MB memory object but commits no    */
/* pages. The program then writes to that memory, which is invalid, and this */
/* causes a trap. The handler commits the invalid page and resumes execution.*/
/* Compile and link this program with: icc /Ss sparse.c                      */


Sample Code 2. Excerpt from a .DEF file that makes the PIECE_1 segment a shared memory object.
// os2 includes
#define INCL_DOS
#define INCL_ERRORS
#include <os2.h>


Programmers coding a .DLL use data_seg() pragmas and .DEF file SEGMENTS statements to control which variables are private per process and which are shared. Reference data and read-only data usually need a single copy in memory, so you should place these in shared memory.
// c includes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>


This technique is not limited to .DLLs. Programmers who expect to have multiple copies of the same .EXE running simultaneously can do this, too. See \SOURCE\DEVNEWS\VOL10\MEM\PRAG.C on disc 1 of your accompanying DevCon for OS/2 CD-ROMs for more detail on implementing this approach.
// Exception handler registration record
typedef struct _regrec {
  PVOID pNext;
  PFN  pfnHandler;
} REGREC;
typedef REGREC *PREGREC;


==Managing Contention for Shared Memory==
// ----------------------------------------------------------------------
ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                      PREGREC p2,
                      PCONTEXTRECORD p3,
                      PVOID pv )
{
  // Interested in access violation
  if( p1->ExceptionNum == XCPT_ACCESS_VIOLATION  ) {
    assert( p1->ExceptionInfo[0] == XCPT_WRITE_ACCESS );
    // Try to commit the referenced page
    if( 0 == DosSetMem( (PVOID)p1->ExceptionInfo[1], 1, PAG_COMMIT|PAG_WRITE )) {
      // Successful commit; resume execution
      return XCPT_CONTINUE_EXECUTION;
    }
  }
  // Not handled, let other handlers in the chain have the exception
  return XCPT_CONTINUE_SEARCH;
}


Whenever multiple processes read and write shared memory, you must manage the contention. This is best done with a named (therefore, shared) mutual exclusion semaphore. Under OS/2, anonymous semaphores are not suited for this task. Here's why:
// ----------------------------------------------------------------------
int main ( void )
{
  APIRET      rc;
  PCHAR      pchar;
  PSZ        psz;
  PVOID      pvBase;
  REGREC      regrec;


Assume you placed an anonymous semaphore handle onto shared memory. The first time your .EXE or .DLL loads, it creates the mutex semaphore. The thread tests the semaphore handle; if zero, it calls DosCreateMutexSem. But this is flawed logic! The test for a null semaphore handle is itself a reference to shared memory and must be protected by a semaphore. This logic works most of the time, but can fail in a race condition!
  // Insert exception handler into the chain of handlers for this thread
  regrec.pfnHandler = (PFN)Handler;
  rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
  assert( rc == 0 );


Named mutex semaphores don't have this problem. See \SOURCE\DEVNEWS\VOL10\MEM\TESTER.C on your DevCon for OS/2 CD-ROMs for more detail on this implementation.
  // Allocate a memory object without committing any of it;
  // Note lack of PAG_COMMIT flag
  rc = DosAllocMem(  &pvBase, 1048576, PAG_WRITE );
  assert( rc == 0 );


==Heaps for Small Allocations==
  // This causes an exception since the page is not committed
  pchar = (PCHAR)pvBase;
  *pchar = 'a';


The DosAllocMem API rounds up the allocation size to the nearest page boundary. For example, DosAllocMem rounds up a 100-byte allocation to 4096 bytes. Thus, DosAllocMem is not the right choice for many small allocations.
  // This string copy causes two more exceptions
  psz = (PSZ)pvBase + (4096 + 4092);
  strcpy( psz, "This string crosses a 4K page boundary." );


Small allocations require a heap. If you are programming in C, use the heap manager provided with your compiler (for example, new, delete, malloc, strdup, free).
  // Reference the memory
  printf( "%c\n", *pchar );
  printf( "%s\n", psz );


OS/2 also has a heap manager in the DosSubAllocMem and DosSubFreeMem suballocation APIs (see Sample Code 3).
  // Free memory object
  rc = DosFreeMem( pvBase );
  assert( rc == 0 );


#define LEN_HEAP 0x20000
  // Unlink handler before returning
PVOID pvHeap;
  rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
APIRET rc;
  assert( rc == 0 );
rc = DosAllocMem( &pvHeap, LEN_HEAP, PAG_WRITE );
assert( rc == 0 );
rc = DosSubSetMem( pvHeap, DOSSUB_INIT | DOSSUB_SPARSE_OBJ, LEN_HEAP );
assert( rc == 0 );


Sample Code 3. Code to prepare a suballocated heap. It is a sparse heap; OS/2 will commit pages as needed. For best results, subset the entire allocation and avoid the "grow" option. See the source code located in \SOURCE\DEVNEWS\VOL10\MEM\HEAP.C on disc 1 of your DevCon for OS/2 CD-ROMs.
  return 0;
}</PRE>


Programmers often put function wrappers around DosSubAllocMem and DosSubFreeMem for convenience. The following allocation wrapper allocates a little extra space in order to store the heap base pointer and the suballocation size:
'''Sample Program 1:''' sparse.c


PVOID APIENTRY myalloc( PVOID pvBase, ULONG ulSize );
==Graceful Failure - When Good Threads Go Bad==


It returns a pointer to the suballocated memory. The following free wrapper uses pv to retrieve the base pointer and size, then it calls DosSubFreeMem:
Some exceptions are not so easy to restart. Can an exception handler fix a bad pointer during a general protection fault? Probably not. Should an exception handler choose a new divisor after division by zero? No. The operation must fail - but gracefully.


PVOID APIENTRY myfree( PVOID pv );
Graceful failure is important to APIs. API worker functions must return sensible, failing result codes to the caller in error situations.


See \SOURCE\DEVNEWS\VOL10\MEM\HEAP.C on disc 1 of your DevCon for OS/2 CD-ROMs for sample code.  
Worker functions use an exception handler like a safety net. If a thread goes bad while executing a function, the safety net is there to catch it. For the net to be in place, the worker function registers a handler at function entry and removes it at function exit. The overhead is small, and it is worth the robustness gained.  


==Out of Memory==
==Getting There from Here==


If a process runs out of memory, the problem is usually due to a memory leak or disk full condition.
In Sample Program 1, OS/2 lifts the thread from the point of the exception, makes it call the exception handler, then drops it back on the faulting instruction. This is no good for graceful failure. Yes, it is desirable to jump back to the worker function, but not at the point of the exception!


If SWAPPER.DAT grows until it fills the disk, then requests for committed, read/write memory have exceeded disk capacity. First, point the swapper to a larger disk. If it fails again, there is probably a memory leak.
Instead, the thread must jump from the exception handler function to a known point in the worker function. This is an interfunctional GOTO. Debates still rage about GOTO, but most programmers accept them when it comes to exception management.


A memory leak is a program error--a program allocates memory and fails to free it. A program that leaks memory is like a time bomb: it's a matter of time before the program will fail.
In C, code an interfunctional GOTO with setjmp() and longjmp(). Use setjmp() to record the state of the thread at the beginning of the worker function. Later, from the exception handler function, use longjmp() to return the thread to the saved state. The state information is stored in a variable of type jmp_buf.


The productivity tool 20MEMU, which you can install from the "Productivity Tools" category in the Developer Connection for OS/2 catalog, reports on memory usage. 20MEMU helps to detect and debug memory leaks.
The exception handler function must have addressability to the jmp_buf to use it on the call to longjmp(). The stack frame of the worker function is the ideal place to hold the jmp_buf and the exception registration record. Also, a pointer to the exception registration record is one of the parameters to the exception handler function. Therefore, the way for an exception handler function to get the address of a jmp_buf is to put a jmp_buf at the end of the exception registration record. See Figure 3.
<PRE>
// User-extended exception registration record
typedef struct _regrec {
        PVOID          pNext;
        PFN            pfnHandler;
        jmp_buf        jmpWorker;
} REGREC;
typedef REGREC *PREGREC;
</PRE>
Figure 3. Extended REGREC definition


The first panel reports memory usage for the system. To report on a certain process, enter the process ID and press Enter. The program reports on process private memory, process shared memory, and operating system shared memory.
Sample Program 2 consists of the main() function, a worker function, and an exception handler function. It shows how the worker function always returns a sensible result code in spite of bad parameters.
<PRE>
/* WORKER.C. This program shows how a worker function can use an exception */
/* handler  like a safety net for calling threads. Compile and link this    */
/* program with:  icc /Ss worker.c                                          */


To detect a leak, take "snapshots" of memory usage at regular intervals. If the list of memory objects grows and never shrinks, there is a leak. Use the virtual addresses and your debugger to track it down.  
// os2 includes
#define INCL_DOS
#define INCL_ERRORS
#include <os2.h>


==Lock-Proof Memory==
// c includes
An OS/2 physical device driver (PDD) will "lock down" memory during I/O, so it won't be paged to disk. Some drivers have problems locking memory buffers allocated from heaps. The write fails and returns result code 5.
#include <stdio.h>
The solution is to allocate and commit a memory buffer using DosAllocMem. Use this buffer to pass data to the PDD.  
#include <stdlib.h>
#include <string.h>
#include <setjmp.h>
#include <assert.h>


==Page Tuning==
// User-extended exception registration record
typedef struct _regrec {
  PVOID        pNext;
  PFN          pfnHandler;
  jmp_buf      jmpWorker;
} REGREC;
typedef REGREC *PREGREC;


Page tuning is the act of identifying functions with high interaction, then placing those functions near each other in memory. This reduces the working set; fewer pages are needed to perform a task, resulting in less paging and better performance.
// ----------------------------------------------------------------------
ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                      PREGREC p2,
                      PCONTEXTRECORD p3,
                      PVOID pv )
{
  switch( p1->ExceptionNum ) {
  case XCPT_ACCESS_VIOLATION:
  case XCPT_INTEGER_DIVIDE_BY_ZERO:
  case XCPT_INTEGER_OVERFLOW:
  case XCPT_PROCESS_TERMINATE:          // Killed thread case
  case XCPT_ASYNC_PROCESS_TERMINATE:    // Killed thread case
    // Interested in this one
    longjmp( p2->jmpWorker, p1->ExceptionNum );
  default:
    break;
  }
  // Not handled
  return XCPT_CONTINUE_SEARCH;
}


Placing a function in memory requires the help of your compiler. The VisualAge C++ compiler supports the pragma alloc_text. In the following example, the compiler places function _DLL_InitTerm in the CODE1 code segment:


#pragma alloc_text( CODE1, _DLL_InitTerm )
// ----------------------------------------------------------------------
// Returns TRUE for success, FALSE for failure
LONG _System WorkerFunction( PCHAR pch )
{
  LONG        rc;
  LONG        rcResult;
  ULONG      ulException;
  REGREC      regrec;


An .H file included by all C sources is a good place to code alloc_text pragmas. Manual page tuning is possible, but requires great familiarity with the code. Profiler tools automate the process because they provide graphic representations of execution as well as working set page counts.  
  // Set a handler
  regrec.pfnHandler = (PFN)Handler;
  rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
  assert( rc == 0 );
 
  // Store a known thread state
  ulException = setjmp( regrec.jmpWorker );
 
  if( ulException ) {
 
    // Clean up here: free memory allocations, release mutex sems, etc.
 
    // Get the handler off the chain
    rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
    assert( rc == 0 );
 
    // Check for the killed-thread case
    switch( ulException ) {
    case XCPT_PROCESS_TERMINATE:
    case XCPT_ASYNC_PROCESS_TERMINATE:
      // Clean up done above and thread really wants to die
      DosExit( EXIT_THREAD, 0 );
      break;
    }
    // Set a failing result code
    rcResult = FALSE;
    goto depart;
  }
 
  // Dereference the supplied pointer
  *pch = 'a';
 
  rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
  assert( rc == 0 );
 
  rcResult = TRUE;
 
depart:
  return rcResult;
}
 
 
// ----------------------------------------------------------------------
int main ( void )
{
  CHAR    szWork[ 16 ];
  LONG    rc;
 
  // Try worker function with a good pointer
  rc = WorkerFunction( szWork );
  printf( "Good pointer returns %ld\n", rc );
 
  // Try worker function with a bad pointer
  rc = WorkerFunction( NULL );
  printf( "Bad pointer returns %ld\n", rc );
 
  return 0;
}
</PRE>
Sample Program 2: worker.c
 
Notes about Sample Program 2:
 
* The Killed Thread: The code in Sample Program 2 shows how to handle the killed thread case. Even though there are no killed threads in Sample Program 2, the technique is critical to exported worker functions in DLLs where the client process may use DosKillThread with abandon.
 
*  Nested Exceptions: At exception time, OS/2 inserts a handler at the head of the chain before it invokes the remaining handlers on the chain in order to detect nested exceptions. (A nested exception is one that occurs in an exception handler.) The IBM C Set/2 implementation of longjmp() correctly unwinds the system's nested exception handler.
 
* Sparse Allocations in OS/2: When there is no COMMIT option on the MEMMAN statement in CONFIG.SYS, OS/2 handles every memory allocation in a sparse manner similar to Sample Program 1. This technique is called lazy commit. When the COMMIT option is present on MEMMAN, commits are never deferred.
 
==Future Considerations==
 
Rest assured that this exception management strategy is portable to future versions of OS/2. It uses 32-bit APIs, ANSI C runtime routines, and no assembler code.  




'''Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation'''
[[Category:Article]]

Revision as of 00:08, 19 June 2013

by Monte Copeland

Under 16-bit OS/2 architecture, a process cannot handle access violations and certain other exceptions; the system invariably terminates the process. The only choice a program has is to register an exit-list function using the DosExitList() API. Then, at process-termination time, OS/2 calls each of the registered exit-list functions, and they perform cleanup before the termination of the process. This approach is process-granular. It allows for cleanup, but not recovery.

Under the 32-bit OS/2 environment, the approach is thread-granular. OS/2 keeps a chain of exception handler functions for every thread. When a thread causes an exception, OS/2 walks the chain and calls each of the functions until one reports "handled". If no function handles the exception, the system takes default action. For many exceptions, the default action is process termination.

The exception management APIs are new in the 32-bit OS/2 operating system. They are available to 32-bit executables and dynamic link libraries (DLLs). OS/2 designers intend for 32-bit exception management to be hardware-independent, to be a superset of traditional 16-bit exit-list processing, to encompass 16-bit signals, and to provide thread-granular recovery of exceptions.

Figure 1. Chain of Exception Registration Records. A pointer to the first record in the chain is stored in the thread information block (TIB) structure.

This article describes the following exception handler scenarios:

  • A function recovers from the error and reports "handled" by returning XCPT_CONTINUE_EXECUTION. The function continues to execute.
  • A function does not handle the exception and reports "not handled" by returning XCPT_CONTINUE_SEARCH. Other handlers in the chain get a chance to handle the exception.
  • The third option is graceful failure. This approach is nicely suited for worker functions in EXEs and DLLs that must remain robust in spite of bad parameters or killed threads.

Adding a Handler to the Chain

Use the API DosSetExceptionHandler() to insert an exception handler for the calling thread. This API performs an insert-at-head operation; therefore, the last handler inserted is the first one called at exception time. It is quite possible for one handler to serve numerous threads, but each thread must call DosSetExceptionHandler().

The OS/2 Developer's Toolkit defines an exception registration record structure called EXCEPTIONREGISTRATIONRECORD, but you can define your own. See Figure 1. (More later on why that is a good thing to do.) The absolute minimum exception registration record is a structure that contains two 32-bit pointers: a pointer to the next exception registration record in the chain and a pointer to the handler function.

// Bare-bones exception registration record
// See also \toolkt20\c\os2h\bsexcpt.h
typedef struct _regrec {
        PVOID   pNext;
        PFN     pfnHandler;
} REGREC;
typedef REGREC *PREGREC;

// A prototype for an exception handler function
ULONG _System HandlerFunction( PEXCEPTIONREPORTRECORD           p1,
                               PREGREC                          p2,
                               PCONTEXTRECORD                   p3,
                               PVOID                            p4 );

Figure 1. REGREC definition and handler function prototype

Assign the pointer regrec.pfnHandler then call the DosSetExceptionHandler() API. The system assigns regrec.pNext. See Figure 2.

REGREC regrec;
. . .
regrec.pfnHandler = (PFN)HandlerFunction;
rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD)&regrec );
assert( 0 == rc );

Figure 2. Code fragment shows REGREC declaration and use.
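
The insert-at-head behavior and the chain walk can be tried on any platform with a small simulation. The following is only a sketch in portable C, not the OS/2 implementation: every name prefixed with sim_ is hypothetical, the return-code values are arbitrary stand-ins, and a single global pointer plays the part of the chain head that OS/2 keeps per thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for XCPT_CONTINUE_SEARCH and
   XCPT_CONTINUE_EXECUTION; the values here are arbitrary. */
enum { SIM_CONTINUE_SEARCH = 0, SIM_CONTINUE_EXECUTION = 1 };

/* Same shape as the bare-bones REGREC: next pointer, then handler. */
typedef struct sim_regrec {
    struct sim_regrec *next;
    int (*handler)(int exception_num, struct sim_regrec *rec);
} sim_regrec;

/* One chain head per thread; a single global models one thread. */
sim_regrec *sim_chain_head = NULL;

/* Insert at head: the last record registered is the first one called. */
void sim_set_handler(sim_regrec *rec)
{
    rec->next = sim_chain_head;   /* the "system" assigns the next pointer */
    sim_chain_head = rec;
}

/* Unlink a record. Returns 0 on success, -1 if it is not on the chain. */
int sim_unset_handler(sim_regrec *rec)
{
    sim_regrec **link = &sim_chain_head;
    while (*link != NULL) {
        if (*link == rec) {
            *link = rec->next;
            return 0;
        }
        link = &(*link)->next;
    }
    return -1;
}

/* Walk the chain until a handler reports "handled".
   Returns 1 if handled, 0 if the default action would apply. */
int sim_dispatch(int exception_num)
{
    sim_regrec *rec;
    for (rec = sim_chain_head; rec != NULL; rec = rec->next) {
        if (rec->handler(exception_num, rec) == SIM_CONTINUE_EXECUTION)
            return 1;
    }
    return 0;
}

/* Two sample handlers that count how often they are called. */
int sim_seven_calls = 0;
int sim_nothing_calls = 0;

int sim_handles_seven(int exception_num, sim_regrec *rec)
{
    (void)rec;
    sim_seven_calls++;
    return exception_num == 7 ? SIM_CONTINUE_EXECUTION : SIM_CONTINUE_SEARCH;
}

int sim_handles_nothing(int exception_num, sim_regrec *rec)
{
    (void)exception_num;
    (void)rec;
    sim_nothing_calls++;
    return SIM_CONTINUE_SEARCH;
}
```

Registering sim_handles_seven first and sim_handles_nothing second puts the second record at the head, so a dispatch of exception 7 calls sim_handles_nothing before sim_handles_seven reports "handled".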

Recoverable Exceptions

When an exception handler returns handled, the handler has recovered from the exception, and execution resumes at the point of the exception.

One scenario involving recoverable exceptions is NPX (80387) emulation. For example, compile a program with hardware floating-point instructions, and run it on a system without a floating-point coprocessor. Executing the floating-point instruction causes OS/2 to raise a coprocessor-not-available exception.

An exception handler emulates the floating-point instruction in software. In fact, this scenario describes one of OS/2's default exception handlers. Code compiled with floating-point instructions runs under OS/2 on systems without a math coprocessor.

Another scenario involves sparse allocation of memory. In 32-bit OS/2, DosAllocMem() allocates memory in a collection of 4K pages. (The size of every DosAllocMem allocation is always rounded up to the next higher multiple of 4K.) The pages within a memory allocation can have different attributes: notable ones are committed and invalid. The DosSetMem() API lets you commit individual pages within a memory allocation.
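
The page arithmetic behind these statements is easy to check in isolation. The helpers below are a hypothetical sketch in portable C (none of these names come from the OS/2 Toolkit); they assume only the 4K page size stated above.

```c
#include <assert.h>
#include <stddef.h>

/* 4K pages, as stated in the text. */
#define SIM_PAGE_SIZE 4096UL

/* Round a requested size up to the next multiple of the page size. */
unsigned long sim_round_up_to_page(unsigned long size)
{
    return (size + SIM_PAGE_SIZE - 1) & ~(SIM_PAGE_SIZE - 1);
}

/* Offset of the start of the page that contains the given offset. */
unsigned long sim_page_base(unsigned long offset)
{
    return offset & ~(SIM_PAGE_SIZE - 1);
}

/* Number of distinct pages touched by length bytes starting at offset. */
unsigned long sim_pages_touched(unsigned long offset, unsigned long length)
{
    if (length == 0)
        return 0;
    return (sim_page_base(offset + length - 1) - sim_page_base(offset))
           / SIM_PAGE_SIZE + 1;
}
```

For example, a 100-byte request rounds up to 4096 bytes, and a 40-byte write that starts at offset 4096 + 4092 touches two pages, which is why the string copy in Sample Program 1 raises two exceptions.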

Sample Program 1 uses the DosSetMem() API in an exception handler to commit memory as it is referenced. The sample program allocates a memory object such that no pages are committed. Then, it writes to the memory. This causes a page fault, and the system delivers an exception to the handler. The handler commits the memory, returns handled, and the system restarts the instruction.

/* SPARSE.C. This program allocates a one MB memory object but commits no    */
/* pages. The program then writes to that memory, which is invalid, and this */
/* causes a trap. The handler commits the invalid page and resumes execution.*/
/* Compile and link this program with: icc /Ss sparse.c                      */

// os2 includes
#define INCL_DOS
#define INCL_ERRORS
#include <os2.h>

// c includes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

// Exception handler registration record
typedef struct _regrec {
  PVOID pNext;
  PFN   pfnHandler;
} REGREC;
typedef REGREC *PREGREC;

// ----------------------------------------------------------------------
ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                       PREGREC p2,
                       PCONTEXTRECORD p3,
                       PVOID pv )
{
  // Interested in access violation
  if( p1->ExceptionNum == XCPT_ACCESS_VIOLATION  ) {
    assert( p1->ExceptionInfo[0] == XCPT_WRITE_ACCESS );
    // Try to commit the referenced page
    if( 0 == DosSetMem( (PVOID)p1->ExceptionInfo[1], 1, PAG_COMMIT|PAG_WRITE )) {
      // Successful commit; resume execution
      return XCPT_CONTINUE_EXECUTION;
    }
  }
  // Not handled, let other handlers in the chain have the exception
  return XCPT_CONTINUE_SEARCH;
}

// ----------------------------------------------------------------------
int main ( void )
{
  APIRET      rc;
  PCHAR       pchar;
  PSZ         psz;
  PVOID       pvBase;
  REGREC      regrec;

  // Insert exception handler into the chain of handlers for this thread
  regrec.pfnHandler = (PFN)Handler;
  rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
  assert( rc == 0 );

  // Allocate a memory object without committing any of it;
  // Note lack of PAG_COMMIT flag
  rc = DosAllocMem(  &pvBase, 1048576, PAG_WRITE );
  assert( rc == 0 );

  // This causes an exception since the page is not committed
  pchar = (PCHAR)pvBase;
  *pchar = 'a';

  // This string copy causes two more exceptions
  psz = (PSZ)pvBase + (4096 + 4092);
  strcpy( psz, "This string crosses a 4K page boundary." );

  // Reference the memory
  printf( "%c\n", *pchar );
  printf( "%s\n", psz );

  // Free memory object
  rc = DosFreeMem( pvBase );
  assert( rc == 0 );

  // Unlink handler before returning
  rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
  assert( rc == 0 );

  return 0;
}

Sample Program 1: sparse.c

Graceful Failure - When Good Threads Go Bad

Some exceptions are not so easy to restart. Can an exception handler fix a bad pointer during a general protection fault? Probably not. Should an exception handler choose a new divisor after division by zero? No. The operation must fail - but gracefully.

Graceful failure is important to APIs. API worker functions must return sensible, failing result codes to the caller in error situations.

Worker functions use an exception handler like a safety net. If a thread goes bad while executing a function, the safety net is there to catch it. For the net to be in place, the worker function registers a handler at function entry and removes it at function exit. The overhead is small, and it is worth the robustness gained.

Getting There from Here

In Sample Program 1, OS/2 lifts the thread from the point of the exception, makes it call the exception handler, then drops it back on the faulting instruction. This is no good for graceful failure. Yes, it is desirable to jump back to the worker function, but not at the point of the exception!

Instead, the thread must jump from the exception handler function to a known point in the worker function. This is an interfunctional GOTO. Debates still rage about GOTO, but most programmers accept it when it comes to exception management.

Code interfunctional GOTOs in C using setjmp() and longjmp(). Use setjmp() to record the state of the thread at the beginning of the worker function. Later, from the exception handler function, use longjmp() to return the thread to the saved state. State information is stored in a variable of type jmp_buf. setjmp() returns zero when it records the state, and it returns the nonzero value passed to longjmp() when the thread arrives there by a jump.

The exception handler function must be able to address the jmp_buf in order to pass it to longjmp(). The stack frame of the worker function is the ideal place to hold both the jmp_buf and the exception registration record. Because a pointer to the exception registration record is one of the parameters to the exception handler function, the way for the handler to get the address of a jmp_buf is to put the jmp_buf at the end of the exception registration record. See Figure 3.

// User-extended exception registration record
typedef struct _regrec {
        PVOID           pNext;
        PFN             pfnHandler;
        jmp_buf         jmpWorker;
} REGREC;
typedef REGREC *PREGREC;

Figure 3. Extended REGREC definition

Sample Program 2 consists of the main() function, a worker function, and an exception handler function. It shows how the worker function always returns a sensible result code in spite of bad parameters.

/* WORKER.C.  This program shows how a worker function can use an exception */
/* handler like a safety net for calling threads. Compile and link this     */
/* program with:  icc /ss worker.c                                          */

// os2 includes
#define INCL_DOS
#define INCL_ERRORS
#include <os2.h>

// c includes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <setjmp.h>
#include <assert.h>

// User-extended exception registration record
typedef struct _regrec {
  PVOID         pNext;
  PFN           pfnHandler;
  jmp_buf       jmpWorker;
} REGREC;
typedef REGREC *PREGREC;

// ----------------------------------------------------------------------
ULONG _System Handler( PEXCEPTIONREPORTRECORD p1,
                       PREGREC p2,
                       PCONTEXTRECORD p3,
                       PVOID pv )
{
  switch( p1->ExceptionNum ) {
  case XCPT_ACCESS_VIOLATION:
  case XCPT_INTEGER_DIVIDE_BY_ZERO:
  case XCPT_INTEGER_OVERFLOW:
  case XCPT_PROCESS_TERMINATE:           // Killed thread case
  case XCPT_ASYNC_PROCESS_TERMINATE:     // Killed thread case
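    // Interested in these: jump back to the state saved by the worker.
    // longjmp() does not return, so control never falls through to default.
    longjmp( p2->jmpWorker, p1->ExceptionNum );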
  default:
    break;
  }
  // Not handled
  return XCPT_CONTINUE_SEARCH;
}


// ----------------------------------------------------------------------
// Returns TRUE for success, FALSE for failure
LONG _System WorkerFunction( PCHAR pch )
{
  LONG        rc;
  LONG        rcResult;
  ULONG       ulException;
  REGREC      regrec;

  // Set a handler
  regrec.pfnHandler = (PFN)Handler;
  rc = DosSetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
  assert( rc == 0 );

  // Store a known thread state
  ulException = setjmp( regrec.jmpWorker );

  if( ulException ) {

    // Clean up here: free memory allocations, release mutex sems, etc.

    // Get the handler off the chain
    rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
    assert( rc == 0 );

    // Check for the killed-thread case
    switch( ulException ) {
    case XCPT_PROCESS_TERMINATE:
    case XCPT_ASYNC_PROCESS_TERMINATE:
      // Clean up done above and thread really wants to die
      DosExit( EXIT_THREAD, 0 );
      break;
    }
    // Set a failing result code
    rcResult = FALSE;
    goto depart;
  }

  // Dereference the supplied pointer
  *pch = 'a';

  rc = DosUnsetExceptionHandler( (PEXCEPTIONREGISTRATIONRECORD) &regrec );
  assert( rc == 0 );

  rcResult = TRUE;

depart:
  return rcResult;
}


// ----------------------------------------------------------------------
int main ( void )
{
  CHAR     szWork[ 16 ];
  LONG     rc;

  // Try worker function with a good pointer
  rc = WorkerFunction( szWork );
  printf( "Good pointer returns %d\n", rc );

  // Try worker function with a bad pointer
  rc = WorkerFunction( NULL );
  printf( "Bad pointer returns %d\n", rc );

  return 0;
}

Sample Program 2: worker.c

Notes about Sample Program 2:

  • The Killed Thread: Sample Program 2 shows how to handle the killed-thread case. The handler catches XCPT_PROCESS_TERMINATE and XCPT_ASYNC_PROCESS_TERMINATE, and the worker function cleans up before it calls DosExit(). Although no thread is killed in Sample Program 2, the technique is critical to exported worker functions in DLLs, where the client process may use DosKillThread() with abandon.
  • Nested Exceptions: A nested exception is one that occurs in an exception handler. To detect nested exceptions, OS/2 inserts a handler of its own at the head of the chain at exception time, before it invokes the handlers already on the chain. The IBM C Set/2 implementation of longjmp() correctly unwinds the system's nested exception handler.
  • Sparse Allocations in OS/2: When there is no COMMIT option on the MEMMAN statement in CONFIG.SYS, OS/2 handles every memory allocation in a sparse manner similar to Sample Program 1. This technique is called lazy commit. When the COMMIT option is present on MEMMAN, commits are never deferred.

Future Considerations

Rest assured that this exception management strategy is portable to future versions of OS/2. It uses 32-bit APIs, ANSI C runtime routines, and no assembler code.


Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation