32-Bit Memory Management in OS/2

by Monte Copeland

Memory management in OS/2 has many faces: 32-bit, 16-bit, and real mode. This article discusses programming the 32-bit memory model of OS/2. It covers memory addressing, allocating, and heap management, as well as memory leaks and how to debug them. The DevCon for OS/2 CD-ROMs contain sample C code to illustrate these memory management concepts.

Process Address Space
When a process starts, OS/2 prepares a virtualized array of RAM called the process address space. Within this space, OS/2 allocates memory for .EXE and .DLL code and data. A program accesses this space with a 32-bit address. The smallest address is usually 64KB, which is the base load address for .EXEs. The largest address is 512MB, the limit of OS/2 virtual address space (see Figure 1).

Private memory resides at low addresses. Only the owning process can access this memory. Private allocations start low and increase upwards. On the other hand, shared memory is allocated high and works downward.

OS/2 divides memory into pages that are 4KB in size. Each process has a set of page tables that maps its virtual memory to physical RAM. Each 4KB page has attributes including read/write, read-only, private, shared, committed, and guard.
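The page arithmetic behind these numbers can be sketched in a few lines of portable C. This is an illustration only, not OS/2 code: the helper names page_number, page_offset, and pages_in are hypothetical, and the sketch says nothing about how OS/2 actually lays out its page tables.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE  4096u   /* 4KB pages, as described in the text */
#define PAGE_SHIFT 12      /* log2(4096) */

/* Index of the 4KB page that contains a 32-bit virtual address. */
static uint32_t page_number(uint32_t addr)
{
    return addr >> PAGE_SHIFT;
}

/* Byte offset of the address within its page. */
static uint32_t page_offset(uint32_t addr)
{
    return addr & (PAGE_SIZE - 1u);
}

/* Number of pages in a page-aligned region of the given size. */
static uint32_t pages_in(uint32_t bytes)
{
    assert(bytes % PAGE_SIZE == 0);
    return bytes / PAGE_SIZE;
}
```

With these helpers, the usual 64KB base load address falls at the start of page 16, and a 512MB address space holds 131072 pages.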

Stack Memory
It's a good idea to create threads with at least a 32KB stack; OS/2 only uses what it needs to run the thread. Here's how:

When OS/2 allocates memory for a thread stack, it commits the top page and sets the guard attribute on the page below it. If stack usage exceeds 4KB, a guard-page exception occurs. OS/2 handles this exception: it commits the guard page and sets the next lower page to guard. Using this scheme, OS/2 commits only the pages a thread really needs.
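The scheme can be modeled in portable C to make the page states easier to follow. This is a toy model, not OS/2 code: the types and functions (stack_model, stack_init, stack_touch, stack_committed) are hypothetical names, the stack is fixed at eight pages (32KB), and what OS/2 does when the guard reaches the last page is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_PAGES 8   /* 32KB stack = 8 pages of 4KB (assumed for the model) */

enum page_state { PAGE_RESERVED, PAGE_GUARD, PAGE_COMMITTED };

struct stack_model {
    enum page_state page[STACK_PAGES];  /* index 0 is the lowest address */
};

/* Initial state: top page committed, the page below it is the guard page. */
static void stack_init(struct stack_model *s)
{
    for (size_t i = 0; i < STACK_PAGES; i++)
        s->page[i] = PAGE_RESERVED;
    s->page[STACK_PAGES - 1] = PAGE_COMMITTED;
    s->page[STACK_PAGES - 2] = PAGE_GUARD;
}

/* Touch one page. Returns 0 if the access succeeds, -1 if it would trap. */
static int stack_touch(struct stack_model *s, size_t idx)
{
    assert(idx < STACK_PAGES);
    switch (s->page[idx]) {
    case PAGE_COMMITTED:
        return 0;
    case PAGE_GUARD:
        /* Guard-page exception: commit this page, move the guard down. */
        s->page[idx] = PAGE_COMMITTED;
        if (idx > 0)
            s->page[idx - 1] = PAGE_GUARD;
        return 0;
    default:
        /* Reserved but neither committed nor guard: the access traps. */
        return -1;
    }
}

/* Count the committed pages in the model. */
static size_t stack_committed(const struct stack_model *s)
{
    size_t n = 0;
    for (size_t i = 0; i < STACK_PAGES; i++)
        if (s->page[i] == PAGE_COMMITTED)
            n++;
    return n;
}
```

Touching pages in descending order commits them one at a time. Touching a page below the current guard page returns -1, which stands in for the trap.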

A trap can occur when a program uses automatic (stack) variables larger than 4KB. For example, assume that an 8KB array spans a guard page. Element zero is at the array's lowest address, which can lie below the guard page. If the program writes element zero first, it touches an uncommitted page without first touching the guard page above it; guard-page processing is skipped and the program traps. Some compilers, including IBM's, generate code to touch each page in large automatic variables. For guard page processing to work, the code must touch pages starting at high addresses and work down.
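The high-to-low touching order can be sketched by hand in portable C. This is an illustration of the ordering only: probe_high_to_low is a hypothetical helper, it is not the code any particular compiler emits, and it overwrites the bytes it touches, so it suits only a freshly declared buffer.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u   /* 4KB pages, as described in the text */

/*
 * Write one byte at the highest address of buf, then step down by at most
 * one page per write until byte zero has been written. Because consecutive
 * writes are never more than PAGE_SIZE apart, no page of the buffer is
 * skipped, whatever the buffer's alignment. Returns the number of writes.
 */
static size_t probe_high_to_low(volatile char *buf, size_t len)
{
    size_t touched = 0;
    size_t off;

    if (len == 0)
        return 0;

    off = len - 1;                 /* start at the highest byte */
    for (;;) {
        buf[off] = 0;              /* volatile: the write is not optimized away */
        touched++;
        if (off == 0)
            break;
        off = (off >= PAGE_SIZE) ? off - PAGE_SIZE : 0;
    }
    return touched;
}
```

A function that declares an 8KB automatic array would call the probe before its first ordinary access to the array, for example probe_high_to_low(big, sizeof big).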

VisualAge C++ Compiler Data Pragma
DosAllocMem and DosAllocSharedMem are not the only ways to allocate memory. The compiler, linker, and loader do a great deal to allocate memory and initialize it. The creator of a .DLL must decide where data belongs: shared versus private memory. The VisualAge C++ compiler has a data_seg pragma that takes a single argument - the name of a memory segment defined in the module definitions (.DEF) file.

In Sample Code 1, the compiler ensures that the char array szBuffer resides in a memory segment named PIECE_1.

    #pragma data_seg( PIECE_1 )
    char szBuffer[ 256 ];

Sample Code 1. Excerpt from a C program that uses the data_seg pragma to place the static variable szBuffer onto memory segment PIECE_1.

In the .DEF file, the segment PIECE_1 is a shared segment as follows:

    SEGMENTS
      PIECE_1 CLASS 'DATA' SHARED
      PIECE_2 CLASS 'DATA' NONSHARED

Sample Code 2. Excerpt from a .DEF file that makes the PIECE_1 segment a shared memory object.

Programmers coding a .DLL use data_seg pragmas and .DEF file SEGMENTS statements to control which variables are private per process and which are shared. Reference data and read-only data usually need a single copy in memory, so you should place these in shared memory.

This technique is not limited to .DLLs. Programmers who expect to have multiple copies of the same .EXE running simultaneously can do this, too. See \SOURCE\DEVNEWS\VOL10\MEM\PRAG.C on disc 1 of your accompanying DevCon for OS/2 CD-ROMs for more detail on implementing this approach.

Managing Contention for Shared Memory
Whenever multiple processes read and write shared memory, you must manage the contention. This is best done with a named (therefore, shared) mutual exclusion semaphore. Under OS/2, anonymous semaphores are not suited for this task. Here's why:

Assume you placed an anonymous semaphore handle onto shared memory. The first time your .EXE or .DLL loads, it creates the mutex semaphore. The thread tests the semaphore handle; if zero, it calls DosCreateMutexSem. But this is flawed logic! The test for a null semaphore handle is itself a reference to shared memory and must be protected by a semaphore. This logic works most of the time, but can fail in a race condition!

Named mutex semaphores don't have this problem. See \SOURCE\DEVNEWS\VOL10\MEM\TESTER.C on your DevCon for OS/2 CD-ROMs for more detail on this implementation.

Heaps for Small Allocations
The DosAllocMem API rounds up the allocation size to the nearest page boundary. For example, DosAllocMem rounds up a 100-byte allocation to 4096 bytes. Thus, DosAllocMem is not the right choice for many small allocations.
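The rounding is simple arithmetic, sketched here in portable C. The helper names round_up_to_page and wasted_bytes are hypothetical, and the sketch assumes sizes small enough that the addition does not overflow 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* 4KB pages, as described in the text */

/* Round a byte count up to the next multiple of the page size. */
static uint32_t round_up_to_page(uint32_t bytes)
{
    return (bytes + PAGE_SIZE - 1u) & ~(PAGE_SIZE - 1u);
}

/* Slack left over when `count` separate allocations of `bytes` each
   are rounded up to whole pages. */
static uint32_t wasted_bytes(uint32_t bytes, uint32_t count)
{
    return (round_up_to_page(bytes) - bytes) * count;
}
```

By this arithmetic, one thousand separate 100-byte allocations would leave 3,996,000 bytes unused, which is why a heap is the better fit.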

Small allocations require a heap. If you are programming in C or C++, use the heap manager provided with your compiler (for example, new, delete, malloc, strdup, free).

OS/2 also has a heap manager in the DosSubAllocMem and DosSubFreeMem suballocation APIs (see Sample Code 3).

    #define LEN_HEAP 0x20000
    PVOID pvHeap;
    APIRET rc;
    rc = DosAllocMem( &pvHeap, LEN_HEAP, PAG_WRITE );
    assert( rc == 0 );
    rc = DosSubSetMem( pvHeap, DOSSUB_INIT | DOSSUB_SPARSE_OBJ, LEN_HEAP );
    assert( rc == 0 );

Sample Code 3. Code to prepare a suballocated heap. It is a sparse heap; OS/2 will commit pages as needed. For best results, subset the entire allocation and avoid the "grow" option. See the source code located in \SOURCE\DEVNEWS\VOL10\MEM\HEAP.C on disc 1 of your DevCon for OS/2 CD-ROMs.

Programmers often put function wrappers around DosSubAllocMem and DosSubFreeMem for convenience. The following allocation wrapper allocates a little extra space in order to store the heap base pointer and the suballocation size:

    PVOID APIENTRY myalloc( PVOID pvBase, ULONG ulSize );

It returns a pointer to the suballocated memory. The following free wrapper uses pv to retrieve the base pointer and size, then it calls DosSubFreeMem:

    PVOID APIENTRY myfree( PVOID pv );

See \SOURCE\DEVNEWS\VOL10\MEM\HEAP.C on disc 1 of your DevCon for OS/2 CD-ROMs for sample code.
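The header technique behind such wrappers can be sketched in portable C. This is not the code from HEAP.C. To compile away from OS/2, the sketch replaces the suballocation APIs with hypothetical stand-ins, sub_alloc and sub_free, which take their arguments in the same order as DosSubAllocMem and DosSubFreeMem but are backed by malloc. The wrapper names sketch_alloc and sketch_free, the block_header layout, and the g_outstanding counter are likewise assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Bytes currently handed out by the stand-in suballocator (illustration). */
static size_t g_outstanding;

/* Hypothetical stand-in for DosSubAllocMem( pvBase, &pv, cb ). */
static int sub_alloc(void *base, void **pp, size_t cb)
{
    (void)base;                   /* a real heap would carve from base */
    *pp = malloc(cb);
    if (*pp == NULL)
        return 1;                 /* nonzero means failure */
    g_outstanding += cb;
    return 0;
}

/* Hypothetical stand-in for DosSubFreeMem( pvBase, pv, cb ).
   Like the real API, it needs the size of the block being freed. */
static int sub_free(void *base, void *p, size_t cb)
{
    (void)base;
    free(p);
    g_outstanding -= cb;
    return 0;
}

/* Header stored just in front of the memory returned to the caller. */
struct block_header {
    void  *base;    /* heap the block came from */
    size_t size;    /* total suballocation size, header included */
};

/* Allocate size bytes plus room for the header; return the user area. */
static void *sketch_alloc(void *base, size_t size)
{
    struct block_header *h;
    void *raw;
    size_t total = size + sizeof *h;

    if (sub_alloc(base, &raw, total) != 0)
        return NULL;
    h = raw;
    h->base = base;
    h->size = total;
    return h + 1;                 /* first byte after the header */
}

/* Step back to the header, then free with the recorded base and size. */
static int sketch_free(void *p)
{
    struct block_header *h;

    if (p == NULL)
        return 0;
    h = (struct block_header *)p - 1;
    return sub_free(h->base, h, h->size);
}
```

Because the header travels with each block, the caller of sketch_free passes only the pointer, much as with free.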

Out of Memory
If a process runs out of memory, the problem is usually due to a memory leak or a disk-full condition.

If SWAPPER.DAT grows until it fills the disk, then requests for committed, read/write memory have exceeded disk capacity. First, point the swapper to a larger disk. If it fails again, there is probably a memory leak.

A memory leak is a program error - a program allocates memory and fails to free it. A program that leaks memory is like a time bomb: it's a matter of time before the program will fail.

The productivity tool 20MEMU, which you can install from the "Productivity Tools" category in the Developer Connection for OS/2 catalog, reports on memory usage. 20MEMU helps to detect and debug memory leaks.

The first panel reports memory usage for the system. To report on a certain process, enter the process ID and press Enter. The program reports on process private memory, process shared memory, and operating system shared memory.

To detect a leak, take "snapshots" of memory usage at regular intervals. If the list of memory objects grows and never shrinks, there is a leak. Use the virtual addresses and your debugger to track it down.
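The snapshot comparison can be expressed as a small heuristic in portable C. This sketch assumes you have already copied the memory-object count from each snapshot into an array by hand; it does not read 20MEMU output, and the function name looks_like_leak is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given memory-object counts from successive snapshots (oldest first),
 * return 1 if the count never shrinks and ends higher than it started,
 * which is the pattern that suggests a leak. Return 0 otherwise.
 */
static int looks_like_leak(const unsigned long *counts, size_t n)
{
    if (n < 2)
        return 0;                      /* not enough snapshots to judge */
    for (size_t i = 1; i < n; i++)
        if (counts[i] < counts[i - 1])
            return 0;                  /* usage shrank at least once */
    return counts[n - 1] > counts[0];  /* grew overall */
}
```

A positive result is only a hint. A program that is still warming up can show the same pattern, so take enough snapshots to cover steady-state operation.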

Lock-Proof Memory
An OS/2 physical device driver (PDD) will "lock down" memory during I/O, so it won't be paged to disk. Some drivers have problems locking memory buffers allocated from heaps. When this happens, the write fails and returns result code 5 (ERROR_ACCESS_DENIED). The solution is to allocate and commit a memory buffer using DosAllocMem. Use this buffer to pass data to the PDD.

Page Tuning
Page tuning is the act of identifying functions with high interaction, then placing those functions near each other in memory. This reduces the working set; fewer pages are needed to perform a task, resulting in less paging and better performance.
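The effect on the working set can be estimated with a short portable C sketch that counts the distinct 4KB pages a group of functions occupies. The names func_range and working_set_pages are hypothetical, the offsets in the usage note are made up, and the model covers only the first 1MB of a module.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* 4KB pages, as described in the text */
#define MAX_PAGES 256u    /* the model covers 1MB of code (assumption) */

struct func_range {
    unsigned long offset;   /* offset of the function within the module */
    unsigned long length;   /* size of the function in bytes, at least 1 */
};

/* Count the distinct pages touched by running all n functions. */
static size_t working_set_pages(const struct func_range *f, size_t n)
{
    unsigned char seen[MAX_PAGES];
    size_t count = 0;

    memset(seen, 0, sizeof seen);
    for (size_t i = 0; i < n; i++) {
        unsigned long first, last;

        assert(f[i].length > 0);
        first = f[i].offset / PAGE_SIZE;
        last = (f[i].offset + f[i].length - 1) / PAGE_SIZE;
        assert(last < MAX_PAGES);
        for (unsigned long p = first; p <= last; p++) {
            if (!seen[p]) {
                seen[p] = 1;
                count++;
            }
        }
    }
    return count;
}
```

Three 100-byte functions at offsets 0x0000, 0x5000, and 0xA000 occupy three pages. Placed back to back at offsets 0, 100, and 200, the same functions fit in one.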

Placing a function in memory requires the help of your compiler. The VisualAge C++ compiler supports the pragma alloc_text. In the following example, the compiler places function _DLL_InitTerm in the CODE1 code segment:

    #pragma alloc_text( CODE1, _DLL_InitTerm )

An .H file included by all C sources is a good place to code alloc_text pragmas. Manual page tuning is possible, but requires great familiarity with the code. Profiler tools automate the process because they provide graphic representations of execution as well as working set page counts.