Suffering From Memory Leaks?


by Curt Finch

Do you have data corruption bugs in your code right now? Do you have a situation where you are responsible for maintaining 500,000 lines of someone else's code and it is core dumping at random intervals? Are you responsible for mission-critical software where unintended results, segmentation violations, and memory leaks are absolutely unacceptable?

If you answered "yes" to any of these questions, then ZeroFault for AIX is the tool for you. ZeroFault finds bugs quickly, painlessly, and automatically. Created by The Kernel Group Incorporated, ZeroFault is a virtual machine tool for discovering and analyzing errors and potential errors in executable software programs.

The goal of ZeroFault is to identify memory problems when and where they occur and provide the information necessary to resolve them. For example, ZeroFault finds these errors:

  • Reading from uninitialized memory
  • Reading from invalid memory addresses
  • Reading from memory locations that have been freed
  • Freeing a block of memory multiple times
  • Writing into invalid memory locations
  • Passing invalid memory locations to system functions

ZeroFault provides the software developer with the information necessary to prevent memory related problems from reaching the customer. ZeroFault also makes it easier to diagnose and fix problems even in customer environments. Besides being useful for developers, ZeroFault makes it possible for the software consumer to test and evaluate the quality of software purchases in the environment in which they will be used.

Because ZeroFault does not relink the application, it is possible for ZeroFault to run even on stripped executables. This ability allows you to analyze the exact software that the customer is running. If the executable is stripped, line numbers of where the error occurred are not available, but all errors are still detected and reported.

Usage

Suppose one of your users runs into a core dump in myprogram, a program you are responsible for maintaining. ZeroFault requires no preparation of any sort. The normal method for using ZeroFault is as follows:

$ zf myprogram myarg1 myarg2

This command runs myprogram within the ZeroFault virtual machine. Provided that your DISPLAY is set, the myprogram GUI (if it has one) will be displayed, as well as a ZeroFault window showing the errors detected in the running myprogram code. You may use the myprogram interface just as you normally would.

If an output file is specified with the -f option, or if the DISPLAY environment variable is unset, zf will send its output to a file, by default ZF.OUT. This file can be viewed in ASCII form with the zf_rpt command, or with the default GUI, zf_ui. The -c option causes zf to create a new user interface to view the errors of each forked child instead of sending the output to a file.

Here are some examples of memory errors:

Example 1

A Bad Function Call Parameter (BFCP) error indicates that the memory region that a function call or system call is going to take its data from is not entirely available to the process. Reasons for this error include referencing a region that is too small, referencing a free()'d region, or referencing a region that was never allocated. Consider this code:

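#include <stdio.h>
#include <string.h>

int main(void)
{
 char *source = strdup("hi");              /* allocates 3 bytes: 'h', 'i', '\0' */
 char target[20];
 memcpy(target, source, sizeof(target));   /* error: reads 20 bytes from a 3-byte block */
 printf("%s\n", target);
 return 0;
}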

The memcpy() call reads 20 bytes from source when only 3 bytes were allocated to it, so a BFCP error is generated.
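One way to avoid the over-read in Example 1 is to bound the copy by the length of the source string as well as by the size of the target. The following is a minimal sketch; the helper name copy_string() is hypothetical and is not part of ZeroFault or of the original program.

```c
#include <string.h>

/* Copy a NUL-terminated string into a fixed-size buffer without reading
 * past the end of the source or writing past the end of the target.
 * Returns the number of characters copied, excluding the terminator. */
size_t copy_string(char *target, size_t target_size, const char *source)
{
    size_t n;

    if (target_size == 0)
        return 0;

    n = strlen(source);            /* bytes actually present in source */
    if (n > target_size - 1)
        n = target_size - 1;       /* truncate so the terminator still fits */

    memcpy(target, source, n);     /* reads only bytes that belong to source */
    target[n] = '\0';
    return n;
}
```

Called as copy_string(target, sizeof(target), source), this reads only the two characters of "hi" from the 3-byte block instead of 20 bytes.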

Example 2

Double free memory errors occur when the free() routine is called twice on the same block. These errors can make your code core dump or misbehave. Here is a simple example:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
 char *x;
 printf("Calling free twice with same address\r\n");
 x = (char *) malloc(100);
 free(x); /* free the memory once */
 free(x); /* free the memory a second time */
 return 0;
}

In this example, ZeroFault will show you three tracebacks: one for where the error occurred (the second free), one for where the allocation occurred (the malloc), and one for the first free. Data on both frees are provided because either free could be the error. The programmer must determine which free is the problem.
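Once the offending free has been identified, a common defensive habit is to reset a pointer to NULL as soon as its block is freed, because the C standard defines free(NULL) as a no-op. The sketch below illustrates the idea; free_and_clear() is a hypothetical helper, not something ZeroFault provides.

```c
#include <stdlib.h>

/* Free the block that *pp points to, then reset the caller's pointer.
 * A second call through the same pointer reaches free(NULL), which
 * does nothing. */
void free_and_clear(char **pp)
{
    if (pp == NULL)
        return;
    free(*pp);
    *pp = NULL;
}
```

This only protects against freeing twice through the same pointer variable. If two different pointers refer to the same block, the second one is left dangling, and that is the kind of case where the tracebacks for both frees are needed.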

The ZeroFault display contains a list of all errors discovered by ZeroFault. The list is updated as the errors occur. Facilities are available for:

  • Suppressing non-critical errors
  • Sorting the error messages
  • Viewing memory leaks
  • Viewing and editing source code associated with errors
  • Viewing each message at several levels of detail

ZeroFault runs on programs compiled from any language, whether or not they are stripped or carry symbols, and it supports pthreads, custom threads packages, and DCE. It works on AIX versions 3.2.5, 4.1, and 4.2. The only requirement is that the target application must be a POWER- or PowerPC-compliant XCOFF executable.

Advanced Use

Suppose you know that a linked list is getting corrupted somewhere (in myprogram again) and you want to find the culprit. You would like to use watchpoints in gdb or dbx, but they prove to be unusably slow. ZeroFault supports fast watchpoints. Fast watchpoints provide the ability to detect whenever the target application reads from or writes to a region of memory. ZeroFault's fast watchpoints run with no additional performance degradation.

ZeroFault provides a powerful mechanism for describing the areas of memory that are to be watched. It also provides mechanisms for filtering legal references.

Example

Assume that a block of memory is allocated by a function list_alloc() and should only be modified by one of a few functions: list_alloc(), list_add(), list_delete(), and list_look(). Further assume that somehow that particular block of memory is being corrupted or modified in some way. The following text (which must be entered on one line) must be added to the zfrc file, which is located in the user's home directory. The zfrc file is an initialization file that allows the control of the more advanced features of ZeroFault, such as high-speed watchpoints.

watchpoint list_alloc() when not below list_add|list_alloc|list_delete|list_look

or

watchpoint list_alloc() when not below list_*

This command causes ZeroFault to report to the user whenever any block that is allocated in list_alloc() is modified or read by any function that is not a child of a function beginning with the string list_.

The extended regular expression filtration provides the ability to define a legal set of functions that may modify or read memory allocated at a particular location and report whenever any other nonlegal function accesses that memory.
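The "when not below" test can be pictured as a walk over the call stack at the moment of the access: the access is legal if any function on the stack matches the pattern, and it is reported otherwise. The sketch below illustrates that idea with the POSIX regex routines. It is not ZeroFault's implementation; the function name access_is_legal(), the array-of-names representation of the stack, and the anchoring of the patterns are all assumptions made for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if any function name on the call stack matches the extended
 * regular expression 'pattern', 0 if none does, and -1 if the pattern
 * does not compile. frames[0] is the function performing the access. */
int access_is_legal(const char *pattern,
                    const char *const frames[], size_t nframes)
{
    regex_t re;
    size_t i;
    int legal = 0;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (i = 0; i < nframes; i++) {
        if (regexec(&re, frames[i], 0, NULL, 0) == 0) {
            legal = 1;   /* the access happened in or below an allowed function */
            break;
        }
    }

    regfree(&re);
    return legal;
}
```

With a pattern such as "^list_", a stack of memcpy, list_add, main would be treated as legal, while a stack of memset, render, main would be reported.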

ZeroFault for AIX is included in the Development Tools category of the Volume 11 DevCon for AIX CD. It is a 30-day timed-out version with full HTML documentation, viewable with any Web browser. You can also learn more about the product on the World Wide Web at http://www.tkg.com. If you have problems evaluating ZeroFault or if you have suggestions for improvements, support is available by electronic mail at zf.tkg.com.

Summary of Errors Reported by ZeroFault for AIX

bad memory read (write)
The process read from or wrote to memory not allocated to the process. This error means that the memory was freed, unallocated, beyond the block boundary, or never allocated.
bad function call parameter (target)
The process made a function or system call that reads from (parameter) or writes to (target) memory not allocated to the process. This error means that the memory was freed, unallocated, beyond the block boundary, or never allocated.
null pointer read (write)
The process read from or wrote to a null pointer (or a near null pointer).
overwrite of stack pointer
The process overwrote the stack pointer.
read (write) below stack pointer
The process read from or wrote to memory that belongs to an unwound stack frame. This error is normally associated with a function returning a pointer to automatic data.
realloc of free addr
The process attempted to reallocate a memory block that is freed.
bad realloc
The process attempted to reallocate an address that does not correspond to a memory block.
duplicate free
The process tried to free an already freed block.
bad free
The process tried to free an address that was never allocated.
leak block
This error is reported when garbage collection is done, or when the process exits. It refers to a block that cannot be referenced because no pointers exist in the process that point to or into the block.
unfreed block
This error is reported when the process exits. It refers to a block of memory that has not been freed. Because large numbers of these blocks often exist, this error is only reported when the -u flag is used.
uninitialized system call parameter
Indicates a memory region that was passed as a source to a system call was not initialized.
uninitialized stack read
Indicates a stack variable was used before it was initialized.
uninitialized memory read
Indicates a region in a malloc'ed block (heap memory) was used before it was initialized.
watch point read (write)
The process read from or wrote to a region marked as a watch point.
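To make the difference between "duplicate free", "bad free", and "unfreed block" concrete, the following toy bookkeeping sketch classifies free requests against a table of known blocks. It only illustrates the categories above; it is not how ZeroFault works internally, and every name in it (struct tracker, track_alloc(), track_free(), count_unfreed()) is hypothetical.

```c
#include <stddef.h>

#define MAX_BLOCKS 64

enum block_state { BLOCK_LIVE, BLOCK_FREED };
enum free_result { FREE_OK, FREE_DUPLICATE, FREE_BAD };

struct tracker {
    const void *addr[MAX_BLOCKS];
    enum block_state state[MAX_BLOCKS];
    size_t count;
};

/* Record an allocation at address p. Returns 0, or -1 if the table is full. */
int track_alloc(struct tracker *t, const void *p)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (t->addr[i] == p) {           /* address handed out again */
            t->state[i] = BLOCK_LIVE;
            return 0;
        }
    }
    if (t->count == MAX_BLOCKS)
        return -1;
    t->addr[t->count] = p;
    t->state[t->count] = BLOCK_LIVE;
    t->count++;
    return 0;
}

/* Classify a request to free address p and update the table. */
enum free_result track_free(struct tracker *t, const void *p)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (t->addr[i] == p) {
            if (t->state[i] == BLOCK_FREED)
                return FREE_DUPLICATE;   /* "duplicate free" */
            t->state[i] = BLOCK_FREED;
            return FREE_OK;
        }
    }
    return FREE_BAD;                     /* "bad free": never allocated */
}

/* Number of blocks still live, as would be listed at process exit. */
size_t count_unfreed(const struct tracker *t)
{
    size_t i, n = 0;

    for (i = 0; i < t->count; i++)
        if (t->state[i] == BLOCK_LIVE)
            n++;
    return n;
}
```

The table deliberately keeps entries for freed blocks, since telling a duplicate free apart from a bad free requires remembering that an address was once allocated.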

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation