An Accurate Software Delay for OS/2 Device Drivers

by Frank J. Schroeder and Allen Wynn

If you watch the personal computer market, you know that computer manufacturers are always coming out with faster computers. Processor speed seems to double every 12 to 18 months, and there is no sign of this trend letting up. Although computer users eagerly await the next leap in computing speed, device driver developers must ensure that faster instruction completion does not cause problems for their device drivers.

Back in the days of the IBM Personal Computer AT, programmers realized that the processor could execute back-to-back IN and OUT (I/O) instructions to the same port faster than the I/O chips could handle them. The I/O chips need time (500 nanoseconds) to settle between consecutive accesses. The Technical Reference for the IBM PC/AT recommends placing a jmp short $+2 instruction between I/O instructions that access the same I/O chip.

As time went on, hardware engineers looked for ways to improve the execution efficiency of the processor. Two notable improvements were to increase the clock speed (MHz) and to decrease the number of clocks required for many instructions. Intel also optimized the way its processors use the prefetch queue. If an unconditional jump could be handled in the prefetch queue, the processor would increment the instruction pointer rather than execute the instruction. Although this optimization (and countless others) dramatically improved execution time, it also shortened the delay that jmp $+2 produced, which forced programmers to look for other ways to generate a consistent 500-nanosecond delay.

OS/2 provides a processor-speed independent function that device drivers can use to generate accurate software delays rather than having each device driver create its own delay mechanism.

IODelay Macro, an Early Solution
In OS/2, the jmp $+2 instruction is wrapped in a macro called iodelay. Each device driver invokes iodelay where required, and the macro inserts the jmp $+2 instruction into the code. Sample Code 1 illustrates the use of the iodelay macro:

''Sample Code 1. Code example using iodelay macro''

Although the iodelay macro met the needs of early OS/2 developers, it was clear that another solution was needed to generate accurate delays as processor speed increased.

Hardware/Software Timers
The most obvious way to generate a processor-speed independent delay is to use the hardware timers. In OS/2's multitasking environment, however, it is not feasible to have more than one device driver manipulate the hardware timers, because no device driver could be sure whether the timers are already in use by another device driver.

Software timers are also inadequate. The DevHlp_SetTimer function has a granularity of one clock-tick (32 milliseconds), which is much larger than the 500-nanosecond delay needed. The timer handler also executes at interrupt time, not task time, so additional problems can occur.
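To put the granularity mismatch in numbers, the following C sketch models a tick-based timer in an idealized way: every request is rounded up to a whole number of ticks. The function name and the model are illustrative assumptions, not part of any OS/2 interface, and the model ignores where within a tick the request happens to land.

```c
#include <stdint.h>

#define TICK_NS 32000000ull  /* one clock tick: 32 ms, the figure quoted above */

/* Nominal delay produced by a tick-based timer for a request of
   wanted_ns nanoseconds: the request rounded up to whole ticks,
   with a minimum of one tick. */
uint64_t tick_timer_delay_ns(uint64_t wanted_ns)
{
    uint64_t ticks = (wanted_ns + TICK_NS - 1) / TICK_NS;
    if (ticks == 0)
        ticks = 1;
    return ticks * TICK_NS;
}
```

Under this model, a 500-nanosecond request costs a full 32,000,000-nanosecond tick, which is 64,000 times longer than the delay that is actually needed.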

DevIODelay Macro
To create the desired delay on systems with faster Intel processors, a new macro called DevIODelay was created, and the device drivers were changed to use it.

To make the DevIODelay macro more usable, it accepts a parameter that specifies the register to use. If no register is specified, the AX register is used. There are two versions of the macro: one for 16-bit code, called DevIODelay, and another for 32-bit code, called DevIODelay32. The OS/2 device driver model is a 16-bit model, and the examples in this article use the 16-bit version, DevIODelay. The example in Sample Code 2 passes the BX register to DevIODelay:

''Sample Code 2. Code example using DevIODelay macro''

The new macro replaces the jmp $+2 instruction with the code shown in Sample Code 3. A delay count (DOSIODELAYCNT) is moved into a general-purpose register, and the value in the register is decremented until it reaches zero. The benefit of using these few instructions is that the code generates the desired delay and is not optimized away in the prefetch queue of the Intel processor.

''Sample Code 3. Code inserted by DevIODelay macro''
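As a rough illustration of the countdown idea, here is a loose C analogue of such a loop. It is only a sketch: the real macro is assembly code whose count is patched in by the loader, and io_delay_spin is a hypothetical name that does not appear in the DDK.

```c
#include <stdint.h>

/* Count down from `count` to zero, mirroring the load, decrement,
   and test sequence described above. `volatile` keeps the compiler
   from deleting the otherwise empty loop. Returns the number of
   decrements performed so the behavior can be checked. */
uint32_t io_delay_spin(uint16_t count)
{
    volatile uint16_t remaining = count;
    uint32_t decrements = 0;

    while (remaining != 0) {
        remaining--;
        decrements++;
    }
    return decrements;
}
```

The time this loop takes still depends on the processor, which is why the count itself has to be chosen per machine rather than hard-coded.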

The DevIODelay macro is available in the \DDK\INC\IODELAY.INC include file on the Developer Connection Device Driver Kit for OS/2 CD-ROM.

The real trick to the DevIODelay macro is determining what value to specify for the delay count on a particular processor. The goal was to determine this value once rather than require that each device driver calculate it. Early in the OS/2 boot process, the system, using the clock chip, calculates the elapsed time required to execute the loop on the current processor. The system then calculates the appropriate delay count to generate a 500-nanosecond delay.
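The calibration arithmetic can be sketched in C as follows. This is an assumption about how such a calculation might look, not the code OS/2 uses: the function name, the rounding rule, and the use of 0 to signal a missing measurement are all illustrative.

```c
#include <stdint.h>

#define TARGET_DELAY_NS 500u  /* settle time quoted in this article */

/* Given that `iterations` passes of the delay loop took `elapsed_ns`
   nanoseconds, return the loop count needed to cover TARGET_DELAY_NS.
   The division rounds up so the delay is never shorter than the
   target. Returns 0 if there is no usable measurement. */
uint32_t delay_count_for(uint64_t iterations, uint64_t elapsed_ns)
{
    if (iterations == 0 || elapsed_ns == 0)
        return 0;

    uint64_t count =
        (TARGET_DELAY_NS * iterations + elapsed_ns - 1) / elapsed_ns;
    return (uint32_t)count;
}
```

For example, if 1,000,000 iterations take 20,000,000 nanoseconds (20 nanoseconds per iteration), the function returns 25. A faster processor that completes the same run in 2,000,000 nanoseconds gets a count of 250.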

Module Definition File Changes
The next obstacle was making the delay count value available to device drivers. The best alternative was to have the loader resolve the DOSIODELAYCNT reference at load time.

The linker is informed about DOSIODELAYCNT through the IMPORTS statement in the module definition file (see Sample Code 4). The linker places fixup, or relocation, information about DOSIODELAYCNT in the fixup section of the device driver. When the device driver is loaded, the loader resolves the fixups and places the correct delay count value inside the code inserted by the DevIODelay macro.

''Sample Code 4. Required import statement for linker module definition file''

Summary
With the information in this article, you can begin to use this software delay mechanism. It relieves you of having to develop your own mechanism and gives all device drivers a single, processor-speed independent way to generate software delays.