An Accurate Software Delay for OS/2 Device Drivers

From EDM2
Category: Developer Connection News Volume 8

Latest revision as of 23:58, 9 November 2022

by Frank J. Schroeder and Allen Wynn

If you watch the personal computer market, you know that computer manufacturers are always coming out with faster computers. Processor speed seems to double every 12 to 18 months, and there is no sign of this trend letting up. Although computer users eagerly await the next leap in computing speed, device driver developers must ensure that faster instruction completion does not cause problems for their device drivers.

Back in the days of the IBM Personal Computer AT, programmers realized the processor could execute back-to-back In and Out (I/O) instructions to the same port faster than I/O chips could handle them. The I/O chips need time (500 nanoseconds) to settle between consecutive accesses. The Technical Reference for the IBM PC/AT recommends placing a jmp short $+2 instruction between I/O instructions when the same I/O chip is accessed.

As time went on, hardware engineers looked for ways to improve execution efficiency of the processor. Two notable improvements were to increase the chip speed (MHz) and to decrease the number of clocks required for many instructions. Intel also optimized its processors in regard to the prefetch queue. If an unconditional jump could be handled in the prefetch queue, the processor would increment the instruction pointer rather than execute the instruction. Although this optimization (and countless others) dramatically improved execution time, it caused programmers to look for other methods to generate a consistent 500-nanosecond delay.

OS/2 provides a processor-speed independent function that device drivers can use to generate accurate software delays rather than having each device driver create its own delay mechanism.

IODelay Macro, an Early Solution

In OS/2, the jmp $+2 instruction exists in a macro called iodelay. Each device driver invokes iodelay where required, and the macro inserts into the code the jmp $+2 instruction. Sample Code 1 illustrates the use of the iodelay macro:

INCLUDE iodelay.inc             ; IODELAY MACROS

 mov     al,STROBELOW           ; GET STROBE LOW BIT
 out     dx,al                  ; SET THE STROBE LOW
 iodelay                        ; I/O DELAY - 500ns
 mov     al,STROBEHIGH          ; GET STROBE HIGH BIT
 out     dx,al                  ; SET THE STROBE HIGH

Sample Code 1. Code example using iodelay macro

Although the iodelay macro met the needs of early OS/2 developers, it was clear that another solution was needed to generate accurate delays as processor speed increased.
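
The scaling problem can be sketched with a little arithmetic. The C fragment below is only an illustration: the function name instruction_ns and the clock counts and frequencies used with it are hypothetical and are not taken from any processor manual.

```c
#include <stdint.h>

/* Time, in nanoseconds, taken by an instruction that needs `clocks`
 * processor cycles on a CPU running at `mhz` megahertz.
 * One cycle at f MHz lasts 1000/f nanoseconds.
 * (Hypothetical helper, for illustration only.) */
static double instruction_ns(uint32_t clocks, double mhz)
{
    return (double)clocks * 1000.0 / mhz;
}
```

Under these assumptions, an instruction that costs 4 clocks lasts 500 ns at 8 MHz but only 62.5 ns at 64 MHz, so a delay built from a fixed instruction shrinks as the clock rate rises.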

Hardware/Software Timers

The most obvious solution to the problem of generating a processor-speed independent delay is hardware timers. Under OS/2's multitasking environment, it is not feasible to have more than one device driver manipulate the hardware timers. No device driver would be sure whether the hardware timers are already in use by another device driver.

Software timers are also inadequate. The DevHlp_SetTimer function has a granularity of one clock-tick (32 milliseconds), which is much larger than the 500-nanosecond delay needed. The timer handler also executes at interrupt time, not task time, so additional problems can occur.
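
To put the granularity mismatch in numbers, the following C sketch rounds a requested wait up to whole clock ticks. It uses the 32-millisecond tick quoted above; the names TICK_NS, DELAY_NS, and timer_wait_ns are made up for this illustration and are not part of any OS/2 interface.

```c
#include <stdint.h>

#define TICK_NS  32000000ULL  /* one clock tick: 32 ms, as stated in the article */
#define DELAY_NS 500ULL       /* settle time the I/O chip needs: 500 ns */

/* Shortest wait, in nanoseconds, that a tick-based timer can deliver
 * when asked for `wanted_ns`: the request rounded up to whole ticks.
 * (Hypothetical helper, for illustration only.) */
static uint64_t timer_wait_ns(uint64_t wanted_ns)
{
    uint64_t ticks = (wanted_ns + TICK_NS - 1) / TICK_NS;
    return ticks * TICK_NS;
}
```

With these figures, a request for 500 ns is rounded up to one full tick of 32,000,000 ns, which is 64,000 times longer than the delay actually needed.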

DevIODelay Macro

In an effort to create the desired delay on a faster Intel processor system, it was decided to create a new macro called DevIODelay and to change the device drivers to use this new macro.

To make the DevIODelay macro more usable, a parameter can be passed to the macro to specify which register to use. If no register is specified, the AX register is used. There are two versions of the macro: one for 16-bit code, called DevIODelay, and another for 32-bit code, called DevIODelay32. The OS/2 device driver model is a 16-bit model, and the examples in this article use the 16-bit version of DevIODelay. The example in Sample Code 2 passes the BX register to DevIODelay:

INCLUDE iodelay.inc              ; IODELAY MACROS

 mov         al,STROBELOW        ; GET STROBE LOW BIT
 out         dx,al               ; SET THE STROBE LOW
 DevIODelay  <bx>                ; I/O DELAY - 500ns
 mov         al,STROBEHIGH       ; GET STROBE HIGH BIT
 out         dx,al               ; SET THE STROBE HIGH

Sample Code 2. Code example using DevIODelay macro

The new macro replaces the jmp $+2 instruction with the code shown in Sample Code 3. A delay count (DOSIODELAYCNT) is moved into a general-purpose register, and the value in the register is decremented until it reaches zero. The benefit of these few instructions is that the loop generates the desired delay and, unlike the unconditional jump, is not optimized away in the prefetch queue of the Intel processor.

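EXTRN   DOSIODELAYCNT:ABS       ; DELAY COUNT, RESOLVED BY THE LOADER

mov     ax, DOSIODELAYCNT       ; LOAD THE DELAY COUNT
top:
dec     ax                      ; COUNT DOWN
jnz     top                     ; LOOP UNTIL THE COUNT REACHES ZERO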

Sample Code 3. Code inserted by DevIODelay macro

The DevIODelay macro is available in the \DDK\INC\IODELAY.INC include file on the Developer Connection Device Driver Kit for OS/2 CD-ROM.

The real trick to the DevIODelay macro is determining what value to specify for the delay count on a particular processor. The goal was to determine this value once rather than require that each device driver calculate it. Early in the OS/2 boot process, the system, using the clock chip, calculates the elapsed time required to execute the loop on the current processor. The system then calculates the appropriate delay count to generate a 500-nanosecond delay.
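
The article does not describe the calibration code itself, so the following C fragment is only a sketch of the arithmetic such a step might perform. It assumes the system has timed some number of passes through the dec/jnz loop against the clock chip; the function name delay_count and its parameters are hypothetical.

```c
#include <stdint.h>

/* Given that `iterations` passes of the dec/jnz loop took `elapsed_ns`
 * nanoseconds, return the loop count needed to wait at least `target_ns`.
 * The result is rounded up so the delay is never too short, and it is
 * kept in the range 1..0xFFFF because the loop uses a 16-bit register
 * and a count of 0 would wrap around and run 65536 times.
 * (Hypothetical helper, not the OS/2 kernel's actual algorithm.) */
static uint16_t delay_count(uint64_t iterations, uint64_t elapsed_ns,
                            uint64_t target_ns)
{
    uint64_t count;

    /* An unusable measurement: fall back to the longest delay. */
    if (iterations == 0 || elapsed_ns == 0)
        return 0xFFFF;

    /* count = ceil(target_ns * iterations / elapsed_ns) */
    count = (target_ns * iterations + elapsed_ns - 1) / elapsed_ns;

    if (count < 1)
        count = 1;
    if (count > 0xFFFF)
        count = 0xFFFF;
    return (uint16_t)count;
}
```

For example, if one million passes through the loop are measured at 10 milliseconds (10 ns per pass), the sketch yields a count of 50 for a 500-nanosecond delay. Rounding up errs on the side of a slightly longer delay, which is harmless for an I/O chip that only needs a minimum settle time.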

Module Definition File Changes

The next obstacle that needed to be overcome was how to make the delay count value available to device drivers. The best alternative was to have the loader resolve the DOSIODELAYCNT reference at load time.

The linker is informed about DOSIODELAYCNT by the IMPORTS statement in the module definition file (see Sample Code 4). The linker places fixup, or relocation, information about DOSIODELAYCNT in the fixup section of the device driver. When the device driver is loaded, the loader resolves the fixups and places the correct delay count value inside the code inserted by the DevIODelay macro.

LIBRARY DRIVER1
PROTMODE
SEGMENTS
    DSEG     CLASS 'DATA'
    CSEG     CLASS 'CODE'
    SWAPDATA CLASS 'DATA' IOPL
    SWAPCODE CLASS 'CODE' IOPL
IMPORTS
    DOSIODELAYCNT = DOSCALLS.427

Sample Code 4. Required import statement for linker module definition file
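
The effect of resolving the fixup can be pictured with a toy model in C. The bytes below are the standard 16-bit encoding of the loop in Sample Code 3 with a zero placeholder for the count. The function apply_fixup, the IMM16_OFFSET constant, and the idea of a fixup as a bare offset are simplifying assumptions made for this sketch; the real OS/2 fixup records and loader are more involved.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit machine code for the delay loop, before the loader runs:
 *   B8 lo hi   mov ax, imm16   (imm16 is the placeholder to patch)
 *   48         dec ax
 *   75 FD      jnz -3          (back to the dec ax) */
static const uint8_t delay_loop_template[6] = {
    0xB8, 0x00, 0x00, 0x48, 0x75, 0xFD
};

#define IMM16_OFFSET 1  /* position of the immediate operand in the template */

/* Simplified stand-in for a load-time fixup: store `value` as a
 * little-endian 16-bit word at `offset` within `code`.
 * Returns 0 on success, -1 if the word would not fit in the buffer. */
static int apply_fixup(uint8_t *code, size_t len, size_t offset,
                       uint16_t value)
{
    if (len < 2 || offset > len - 2)
        return -1;
    code[offset]     = (uint8_t)(value & 0xFF);
    code[offset + 1] = (uint8_t)(value >> 8);
    return 0;
}
```

After the patch, the mov instruction carries the delay count as an immediate operand, so the driver does not need a memory access or a function call to obtain it at run time.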

Summary

With the information in this article, you can begin to use this software delay mechanism. Because OS/2 provides the mechanism, you do not have to develop your own, and device drivers can standardize on a single processor-speed independent software delay.

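References

* IBM Personal Computer AT Technical Reference
* IBM Personal System/2 Hardware Interface Technical Reference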

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation