DDDR/2 - S3 Display Driver

Display Device Driver Reference

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation

This chapter explains the architecture of the S3 display device driver. Because the S3 driver was originally developed from the XGA and 8514 driver source code, the relevant features of those drivers are explained as well. The purpose of this document is to help you understand the portions of the driver that must be modified to support other graphics-accelerator chip sets.

Overview

The S3 display device driver for OS/2 2.1 (and later) consists of approximately 5.8MB of C-language and assembler-language source code spread over approximately 263 files.

Use the S3 display driver as a starting point when developing a presentation display driver for a graphics chip. The S3 driver is written in a combination of C and assembler language with almost half of the source files coded in C. The C-language modules control driver initialization, operating system-dependent functions, and the device-independent setup required by most presentation display driver entry points. Device-dependent and high-performance code are controlled by the assembler-language modules.

The S3 driver was specifically designed for fixed-function video-accelerator chips. It also takes advantage of features such as hardware-assisted bitblt, hardware-line drawing, MMPM/2, fast-monochrome data expansion, bit-map caching and font caching.

The S3 driver offers a number of features not offered by the SVGA driver. It supports multiple resolutions and color depths in a single driver. Further, it supports static-mode change, which makes changing resolution and color depth as easy as changing the desktop color scheme.

Most of the hardware-dependent code is isolated into a handful of assembler-language files. These files have straightforward functions that can be implemented on most accelerator chips. Higher level functions that are also hardware-dependent, such as bit map and font caching, are written in C.

The XGA Driver

The S3 driver was derived from the 32-bit 8514/A presentation display driver for OS/2. That driver was, in turn, derived from the 32-bit XGA presentation display driver. Because the design of the driver was strongly influenced by the features offered by the XGA and, to some extent, the 8514/A, it is necessary to understand certain aspects of the XGA and 8514/A presentation display drivers.

XGA Pixmaps

The pixmap is a fundamental element of the XGA presentation display driver and the XGA hardware. The pixmap is a description of the height, width, and color depth of a bit map in either system memory or video memory. (The XGA presentation display driver exploits bus-mastering, which allows it to operate on pixmaps in system memory or video memory. Note that in this document, VRAM refers to the memory on the video card.) The 8514 and S3 source code refer to bit maps as if they were XGA pixmaps, even though neither graphics chip set directly supports pixmaps. The following example of pixmap code appears in the EDDHBBLT.ASM file:

; select the destination map
        memregwrite     pi_map_index_A, SEL_PIX_MAP_A

; write destination bit-map address to hardware mapA base
        mov             eax, [ebx].bitmap_address
        IFDEF  _8514
        IFDEF  HARD_DRAW
        and     eax,0fffffffh   ; clear the vram marker
        ENDIF
        ENDIF
        memregwrite     pi_map_base_ptr_A, eax

; write mapA width and height from bit-map header
        mov             eax, dword ptr [ebx].hw_width
        memregwrite     pi_map_size_A, eax

; write mapA format from bit-map header
        mov             al, byte ptr [ebx].hw_format
        memregwrite     pi_map_format_A, al

The memregwrite assembler macros located throughout the source code are used in the XGA display driver to write to XGA-hardware registers. Except for the section inside the IFDEF _8514 block, this code is identical to that found in the XGA display driver. To use this driver source, an understanding of the XGA architecture is essential, even if the graphics chip set is nothing like an XGA chip set.

The following portion of the MMXGAReg structure shows the registers that define a pixmap for the XGA driver:

typedef struct _MMREG {
       .
       .
       .
volatile BYTE    pixmapIndex;
volatile BYTE    bNotUsed5;
volatile ULONG   pixmapBase;
volatile USHORT  pixmapWidth;
volatile USHORT  pixmapHeight;
volatile BYTE    pixmapFormat;
       .
       .
       .
} MMReg;

PixMapIndex selects which of the pixmaps is being defined. The XGA chipset implements four pixmaps: pixmaps A, B, C, and a mask map that must be monochrome. The mask map is primarily used for non-rectangular scissoring, and is not ordinarily used in the XGA presentation display driver. The other maps can be used as either source, destination, or pattern maps.

PixMapBase is the base address of the pixmap in either system or video memory. The XGA display driver refers to all objects by their address (even cached objects in off-screen VRAM), rather than by the X-Y coordinates of the object in VRAM. This addressing scheme is fundamental to the design of the XGA display driver. The XGA chip has a memory management unit that allows it to know whether a given pixmap is in VRAM or in system memory. Everything has a virtual address, from the glyph cached in VRAM to the largest system-memory bit map. Objects in VRAM, the character cache, and the Phunk (see The Physical Chunk and Bus-Master Operations, for further information), also have physical addresses associated with them.

PixMapWidth is the actual width of the bit map in pels minus one pel. To simplify the caching of objects in off-screen VRAM, the XGA driver stores bit maps as linear arrays of bytes in off-screen video memory or system memory.

PixMapHeight is the actual height of the bit map in pels minus one pel.

PixMapFormat is the size of the pels in the pixmap, as well as a flag indicating whether the pixmap is stored in Motorola** or Intel** byte-ordering format (the XGA chip set can handle either format). Pel depths can be 1, 4, 8, 16, or 24 bits.
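The register descriptions above can be sketched in C as a small helper that derives pixmap register values from a bit-map header. This is only an illustration: the BitmapHeader and PixmapRegs types, the pixmap_from_header() function, and the FMT_ format codes are hypothetical names and values invented for the sketch, not the definitions used by the driver or the encoding used by the XGA hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format codes, for illustration only; NOT the real
 * hardware encoding. */
enum {
    FMT_1BPP = 0, FMT_4BPP = 1, FMT_8BPP = 2, FMT_16BPP = 3, FMT_24BPP = 4,
    FMT_MOTOROLA = 0x80   /* assumed byte-order flag */
};

/* Hypothetical bit-map header; field names are illustrative. */
typedef struct {
    uint32_t address;   /* base address of the bit map      */
    uint16_t width;     /* width in pels                    */
    uint16_t height;    /* height in pels                   */
    uint8_t  bpp;       /* bits per pel: 1, 4, 8, 16, or 24 */
} BitmapHeader;

/* Image of the registers that define one pixmap. */
typedef struct {
    uint8_t  pixmapIndex;
    uint32_t pixmapBase;
    uint16_t pixmapWidth;    /* width  in pels minus one */
    uint16_t pixmapHeight;   /* height in pels minus one */
    uint8_t  pixmapFormat;
} PixmapRegs;

/* Map a pel depth to one of the assumed format codes. */
static uint8_t depth_code(uint8_t bpp)
{
    switch (bpp) {
    case 1:  return FMT_1BPP;
    case 4:  return FMT_4BPP;
    case 8:  return FMT_8BPP;
    case 16: return FMT_16BPP;
    case 24: return FMT_24BPP;
    default: assert(!"unsupported pel depth"); return FMT_8BPP;
    }
}

/* Build the register image for one pixmap from a bit-map header. */
static PixmapRegs pixmap_from_header(const BitmapHeader *h, uint8_t index,
                                     int motorola)
{
    PixmapRegs r;

    assert(h->width > 0 && h->height > 0);
    r.pixmapIndex  = index;
    r.pixmapBase   = h->address;
    r.pixmapWidth  = (uint16_t)(h->width - 1);
    r.pixmapHeight = (uint16_t)(h->height - 1);
    r.pixmapFormat = (uint8_t)(depth_code(h->bpp) |
                               (motorola ? FMT_MOTOROLA : 0));
    return r;
}
```

For a 640 x 480 bit map at 8 bits-per-pel, the helper yields a width value of 639 and a height value of 479, matching the "minus one pel" convention described above.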

The displayed portion of video memory (display surface) is a pixmap to the XGA driver. Portions of it may be addressed by using the XGA driver's X-Y source or destination registers. The information the XGA driver regards as a pixmap is very similar to that contained in a presentation display bit-map header. (In fact, the previous source code example consisted of copying elements out of the bit-map header.)

Fundamental to the XGA driver's design is that it can perform BitBlts and other accelerator operations on arbitrary bit maps. Because of this design, much of the XGA driver code defines bit maps in system memory as pixmaps. The XGA driver's memory-management scheme is versatile enough to allow the bit-map drawing code in the XGA, 8514/A, and S3 drivers to interpret system-memory images of XGA registers, and determine which drawing operations to perform on bit maps.

XGA Registers

The XGA driver supports both I/O-mapped and memory-mapped registers. The I/O-mapped XGA registers are used during the initialization of the card and during video-mode setting. The memory-mapped registers control the drawing operations of the XGA driver. Consequently, the XGA presentation display driver rarely uses an assembler out instruction. The only common driver operation that performs port I/O is loading the XGA driver's palette RAMDAC in 256-color mode. The XGA presentation display driver takes advantage of this by running at ring 3. Code running at ring 3 is not allowed to perform port I/O. However, applications and the OS/2 graphics engine (GRE) run at ring 3. Therefore, by running at ring 3, the XGA presentation driver avoids a ring transition on every call into the driver that actually touches the hardware. Using memory-mapped registers is also faster than using port I/O, which tends to be slow in protect mode on Intel 80x86 microprocessors.

The XGA presentation display driver does not use a 32-bit flat pointer to its memory-mapped registers. Instead, it obtains a selector that maps to its memory-mapped registers from its ring 0 driver, XGA.SYS. To find out where this process occurs, look in the EDDEFPDB.C file at the QueryAdapter() function. QueryAdapter() makes two IOCtl calls to the XGA.SYS module. The first call obtains the selector to the memory-mapped XGA registers. (Look in the XGAADAPT.INC file for the details of what is returned by these two IOCtl calls.) For other types of adapters with memory-mapped registers, it is possible to make IOCtl calls to the PMDD.SYS module and obtain a flat pointer to memory-mapped registers. The only place in the presentation driver where a flat pointer should not be used is the code that updates the cursor at interrupt level. (This code is located in the EDDCURSR.ASM file.) Because the cursor code is called at interrupt time, you cannot determine the context in which it will run, and the flat pointer to the memory-mapped registers may be invalid in that context. Consequently, it is convenient to use a ring 0 GDT selector in the cursor-movement code. See Driver Initialization for more information on flat pointers to video memory.

Shadow XGA Registers

The following piece of code initializes the foreground and background colors for a particular blt. It is a portion of the XGA code from the EDDNBBLT.C file:

       /**************************************************************/
       /* set the foreground and background colors for the blt.      */
       /* These are taken from the destination attribute bundle      */
       /* unless color information was passed in the parameters      */
       /**************************************************************/

if (ArgOptions & BLTMODE_ATTRS_PRES)
{
    ShadowXGARegs.FgCol = (USHORT)LogToPhyIndex(ArgAttrs->lColor);
    ShadowXGARegs.BgCol = (USHORT)LogToPhyIndex(ArgAttrs->lBackColor);
}
else /* use attribute bundle colors */
{
    ShadowXGARegs.FgCol = (USHORT)pdc->DCIImagColatts.ForeColor;
    ShadowXGARegs.BgCol = (USHORT)pdc->DCIImagColatts.BackColor;
}

The ShadowXGARegs structure is declared as type MMReg. The definition for MMReg appears in the XGAADAPT.H file and is as follows:

typedef struct _MMReg
    {

       /**************************************************************/
       /* Fields are declared volatile to prevent the compiler from  */
       /* generating code that writes a value to a register and then */
       /* reads it back expecting it to be the same.                 */
       /* However, volatile is not implemented by c5.1.              */
       /**************************************************************/
        volatile ULONG   PageDirBaseAdd;

        volatile ULONG   CurrVirtAddr;

        volatile BYTE    bNotUsed1;
        volatile BYTE    ExtPolling;
        volatile USHORT  bNotUsed2;

        volatile BYTE    StateAlen;
        volatile BYTE    StateBlen;
        volatile USHORT  usNotUsed3;

        volatile BYTE    bNotUsed4;
        volatile BYTE    PIControl;

   /**********************************************************************/
   /* These pixmap registers will not be written to when in software     */
   /* mode                                                               */
   /**********************************************************************/

        volatile BYTE    pixmapIndex;
        volatile BYTE    bNotUsed5;

        volatile ULONG   pixmapBase;

        volatile USHORT  pixmapWidth;
        volatile USHORT  pixmapHeight;

        volatile BYTE    pixmapFormat;
        volatile BYTE    abNotUsed6[3];

   /**********************************************************************/
   /* End of pixmap registers                                            */
   /**********************************************************************/


        volatile USHORT  BresErrTerm;
        volatile USHORT  usNotUsed7;

        volatile USHORT  BresK1;
        volatile USHORT  usNotUsed8;

        volatile USHORT  BresK2;
        volatile USHORT  usNotUsed9;

        volatile ULONG   DirSteps;

        volatile BYTE    bFifthStep;

/* used by software simulation only */

        volatile BYTE    abNotUsed10[23];

        volatile BYTE    FgMix;
        volatile BYTE    BgMix;
        volatile BYTE    ColCompCond;
        volatile BYTE    bNotUsed11;

        volatile ULONG   ColCompVal;

        volatile ULONG   PlaneMask;

        volatile ULONG   CarryChMask;

        volatile USHORT  FgCol;
        volatile USHORT  FgColHi;

        volatile USHORT  BgCol;
        volatile USHORT  BgColHi;

        volatile USHORT  OpDim1;
        volatile USHORT  OpDim2;

        volatile ULONG   ausNotUsed12[2];

        volatile USHORT  MaskXOffset;
        volatile USHORT  MaskYOffset;

        volatile USHORT  SrcXAddr;
        volatile USHORT  SrcYAddr;

        volatile USHORT  PatXAddr;
        volatile USHORT  PatYAddr;

        volatile SHORT   DstXAddr;
        volatile SHORT   DstYAddr;

        volatile ULONG   PixOp;

    /*********************************************************************/
    /* Here, start extra pixmap definitions, one each for A,B,C and the  */
    /* mask.  The arrangement of these is an exact image of the real     */
    /* registers. (As a result, padding was included).                   */
    /* Note that these 'shadow' registers are always in memory so        */
    /* need not be declared as volatile.                                 */
    /*********************************************************************/

        BYTE    pixmapIndexA;
        BYTE    bPaddingA;
        ULONG   pixmapBaseA;
        USHORT  pixmapWidthA;
        USHORT  pixmapHeightA;
        BYTE    pixmapFormatA;
        BYTE    bPaddingA2;

        BYTE    pixmapIndexB;
        BYTE    bPaddingB;
        ULONG   pixmapBaseB;
        USHORT  pixmapWidthB;
        USHORT  pixmapHeightB;
        BYTE    pixmapFormatB;
        BYTE    bPaddingB2;

        BYTE    pixmapIndexC;
        BYTE    bPaddingC;
        ULONG   pixmapBaseC;
        USHORT  pixmapWidthC;
        USHORT  pixmapHeightC;
        BYTE    pixmapFormatC;
        BYTE    bPaddingC2;

        BYTE    pixmapIndexM;
        BYTE    bPaddingM;
        ULONG   pixmapBaseM;
        USHORT  pixmapWidthM;
        USHORT  pixmapHeightM;
        BYTE    pixmapFormatM;
        BYTE    bPaddingM2;

    } MMReg;

The ShadowXGARegs structure is a system-memory image of values that ultimately will be written to the XGA chip set. After all of the registers needed for an operation are initialized in the structure, the driver calls the TransferShadowRegisters function, which is located in the assembler-language file HWACCESS.ASM. TransferShadowRegisters copies the information from the ShadowXGARegs structure to the XGA memory-mapped registers. It is necessary to check that the XGA is not busy before writing to its registers, as the XGA has no register FIFO. On the original XGA graphics chip set, however, polling the Control register while waiting for the chip to complete a graphics operation slowed the chip down. To some extent, the ShadowXGARegs structure serves as a software FIFO by delaying the actual hardware register writes for as long as possible.
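The shadow-register idea can be sketched in C against a simulated adapter. This is a minimal sketch under stated assumptions, not the HWACCESS.ASM implementation: the Regs structure is a cut-down stand-in for MMReg, SimAdapter and adapter_busy() are hypothetical stand-ins for the hardware and its busy check, and the numeric values given to the TSR_ flags are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Flag names follow the driver excerpts; the values are assumed. */
enum { TSR_COLOUR_MIX = 0x01, TSR_COORDINATES = 0x02, TSR_PIXELOP = 0x04 };

/* Cut-down register image; the real MMReg has many more fields. */
typedef struct {
    uint8_t  FgMix, BgMix;
    uint16_t FgCol, BgCol;
    uint16_t SrcXAddr, SrcYAddr;
    int16_t  DstXAddr, DstYAddr;
    uint32_t PixOp;
} Regs;

/* Hypothetical simulated adapter: busy for a fixed number of polls. */
typedef struct {
    Regs     regs;            /* stands in for the mapped registers   */
    unsigned busyPollsLeft;   /* polls that will still report "busy"  */
    unsigned opsStarted;      /* operations kicked off by PixOp       */
} SimAdapter;

static int adapter_busy(SimAdapter *hw)
{
    if (hw->busyPollsLeft == 0)
        return 0;
    hw->busyPollsLeft--;
    return 1;
}

/* Copy the selected register groups from the shadow image to the
 * adapter. With no register FIFO, the adapter must be idle first. */
static void transfer_shadow_registers(const Regs *shadow, SimAdapter *hw,
                                      unsigned flags)
{
    assert(shadow != 0 && hw != 0);

    while (adapter_busy(hw)) {
        /* wait for the previous operation to complete */
    }
    if (flags & TSR_COLOUR_MIX) {
        hw->regs.FgMix = shadow->FgMix;
        hw->regs.BgMix = shadow->BgMix;
        hw->regs.FgCol = shadow->FgCol;
        hw->regs.BgCol = shadow->BgCol;
    }
    if (flags & TSR_COORDINATES) {
        hw->regs.SrcXAddr = shadow->SrcXAddr;
        hw->regs.SrcYAddr = shadow->SrcYAddr;
        hw->regs.DstXAddr = shadow->DstXAddr;
        hw->regs.DstYAddr = shadow->DstYAddr;
    }
    if (flags & TSR_PIXELOP) {
        /* Writing the pixel operation last starts the operation. */
        hw->regs.PixOp = shadow->PixOp;
        hw->opsStarted++;
    }
}
```

Because the caller fills in the shadow image first and transfers it in one step, the busy check happens once per operation rather than once per register write.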

The ShadowXGARegs structure is also used by the portion of the presentation display driver that draws to bit maps and emulates an XGA. Thus, all of the drawing code in the XGA driver for bit maps and the display is identical up to the point at which the actual drawing command is issued to the XGA driver's command register. (The XGA presentation display driver refers to the command register as the Pixel_Op.) The following is an example of the pixel operation from the EDDHGCHS.ASM file, the character drawing code. This example draws a character either on the XGA display surface or to a bit map in system memory.

     ;********************************************************************
     ; Before updating the hardware, determine
     ; its availability and write the destination coordinates.
     ;********************************************************************

        waitshort
        memregwrite        dest_map, eax

     ;********************************************************************
     ; Changes to the processing of bold simulation means that
     ; the foreground mix and color must be written each time.
     ;********************************************************************
ifndef _8514
       mov             al, AIxfer.bfgmix
       memregwrite     fg_mix, al
       mov             ax, AIxfer.usFColor
       memregwrite     fg_colour, ax
endif

     ;********************************************************************
     ; Source is x = left clip adjustment (currently in cx),
     ; y = cell height - 1
     ;********************************************************************

        mov     ax, ptsCharHWsize.y
        shl     eax, 16
        mov     ax, cx
        memregwrite     patt_map, eax

     ;********************************************************************
     ; Write the x dimension to the hardware.
     ;********************************************************************

        memregwrite     dim1, dx

     ;********************************************************************
     ; Write the character width to the hardware.
     ;********************************************************************

        swap   edx
        memregwrite       pi_map_width_C,dx

     ;********************************************************************
     ; Set up the source address.
     ;********************************************************************

        mov     eax, pGlyphImage
        memregwrite     pi_map_base_ptr_C, eax

     ;********************************************************************
     ; Set the pixel_op and kick off the blt
     ; -  background source: background color
     ; -  foreground source: foreground color
     ; -  step: PxBlt
     ; -  source pixel map:
     ; -  destination pixel map: Map A
     ; -  pattern pixel map: Map C
     ; -  mask pixel map: boundary disabled
     ; -  drawing mode:
     ; -  direction octant: left to right, bottom to top
     ;********************************************************************

         memregwrite        pixel_op,     08013002h

IFDEF HARD_DRAW
ELSE ; SOFT_DRAW
        saveregs
        call    _eddf_MESS
        restoreregs
ENDIF ; SOFT_DRAW

Soft-Draw Mode and Hard-Draw Mode

All presentation display drivers have two major but only marginally related functions: drawing to the display, and drawing to memory bit maps. This dual-mode drawing architecture was resolved by having the bit-map drawing code emulate the XGA hardware. Thus, all of the code needed to set up a drawing operation would be almost identical for the display and for bit maps. Hard-draw mode, therefore, is the mode in which the driver will write data to the XGA adapter, while Soft-draw mode writes only to system-memory bit maps.

Note: The bit-map drawing code in the XGA driver emulates a subset of the XGA's hardware drawing capabilities.

In the previous example, Hard-draw mode was used. The following example from the EDDNBBLT.C file shows how the driver selects Soft-draw mode when the destination is not the display:

    if (AIxfer.pbmhDest == &DirectListEntry)
    {
        SetDrawModeHard;
    }
    else
    {
        SetDrawModeSoft;
    }

The following example is from the eddt_CacheCharacter() function in the EDDNGCHS.C file, which is part of the glyph-caching code:

      ShadowXGARegs.FgMix = HWMIX_SOURCE;
      ShadowXGARegs.BgMix = HWMIX_SOURCE;

      ShadowXGARegs.ColCompCond = COLCOMP_ALWAYS;

     /**************************************************************/
     /* Set up destination pixmap details.                         */
     /**************************************************************/

      ShadowXGARegs.pixmapBaseA = pVRAMCacheStart + offNextFree11Pos;
      ShadowXGARegs.pixmapWidthA  = xGlyphWidthHW;
      ShadowXGARegs.pixmapHeightA = yGlyphHeight;
      ShadowXGARegs.pixmapFormatA = ONE_BPP;

     /**************************************************************/
     /* Set up source pixmap details.                              */
     /**************************************************************/

      ShadowXGARegs.pixmapBaseB = pSysCacheStartPhy + offNextFree11Pos;
      ShadowXGARegs.pixmapWidthB  = xGlyphWidthHW;
      ShadowXGARegs.pixmapHeightB = yGlyphHeight;
      ShadowXGARegs.pixmapFormatB = ONE_BPP | MOTOROLA;

     /**************************************************************/
     /* Set up blt details - we want to copy the whole character.  */
     /**************************************************************/

      ShadowXGARegs.SrcXAddr =
      ShadowXGARegs.SrcYAddr =
      ShadowXGARegs.DstXAddr =
      ShadowXGARegs.DstYAddr = 0;

      ShadowXGARegs.OpDim1 = xGlyphWidthHW;
      ShadowXGARegs.OpDim2 = yGlyphHeight;

     /**************************************************************/
     /* Set up the pixel_op to do the blt.                         */
     /**************************************************************/

      ShadowXGARegs.PixOp = BACK_SRC_SRC_PIX_MAP |
                            FORE_SRC_SRC_PIX_MAP |
                            STEP_PXBLT |
                            SRC_PIX_MAP_B |
                            DST_PIX_MAP_A |
                            PAT_PIX_MAP_FORE |
                            MASK_PIX_MAP_OFF |
                            DRAW_MODE_DONTCARE |
                            DIR_OCTANT_LRTB;

     /**************************************************************/
     /* Now do the blt. We have to use the hardware to do this.    */
     /* Set softDrawInUse to false, and then restore it after the  */
     /* blt has been completed.  (Higher level functions may be in */
     /* software-drawing mode, but still keep the VRAM-cache       */
     /* copy up-to-date.)                                          */
     /**************************************************************/

      tempDrawMode = softDrawInUse;
      softDrawInUse = FALSE;
      TransferShadowRegisters( TSR_MAP_A |
                               TSR_MAP_B |
                               TSR_COLOUR_MIX |
                               TSR_COORDINATES |
                               TSR_PIXELOP );
      softDrawInUse = tempDrawMode;

The preceding code example copies a character from system memory into the VRAM cache of the XGA presentation display driver. The TransferShadowRegisters function initiates an XGA bus-mastering BitBlt to copy the character.

The top-level functions in the XGA presentation display driver are written in C language. These functions set up a drawing operation by parsing the data structures passed in from GRE and filling in the ShadowXGARegs structure. At a certain point, they look at the address of the bit map that was passed in by GRE. If the address is the display, they use the SetDrawModeHard macro; otherwise, they use SetDrawModeSoft. In the above example, SetDrawModeHard is executed rather than SetDrawModeSoft. The SetDrawModeHard and SetDrawModeSoft macros, which are located in the EDDMACRO.H file, are defined as follows:

#define SetDrawModeSoft                                                \
{                                                                      \
    if (!softDrawInUse)                                                \
    {                                                                  \
        pXGARegs          = (pMMReg)&ShadowXGARegs;                    \
        LinePatternCur    = LinePatternSys;                            \
        MarkerCur         = MarkerSys;                                 \
        pDrawFunctions    = (pDrawFunctionsTable)softDrawTable;        \
        softDrawInUse     = TRUE;                                      \
    }                                                                  \
}

#define SetDrawModeHard                                                \
{                                                                      \
    if (softDrawInUse && foregroundSession)                            \
    {                                                                  \
        pXGARegs          = pRealXGARegs;                              \
        LinePatternCur    = LinePatternPhys;                           \
        MarkerCur         = MarkerPhys;                                \
        pDrawFunctions    = (pDrawFunctionsTable)hardDrawTable;        \
        softDrawInUse     = FALSE;                                     \
    }                                                                  \
}

First, SetDrawModeHard checks the global flag softDrawInUse to confirm that the driver is currently in software-drawing mode, and checks that the presentation display driver is in the foreground. (A BitBlt operation does not occur in the XGA hardware while the user is running a DOS text-mode application in the foreground.) Next, SetDrawModeHard points pXGARegs to the selector and offset of the XGA driver's memory-mapped registers. (pXGARegs is a pointer used by various assembler-language modules.) It then sets up pointers to the line patterns and markers in XGA video memory. Finally, it sets up the pDrawFunctions table to point to the set of assembler-language functions that draw to the XGA display, and clears the softDrawInUse flag.

Very few XGA drawing operations are initiated by the C code in the XGA driver. Instead, a group of assembler-language functions (pDrawFunctions) are called and they complete all of the real drawing work. These functions are:

eddh_DestOnlyBlt
eddh_DrawText
eddh_PatDestBlt
eddh_PMIMAGEDATA
eddh_PMLINES
eddh_PMPLOTSTEP
eddh_PMSCANLINE
eddh_PMSHORTLINES
eddh_SrcDestBlt
eddf_DestOnlyBlt
eddf_DrawText
eddf_PatDestBlt
eddf_PMIMAGEDATA
eddf_PMLINES
eddf_PMPLOTSTEP
eddf_PMSCANLINE
eddf_PMSHORTLINES
eddf_SrcDestBlt

These functions fall into two groups: one for drawing to the display, and one for drawing to memory. Pointers to these functions are stored in two tables in the EDDFDADA.C file. The hardDrawTable contains pointers to the functions that start with "eddh." The softDrawTable contains pointers to the functions that start with "eddf."
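The mode switch and table dispatch described above can be sketched in C as follows. This is an illustrative sketch only: the two-entry DrawFunctionsTable, the stub worker functions, and the set_draw_mode_soft() and set_draw_mode_hard() functions are simplified stand-ins written for this example (the driver uses macros and assembler-language workers), and the stubs merely report which target they would draw to.

```c
#include <assert.h>

/* What a stub worker reports it would draw to. */
enum { TARGET_BITMAP = 1, TARGET_DISPLAY = 2 };

typedef int (*DrawFn)(void);

/* Simplified table; the driver's table has one entry per worker. */
typedef struct {
    DrawFn SrcDestBlt;
    DrawFn DrawText;
} DrawFunctionsTable;

/* Stub workers standing in for the eddh_ and eddf_ routines. */
static int hard_SrcDestBlt(void) { return TARGET_DISPLAY; }
static int hard_DrawText(void)   { return TARGET_DISPLAY; }
static int soft_SrcDestBlt(void) { return TARGET_BITMAP; }
static int soft_DrawText(void)   { return TARGET_BITMAP; }

static const DrawFunctionsTable hardDrawTable = { hard_SrcDestBlt, hard_DrawText };
static const DrawFunctionsTable softDrawTable = { soft_SrcDestBlt, soft_DrawText };

static const DrawFunctionsTable *pDrawFunctions = &softDrawTable;
static int softDrawInUse     = 1;
static int foregroundSession = 1;

/* Switch to bit-map drawing unless already in that mode. */
static void set_draw_mode_soft(void)
{
    if (!softDrawInUse) {
        pDrawFunctions = &softDrawTable;
        softDrawInUse  = 1;
    }
}

/* Switch to hardware drawing, but only in the foreground session. */
static void set_draw_mode_hard(void)
{
    if (softDrawInUse && foregroundSession) {
        pDrawFunctions = &hardDrawTable;
        softDrawInUse  = 0;
    }
}

/* Callers never name a worker directly; they go through the table. */
static int blt(int destinationIsDisplay)
{
    assert(pDrawFunctions != 0);
    if (destinationIsDisplay)
        set_draw_mode_hard();
    else
        set_draw_mode_soft();
    return pDrawFunctions->SrcDestBlt();
}
```

With this arrangement, the setup code in a caller such as blt() is the same for both targets; only the table behind pDrawFunctions changes. When the session is not in the foreground, set_draw_mode_hard() leaves the soft table in place.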

These functions are important as they represent the bulk of the hardware-dependent code in the driver. The following is a listing of the source files associated with each function:

EDDHBBLT.ASM
  eddh_DestOnlyBlt
  eddh_PatDestBlt
  eddh_SrcDestBlt
  eddf_DestOnlyBlt
  eddf_PatDestBlt
  eddf_SrcDestBlt
EDDHGCHS.ASM
  eddh_DrawText
  eddf_DrawText
EDDHIMAG.ASM
  eddh_PMIMAGEDATA
  eddf_PMIMAGEDATA
EDDHLINE.ASM
  eddh_PMLINES
  eddh_PMPLOTSTEP
  eddf_PMLINES
  eddf_PMPLOTSTEP
EDDHSCAN.ASM
  eddh_PMSCANLINE
  eddf_PMSCANLINE
EDDHSHRT.ASM
  eddh_PMSHORTLINES
  eddf_PMSHORTLINES

Note that each file contains two versions of each routine: one for hardware drawing, and the other for drawing to bit maps. The code for soft-draw and hard-draw is similar in all of these files. Each of these files is assembled twice, once with HARD_DRAW defined, and once with SOFT_DRAW defined. Conditional assembly in the files renames the major functions and inserts calls to the bit-map drawing routine, eddf_MESS. (eddf_MESS is located in the DDFFAST.ASM file.) The remainder of the differences are hidden in the memregread and memregwrite assembler macros. In soft-draw mode, memregread and memregwrite read from and write to the ShadowXGARegs structure. The pointer used by memregread and memregwrite is set up in the movxga macro, which gets the pointer to the registers from pXGARegs. (pXGARegs is set to point to the ShadowXGARegs structure by the SetDrawModeSoft C macro.) Conditional assembly changes the name of the SrcDestBlt function to eddh_SrcDestBlt or eddf_SrcDestBlt, depending on whether the module is built for hard-draw mode or for soft-draw mode. The following is an example from the start of SrcDestBlt in the EDDHBBLT.ASM file:

        pushxga

; get the base address of the hardware registers
        movxga  pXGARegs

; use ebx as a pointer to the destination bit map header
        mov     ebx, AIxfer.pbmhDest

; use edi to count the number of clip regions
        movzx   edi, word ptr AIxfer.cDestClipRects

; select the destination map
        memregwrite   pi_map_index_A, SEL_PIX_MAP_A

In hard-draw mode, pXGARegs points to the XGA registers. Most of the other conditional assembly controls minor differences between hardware-drawing and software-drawing modes (for example, not waiting for completion of the previous hardware-drawing operation in software-drawing mode).
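The effect of routing every register access through one pointer can be sketched in C. This is a hedged illustration rather than the driver's assembler macros: REG_WRITE is a hypothetical C analogue of memregwrite, pRegs is a hypothetical stand-in for the register pointer, and fakeAdapterRegs is an ordinary structure standing in for the memory-mapped registers.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down register image used by both drawing modes. */
typedef struct {
    uint8_t  fg_mix;
    uint16_t fg_colour;
    uint32_t pixel_op;
} Regs;

static Regs shadowRegs;        /* system-memory image (soft-draw)      */
static Regs fakeAdapterRegs;   /* stand-in for the mapped registers    */

/* All register writes go through this pointer. Pointing it at the
 * shadow image or at the adapter selects the drawing mode. */
static volatile Regs *pRegs = &shadowRegs;

/* Hypothetical C analogue of the memregwrite assembler macro. */
#define REG_WRITE(field, value) (pRegs->field = (value))

/* Mode-independent setup code: it does not know where it writes. */
static void set_text_colour(uint8_t mix, uint16_t colour)
{
    assert(pRegs != 0);
    REG_WRITE(fg_mix, mix);
    REG_WRITE(fg_colour, colour);
}
```

Calling set_text_colour() while pRegs points at shadowRegs updates only the system-memory image; after pRegs is pointed at fakeAdapterRegs, the same call updates the stand-in adapter registers instead.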

None of the assembler-language drawing functions are directly called by the XGA driver's C-code functions; they are always called through the pDrawFunctions table. Again, this usually allows the same code to process the setup for either the display or bit-map drawing operations. In the glyph-caching code example, the state of the softDrawInUse flag is preserved, set to FALSE, and then restored. The call to TransferShadowRegisters initiates a BitBlt to copy the glyph into VRAM. TransferShadowRegisters only touches the hardware if the driver is in hard-draw mode, that is, if softDrawInUse is FALSE. The character-caching code maintains identical caches in system memory and off-screen VRAM. Because the bit-map version of the text-drawing code calls the eddt_CacheCharacter() routine (the routine shown in the glyph-caching code example), there will be cases in which the eddt_CacheCharacter routine needs to use the hardware to copy the glyph to VRAM, even though it is in soft-draw mode.

In other words, high-level functions in the driver that are written in C language, such as eddb_BitBlt(), select hard-draw mode or soft-draw mode and then call lower-level C functions such as PixBltThroughClips(). These functions are relatively generic. The values they set up in the ShadowXGARegs structure can be used to initialize the XGA chip set, or they can be interpreted by the eddf_MESS() function to draw to a bit map.

PixBltThroughClips() (and other low-level C functions) perform slightly more setup, call the TransferShadowRegisters function, and then call one of the assembler-language worker functions by way of the pDrawFunctions table. The assembler-language drawing routines are assembled for either hardware drawing or software drawing. The software-drawing versions call eddf_MESS() to draw to bit maps, while the hardware-drawing versions write to the XGA command register, which initiates a drawing operation.

The Physical Chunk and Bus-Master Operations

The XGA driver supports a 64KB, 1MB, or 4MB aperture into VRAM. However, the XGA presentation display driver does not use these apertures. Instead, it transfers bit maps from system memory to VRAM using the XGA driver's bus-master capabilities. The XGA driver has a sophisticated memory-management unit, and is capable of dealing with virtual addresses. In OS/2, there is no guarantee that a given page of memory will be resident in memory at any given time. Therefore, to transfer a bit map to VRAM by way of a bus-master, you must first lock it down. Because this is time-consuming, the XGA driver locks down 64KB of system memory and performs any bus-master operations to or from VRAM using this 64KB buffer. This 64KB of memory is referred to as the Phunk (physical chunk).

The Phunk is created in eddb_CreatePhunk (located in the EDDBPHNK.C file). First, eddb_CreatePhunk allocates 64KB of memory. It then executes the FlatLock macro, from the EDDMACRO.H file. The FlatLock macro is an IOCtl to the XGA ring 0 driver. The IOCtl call takes the virtual address of the 64KB buffer, calls the Dev_Hlp VMLock, which locks the buffer into system memory, and then returns the physical address of the 64KB buffer to the caller. VMLock is limited to locking down 64KB of contiguous memory, which explains the size of the Phunk.

The PixBltThroughClipsViaPhunk() function is the primary user of the Phunk and is located in PIXBLT.C. PixBltThroughClipsViaPhunk is called by eddb_BitBlt() (in EDDNBBLT.C), which is the driver entry point for BitBlt. The XGA driver uses the Phunk whenever the source bit map is in system memory and the destination is in video memory, or whenever the source bit map is in video memory and the destination is in system memory. In other words, if the Blt involves a transfer between video memory and system memory, it goes through the Phunk. PixBltThroughClipsViaPhunk() breaks the transfer into pieces of at most 64KB and performs clipping on the destination by enumerating through the clipping rectangles. It then calls whichever low-level blt function has been set up by eddb_BitBlt(). This process is repeated for each 64KB piece of the bit map.
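The splitting step can be illustrated with a short C sketch. It assumes that the transfer is cut on whole-scanline boundaries, which the text does not state; for_each_phunk_band() and the BANDFN callback are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define PHUNK_SIZE 65536u   /* size of the Phunk in bytes, from the text */

/* Hypothetical callback: transfers 'rows' scanlines starting at 'firstRow'. */
typedef void (*BANDFN)(unsigned firstRow, unsigned rows, void *ctx);

/* Splits a bit map of 'height' scanlines, 'pitch' bytes each, into bands
 * that each fit in the Phunk. Returns the number of bands produced. */
static unsigned for_each_phunk_band(unsigned pitch, unsigned height,
                                    BANDFN fn, void *ctx)
{
    unsigned rowsPerBand, row, bands = 0;

    assert(pitch > 0 && pitch <= PHUNK_SIZE);
    rowsPerBand = PHUNK_SIZE / pitch;       /* whole scanlines only */

    for (row = 0; row < height; row += rowsPerBand) {
        unsigned rows = height - row;       /* scanlines still to transfer */
        if (rows > rowsPerBand)
            rows = rowsPerBand;
        if (fn != NULL)
            fn(row, rows, ctx);             /* copy to Phunk, clip, then blt */
        bands++;
    }
    return bands;
}
```

For example, a 1024-byte-wide, 768-line bit map yields 12 bands of 64 scanlines each under this assumption.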

Cache Management

The XGA presentation display driver caches bit maps, fonts, line-patterns, markers, and dithered patterns for BitBlt in off-screen VRAM. Off-screen VRAM allocation is divided between two files, namely EDDEVRAM.C and EDDNCACH.C. The EDDEVRAM.C file stores various patterns and markers into off-screen memory. The EDDNCACH.C file contains the code for maintaining the font and bit-map caches.

The XGA presentation driver is able to cache small bit maps in off-screen VRAM. It limits the size of these bit maps to 1600 bytes (refer to EDDNCACH.H), which is large enough for a 40 x 40 pel icon at 8 bits-per-pel, but not large enough for a 32 x 32 pel icon at 16 bits-per-pel. After the XGA presentation display driver reserves off-screen video memory for the software cursor, the font cache, patterns, and so forth, it uses what remains for caching small bit maps. The bit map cache is created by the initialise_bm_cache() function. This cache takes the remaining off-screen VRAM, divides it into 1600-byte chunks, and creates an array of structures of type BMCACHE with that number (the remaining off-screen VRAM divided by 1600) of elements. Bit maps either fit in the cache, or they do not fit in the cache. If they fit in the cache, they take the entire 1600 bytes of cache. For example, a 32 x 32 monochrome bit map, if cached, occupies a 1600 byte slot in the cache even though it is only 128 bytes. The only variable is the number of bit maps that may be cached, and that number is determined at system startup.
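The arithmetic behind these figures can be checked with a small C sketch. It assumes tightly packed scanlines rounded up to whole bytes; the helper names are hypothetical, and only VRAM_BM_CACHE_SIZE appears in the driver code quoted in this section.

```c
#include <assert.h>

#define VRAM_BM_CACHE_SIZE 1600u    /* slot size in bytes, from the text */

/* Bytes needed for a bit map, assuming scanlines are padded to whole bytes. */
static unsigned long bitmap_bytes(unsigned width, unsigned height, unsigned bpp)
{
    unsigned long bytesPerRow;

    assert(bpp > 0);
    bytesPerRow = ((unsigned long)width * bpp + 7u) / 8u;
    return bytesPerRow * height;
}

/* A bit map is cacheable only if it fits in a single fixed-size slot. */
static int fits_in_cache_slot(unsigned width, unsigned height, unsigned bpp)
{
    return bitmap_bytes(width, height, bpp) <= VRAM_BM_CACHE_SIZE;
}

/* Number of slots (array elements) for a given amount of free off-screen VRAM. */
static unsigned long cache_slot_count(unsigned long freeOffscreenBytes)
{
    return freeOffscreenBytes / VRAM_BM_CACHE_SIZE;
}
```

A 40 x 40 bit map at 8 bits-per-pel needs exactly 1600 bytes and fits; a 32 x 32 bit map at 16 bits-per-pel needs 2048 bytes and does not; a 32 x 32 monochrome bit map needs 128 bytes but still consumes a whole slot.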

The following code in PixBltThroughClipsViaPhunk() (PIXBLT.C) determines which bit maps are cached:

if ( !fSeamlessActive &&
     !AIxfer.fStretching &&              /* <DCRTURBO> */
     (AIxfer.pbmhSrc->BMPhys == NULL) &&
     (AIxfer.pbmhDest->bitmap == NULL) &&
     (ShadowXGARegs.ColCompCond != COLCOMP_SRC_NOT_EQUAL) &&
     !UsePaletteMapping &&
     AIxfer.pbmhSrc != &DrawBitsBMHeader )
{
  if ( AIxfer.pbmhSrc->BMSize <= VRAM_BM_CACHE_SIZE )
  {
    if ( cache_bitmap(AIxfer.pbmhSrc) )
    {
       //Caching the bit map may have corrupted some of our Color and Mix
       //Registers.  Restore them before calling PixBltThroughClips().
       TransferShadowRegisters( TSR_COLOUR_MIX );
       PixBltThroughClips();
       return;
    }
  }
}

If cache_bitmap() is able to cache the bit map, PixBltThroughClipsViaPhunk() can call PixBltThroughClips because both the source and destination for the BitBlt are in video memory. PixBltThroughClips enumerates through the clipping rectangles (if any), and performs a screen-to-screen BitBlt.

cache_bitmap() always returns TRUE. It does not check the size of the bit map, which is why the preceding code compares BMSize against VRAM_BM_CACHE_SIZE before calling it. cache_bitmap() looks for an empty slot in the cache. If it is unable to find one, it calls evicted_cache_slot(), which evicts a bit map from the cache. The following is the code for evicted_cache_slot():

ULONG evicted_cache_slot(VOID)

   /**********************************************************************/
   /* Evicts a bit map from the cache and returns its slot number for    */
   /* use by a new bit map.                                              */
   /*                                                                    */
   /**********************************************************************/

{

  evict_cached_bitmap(next_eviction);

  if ( next_eviction == 0 )
  {
    next_eviction = max_cached_bitmaps - 1;
    return(0);
  }
  else
  {
    return(next_eviction--);
  }

} /* evicted_cache_slot */

The eviction scheme uses round-robin scheduling when evicting bit maps from the cache. It is possible that the bit map evicted will be the one just allocated on the previous call. This could lead to a certain amount of cache thrashing.
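To make the eviction order concrete, the routine can be exercised in isolation. The sketch below copies the logic of evicted_cache_slot() but makes several assumptions: the cache holds four slots, next_eviction starts at the last slot, and evict_cached_bitmap() is replaced by a stub that only counts evictions.

```c
#include <assert.h>

#define MAX_CACHED_BITMAPS 4u   /* hypothetical; the real value is set at startup */

static unsigned long next_eviction = MAX_CACHED_BITMAPS - 1;  /* assumed start */
static unsigned long evictions[MAX_CACHED_BITMAPS];           /* per-slot count */

/* Stub standing in for the driver's eviction routine. */
static void evict_cached_bitmap(unsigned long slot)
{
    assert(slot < MAX_CACHED_BITMAPS);
    evictions[slot]++;
}

/* Same control flow as the driver routine shown above. */
static unsigned long evicted_cache_slot(void)
{
    evict_cached_bitmap(next_eviction);

    if (next_eviction == 0) {
        next_eviction = MAX_CACHED_BITMAPS - 1;   /* wrap to the last slot */
        return 0;
    }
    return next_eviction--;                       /* return, then step down */
}
```

Under these assumptions, successive calls return slots 3, 2, 1, 0, 3, and so on. The victim is chosen by position alone, with no regard to how recently a slot was filled or used.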

The font cache does not use a fixed number of fixed-size slots for glyphs. Instead, it packs glyphs together and limits the number of fonts that may be cached to a maximum of 10. At initialization time, an array of FONTCACHEINFO structures is allocated in CreateFontCache() (in the EDDNCACH.C file). FONTCACHEINFO is defined in EDDTYPET.H, and is as follows:

   /*********************************************************************/
   /* Type definition for FontCacheInfo                                  */
   /**********************************************************************/

#define MAX_NUM_CODEPOINTS 256

typedef struct _FONTCACHEINFO { /* fci */
        FOCAMETRICS     fmFontMetrics;
        USHORT          usHashMetrics;
        USHORT          usUsageCount;
        USHORT          usFontID;
        USHORT          usCodePage;
        ULONG           aulCachedCharOffset[MAX_NUM_CODEPOINTS];
#ifdef FULL_ADDRESS
        PVOID           apCharDef[MAX_NUM_CODEPOINTS];
#else /* ndef FULL_ADDRESS */
        USHORT          apCharDef[MAX_NUM_CODEPOINTS];
#endif /* ndef FULL_ADDRESS */

   /**********************************************************************/
   /* Defect 75206. 800 x 600 x 16bit in 1MB VRAM does not leave enough  */
   /* room to cache large AVio fonts in one plane.  For this resolution, */
   /* reduce the number of cached fonts to 8 and allow them to wrap      */
   /* (once) to plane index + 8.  For each character, ausPlaneOffset is  */
   /* set to:                                                            */
   /* 0 - in plane 0 through 7                                           */
   /* 1 - in plane 8 through 15                                          */
   /* -1 - not in 16-bit color, ignore                                   */
   /**********************************************************************/

#ifdef  _8514
        USHORT ausPlaneOffset[MAX_NUM_CODEPOINTS];
#endif
    } FONTCACHEINFO;

typedef FONTCACHEINFO * PFONTCACHEINFO;

As the driver draws text strings, it determines whether the font is already in the cache. If not, it claims an entry in the array of FONTCACHEINFO structures. It then checks each character in the string to be drawn and caches any character that is not yet cached (that is, whose entry in the aulCachedCharOffset array for that codepoint is NULL). If the 64KB cache overflows, the entire cache is flushed, and all of the characters for the string are cached again. The details of this process are in the EDDNGCGS.C, EDDHGCHS.ASM, and EDDNCACH.C files.
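The flush-on-overflow policy can be sketched as follows. This is a simplified model, not the driver's implementation: it tracks a single font, gives every glyph the same size, and reserves offset 0 to mean "not cached". GLYPHCACHE and all function names are hypothetical; only MAX_NUM_CODEPOINTS and the 64KB cache size come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NUM_CODEPOINTS 256
#define FONT_CACHE_BYTES   65536ul          /* 64KB, from the text */

typedef struct {
    unsigned long offset[MAX_NUM_CODEPOINTS];   /* 0 means "not cached" */
    unsigned long nextFree;                     /* next free byte in the cache */
    unsigned      flushes;                      /* number of flushes so far */
} GLYPHCACHE;

/* Empties the cache; also used for first-time initialization. */
static void glyph_cache_flush(GLYPHCACHE *c)
{
    memset(c->offset, 0, sizeof c->offset);
    c->nextFree = 1;                /* offset 0 is reserved as the empty mark */
    c->flushes++;
}

/* Ensures one glyph is cached. Returns 1 on success, 0 if there is no room. */
static int cache_glyph(GLYPHCACHE *c, unsigned char cp, unsigned long size)
{
    if (c->offset[cp] != 0)
        return 1;                               /* already cached */
    if (c->nextFree + size > FONT_CACHE_BYTES)
        return 0;                               /* would overflow */
    c->offset[cp] = c->nextFree;                /* pack behind the last glyph */
    c->nextFree += size;
    return 1;
}

/* Caches every character of a string; on overflow, flushes once and retries. */
static int cache_string(GLYPHCACHE *c, const unsigned char *s, size_t n,
                        unsigned long glyphSize)
{
    int attempt;

    for (attempt = 0; attempt < 2; attempt++) {
        size_t i = 0;
        while (i < n && cache_glyph(c, s[i], glyphSize))
            i++;
        if (i == n)
            return 1;               /* whole string is now cached */
        glyph_cache_flush(c);       /* discard everything and start over */
    }
    return 0;                       /* string does not fit even when empty */
}
```

The retry after the flush is what guarantees that every character of the current string ends up in the cache, at the cost of discarding glyphs that earlier strings had cached.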

The 8514/A Driver

Although the XGA is a functional superset of the 8514/A, the devices have the following differences.

8514/A                                        XGA
Drawing engine registers use I/O ports        Registers are memory-mapped
References objects by Cartesian coordinates   References objects in VRAM by address using drawing commands
VRAM is arranged into bit planes              VRAM is not arranged into bit planes

Because of the way the XGA driver's bit-map drawing code is designed, the 8514/A driver code contains references throughout to XGA registers that the 8514/A does not possess.

IFDEF _8514

This section highlights the most significant differences between the XGA and the 8514/A 32-bit presentation drivers, especially those that relate to the S3 driver. (The S3 chip is a streamlined 8514/A design with a VGA chip core.)

The 8514/A register set is different from the XGA register set. In the original XGA driver, it was difficult to determine what to change when adapting the driver to another chip because so much of the code was identical for hard-draw and soft-draw modes. For example, if you went into a file such as EDDHBBLT.ASM and started replacing memregwrite macro invocations with the code needed for a particular video chip, you had to be careful not to break the bit-map drawing code. The EDDHBBLT.ASM file is assembled twice, once for drawing to the display, and once for drawing to bit maps. The bit-map drawing code for the 8514/A driver emulates an XGA, while the display code works with the 8514/A hardware. As a result, files such as EDDHBBLT.ASM tend to have the following structure:

IFDEF HARD_DRAW
 DoSomething   equ    _eddh_DoSomething  ;do something to the display
ELSE    ;SOFT_DRAW
 DoSomething   equ    _eddf_DoSomething  ;do something to a memory bit map
ENDIF

DoSomething     proc near
;       do some generic setup common to both
        memregwrite
        memregread
        memregwrite
;       etc., etc.

IFDEF HARD_DRAW
IFDEF _8514
;       8514/A specific setup
        .
        .
        .
ENDIF   ;_8514
ENDIF   ;HARD_DRAW

;       more setup common to both
        memregwrite
        memregwrite
        memregwrite
        memregwrite
        .
        .
        .
        memregwrite     pixel_op, ...

IFDEF   HARD_DRAW
IFDEF   _8514
        outwq   X1, ax
        outwq   X2, bx
        .
        .
        .
        outwq   CMD_FLAGS, ...  ;do the 8514/A command
ENDIF   ;_8514
ELSE    ;SOFT_DRAW
        saveregs
        call    _eddf_MESS
        restoreregs
ENDIF   ;SOFT_DRAW

        ret
DoSomething     endp

Shadow8514Regs Structure

The 8514/A driver also uses a data structure to hold register values that will be written to the hardware. In the XGA driver, this structure was called ShadowXGARegs. In the 8514/A driver, it is called the Shadow8514Regs structure. The definition for the Shadow8514Regs is located in the XGAADAPT.H file.

typedef struct _MM8514Reg
    {

   /**********************************************************************/
   /* DRAWING CONTROL REGISTERS                                          */
   /**********************************************************************/

       /**************************************************************/
       /* X0 and Y0 are the current position values used as the      */
       /* starting point for all drawing operations.  Most of the    */
       /* drawing ops update this point during executions.  Any      */
       /* attempt to read these values while drawing could result    */
       /* in meaningless data.                                       */
       /**************************************************************/

      volatile USHORT X0;       /* 86e8 r/w */
      volatile USHORT Y0;       /* 82e8 r/w */

       /**************************************************************/
       /* The following two registers are dual purpose registers     */
       /* that are used for both Line drawing and BitBlt.            */
       /*                                                            */
       /* When used during a Blt, X1 and Y1 specify the target       */
       /* rectangle coordinates.                                     */
       /*                                                            */
       /* When used during a Line drawing op, K1 and K2 specify the  */
       /* axial step constant and the diagonal step constant,        */
       /* respectively. These values can be calculated as follows:   */
       /*                                                            */
       /* K1 = 2 * (minor axis delta)                                */
       /* K2 = [2 * (minor axis delta)] - [2 * (major axis delta)]   */
       /**************************************************************/

        volatile USHORT  X1;                /*  8ee8   w */
        volatile USHORT  Y1;                /*  8ae8   w */

        volatile USHORT  K1;                /*  8ae8   w */
        volatile USHORT  K2;                /*  8ee8   w */

    /*********************************************************************/
    /* Error Term is used during line drawing.  The error term           */
    /* is calculated as:                                                 */
    /*                                                                   */
    /* Err_Term = [2 * (minor axis delta)] - (major axis delta) - fixup  */
    /*********************************************************************/

        volatile USHORT Err_Term; /* 92e8 r/w */

    /*********************************************************************/
    /* Note :  LY, the counter part to LX, is accessed through Index 0   */
    /* of the Multifunction Control Register, BEE8.                      */
    /*********************************************************************/

      volatile USHORT LX;                 /* 96e8    w */

    /*********************************************************************/
    /* These two registers are used to pass commands to the Display      */
    /* Processor and to check the status of the command queue.           */
    /*                                                                   */
    /* Commands that can be initiated are:                               */
    /*                                                                   */
    /* 000 No Operation Performed                                        */
    /* 001 Line Draw                                                     */
    /* 010 Fast-Fill Rectangle                                           */
    /* 011 Fill Rectangle Vertically (#1)                                */
    /* 100 Fill Rectangle Vertically (#2)                                */
    /* 101 Draw Line for Area Fill                                       */
    /* 110 Copy Rectangle                                                */
    /* 111 reserved                                                      */
    /*                                                                   */
    /* Before initiating a command, all of the attributes registers      */
    /* necessary for the primitive should be set up first.               */
    /*                                                                   */
    /* The QStatus register indicates whether or not a command is        */
    /* currently being executed and how many of the eight slots in the   */
    /* queue are currently used.                                         */
    /*                                                                   */
    /* Queue State Value Meaning                                         */
    /* ----------------- -------                                         */
    /* 00000000 Queue Empty                                              */
    /* 00000001 7 Entries available                                      */
    /* 00000011 6 Entries available                                      */
    /* 00000111 5 Entries available                                      */
    /* 00001111 4 Entries available                                      */
    /* 00011111 3 Entries available                                      */
    /* 00111111 2 Entries available                                      */
    /* 01111111 1 Entries available                                      */
    /* 11111111 Queue Full                                               */
    /*********************************************************************/

        volatile CMDFLAG Cmd_Flags;         /*  9ae8   w */
        volatile QSTATUS QStatus;           /*  9ae8   r */

    /*********************************************************************/
    /* Attribute Registers                                               */
    /*********************************************************************/

        volatile SSTROKE ShortStroke;       /*  9ee8   w */

        #ifndef   BPP24
        volatile USHORT  Color_0;           /*  a2e8   w */
        volatile USHORT  Color_1;           /*  a6e8   w */

        volatile USHORT  Write_Enable;      /*  aae8   w */
        volatile USHORT  Read_Enable;       /*  aee8   w */

        volatile USHORT  Color_Comp;        /*  b2e8   w */
        #else
        volatile ULONG  Color_0;            /*  a2e8   w */
        volatile ULONG  Color_1;            /*  a6e8   w */

        volatile ULONG  Write_Enable;       /*  aae8   w */
        volatile ULONG  Read_Enable;        /*  aee8   w */

        volatile ULONG  Color_Comp;         /*  b2e8   w */
        # endif

        volatile MIX  Function_0;           /*  b6e8   w */
        volatile MIX  Function_1;           /*  bae8   w */

    /*********************************************************************/
    /* The register at BEE8 is a multifunction control register for      */
    /* drawing operations.  The different functions are differentiated   */
    /* by an index value in the high four bits.                          */
    /*********************************************************************/

        volatile USHORT  LY;                /*  bee8   w Index 0  */
        volatile USHORT  YMin;              /*  bee8   w Index 1  */
        volatile USHORT  XMin;              /*  bee8   w Index 2  */
        volatile USHORT  YMax;              /*  bee8   w Index 3  */
        volatile USHORT  XMax;              /*  bee8   w Index 4  */
        volatile USHORT  Config;            /*  bee8   w Index 5  */
        volatile USHORT  Pattern_0;         /*  bee8   w Index 8  */
        volatile USHORT  Pattern_1;         /*  bee8   w Index 9  */
        volatile PIXMODE Mode;              /*  bee8   w Index A  */

    /*********************************************************************/
    /* Color_0_Wait is the only mechanism for moving data to and from    */
    /* the 8514's VRAM.  The register is capable of dealing with Pixel   */
    /* mode and Planar mode color data.  The type used is determined by  */
    /* Config version of the Multifunction Control Register, BEE8.       */
    /*********************************************************************/

        volatile USHORT  Color_0_Wait;      /*  e2e8  r/w */

    /*********************************************************************/
    /* When read, this register is used to query information about the   */
    /* graphics processor.  Information available is whether or not...   */
    /*                                                                   */
    /* 1.) The graphics processor is idle.                               */
    /* 2.) A command was written to a full queue.                        */
    /* 3.) The Color_0_Wait register was read without any data being     */ 
    /*     available to read.                                            */
    /* 4.) A write to a pixel within the clipping rectangle is about     */ 
    /*     to be made.                                                   */
    /*                                                                   */
    /* When written to, this register is used to enable and/or reset     */
    /* any of the above registers.                                       */
    /*********************************************************************/

//        volatile STATCTL Control;            /*  42e8   w */ 
//        volatile STATCTL Status;             /*  42e8   r */ 
        volatile USHORT  Control;            /*  42e8   w   */ 
        volatile USHORT  Status;             /*  42e8   r   */ 

    /*********************************************************************/
    /* The register selects which 4KB page of the 32KB ROM               */
    /* on-board the IBM 8514/A is mapped to memory.  This only applies   */
    /* to IBM* Micro Channel* computers that use the power-on self test. */
    /* We should be able to ignore this register for other computers.    */
    /* However, we must be careful not to cause a conflict with VGA ROMs.*/
    /*********************************************************************/

        volatile USHORT  Prom_Page;         /*  46e8   w */

    /*********************************************************************/
    /* The two main purposes of this register are to set the clock       */
    /* speed, 25.175 MHz or 44.9 MHz, and to enable or disable the       */
    /* VGA pass-thru feature of the 8514/A.                              */ 
    /*********************************************************************/

        volatile MISCIO  Misc_IO;           /*  4ae8   w  */

    /*********************************************************************/
    /* Start extra pixmap definitions here. One each for A, B, C and the */
    /* mask. The arrangement of these is an exact image of the real      */
    /* registers. (As a result, the padding was included).               */
    /* Note that these 'shadow' registers are always in memory so        */
    /* need not be declared as volatile.                                 */
    /*********************************************************************/

        USHORT  OpDim1;
        USHORT  OpDim2;

        USHORT  MaskXOffset;
        USHORT  MaskYOffset;

        USHORT  SrcXAddr;
        USHORT  SrcYAddr;

        USHORT  PatXAddr;
        USHORT  PatYAddr;

        SHORT    DstXAddr;
        SHORT    DstYAddr;

        ULONG       PixOp;
        BYTE        bFifthStep;   /* used by software simulation only */

        BYTE    pixmapIndexA;
        BYTE    bPaddingA;
        ULONG   pixmapBaseA;
        USHORT  pixmapWidthA;
        USHORT  pixmapHeightA;
        BYTE    pixmapFormatA;
        BYTE    bPaddingA2;

        BYTE    pixmapIndexB;
        BYTE    bPaddingB;
        ULONG   pixmapBaseB;
        USHORT  pixmapWidthB;
        USHORT  pixmapHeightB;
        BYTE    pixmapFormatB;
        BYTE    bPaddingB2;

        BYTE    pixmapIndexC;
        BYTE    bPaddingC;
        ULONG   pixmapBaseC;
        USHORT  pixmapWidthC;
        USHORT  pixmapHeightC;
        USHORT  pixmapFormatC;
        BYTE    bPaddingC2;

        BYTE    pixmapIndexM;
        BYTE    bPaddingM;
        ULONG   pixmapBaseM;
        USHORT  pixmapWidthM;
        USHORT  pixmapHeightM;
        BYTE    pixmapFormatM;
        BYTE    bPaddingM2;
   } MM8514Reg;

typedef MM8514Reg FAR * pMM8514Reg;
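The line-drawing constants described in the structure comments (K1, K2, and Err_Term) can be computed as in the following C sketch. It assumes the standard Bresenham form, in which K2 is the difference of the two doubled deltas and the error term is the doubled minor delta less the major delta and the fixup. LINEPARAMS and line_params() are hypothetical names, and the fixup value is passed in by the caller because the text does not define it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the values loaded into K1, K2 and Err_Term. */
typedef struct {
    int k1;         /* axial step constant    */
    int k2;         /* diagonal step constant */
    int errTerm;    /* initial error term     */
} LINEPARAMS;

static LINEPARAMS line_params(int x0, int y0, int x1, int y1, int fixup)
{
    int dx = abs(x1 - x0);
    int dy = abs(y1 - y0);
    int major = (dx >= dy) ? dx : dy;   /* longer of the two deltas  */
    int minor = (dx >= dy) ? dy : dx;   /* shorter of the two deltas */
    LINEPARAMS p;

    assert(fixup == 0 || fixup == 1);   /* assumption: fixup is a 0/1 adjustment */

    p.k1      = 2 * minor;
    p.k2      = 2 * minor - 2 * major;
    p.errTerm = 2 * minor - major - fixup;
    return p;
}
```

For a line from (0,0) to (10,3) with a fixup of 0, the major delta is 10 and the minor delta is 3, giving K1 = 6, K2 = -14, and an initial error term of -4.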

Roughly the top three-quarters of this structure corresponds to 8514/A registers. However, immediately after the volatile MISCIO Misc_IO field, notice the definitions of several XGA registers. These have to exist because of the bit-map drawing code in eddf_MESS(). The instance of this structure that holds the shadow registers is named Shadow8514Regs in the 8514/A driver. The following is an example of conditionally compiled code:

       /**************************************************************/
       /* set the foreground and background colors for the blt.      */
       /* These are taken from the target attribute bundle           */
       /* unless colour information was passed in the parameters     */
       /**************************************************************/

#ifndef   _8514
if (ArgOptions & BLTMODE_ATTRS_PRES)
{
    ShadowXGARegs.FgCol = (USHORT)LogToPhyIndex(ArgAttrs->lColor);
    ShadowXGARegs.BgCol = (USHORT)LogToPhyIndex(ArgAttrs->lBackColor);
}
else /* use attribute bundle colours */
{
    ShadowXGARegs.FgCol = (USHORT)pdc->DCIImagColatts.ForeColor;
    ShadowXGARegs.BgCol = (USHORT)pdc->DCIImagColatts.BackColor;
}
#else

       /**************************************************************/
       /* set the foreground and background colors for the blt.      */
       /* These are taken from the target attribute bundle           */
       /* unless color information was passed in the parameters      */
       /**************************************************************/

if (ArgOptions & BLTMODE_ATTRS_PRES)
{
    Shadow8514Regs.Color_1 = LogToPhyIndex(ArgAttrs->lColor);
    Shadow8514Regs.Color_0 = LogToPhyIndex(ArgAttrs->lBackColor);
}
else /* use attribute bundle colours */
{
    Shadow8514Regs.Color_1 = pdc->DCIImagColatts.ForeColor;
    Shadow8514Regs.Color_0 = pdc->DCIImagColatts.BackColor;
}
#endif

In addition to changing the name of the structure, the fields that corresponded closely between the two devices were renamed to match the 8514/A structure. The 8514/A and XGA support the same binary raster operations, although they are encoded differently. The MESS code understands 8514/A raster operations, except for the line-drawing code, which still uses XGA raster operations. The MESS code does not understand 8514/A commands, however. Thus, the 8514/A driver code must set up XGA pixel operations because the MESS code depends on them. Also, parts of the 8514/A hardware-drawing code read the XGA pel operation that is set up by the higher-level portions of the driver and set the appropriate bits in the 8514/A command register. Blt directions for overlapping BitBlts are handled in this manner.

The result of this process is that the bit-map drawing code, which should be relatively device-independent, is dependent on both 8514/A and XGA hardware at the same time. The MESS code is also dependent on certain fields in the bit-map header that really should be reserved for the hardware-drawing code. In particular, the MESS code assumes that the hw_width field (the width of the bit map in hardware) is one pel less than the width of the bit map. The 8514/A and XGA express widths and heights in 0-based units: 0 means 1, 1 means 2, and so on. Not all devices work this way.

TransferShadowRegisters

Although much of the C code in the driver references 8514/A or XGA registers in the Shadow8514Regs structure, only a few pieces of code in the driver actually write to 8514/A registers. Of these, the function TransferShadowRegisters(), in the HWACCESS.ASM file, is one of the most important. The TransferShadowRegisters function copies values from the shadow registers to the hardware. It does not track which values in the shadow register structure have been altered; instead, it relies on the caller to tell it which registers to update. Bits set in the argument to the TransferShadowRegisters function determine which registers are copied from the shadow registers to the hardware.

Unlike the XGA driver, which occasionally sets the bit TSR_PIXELOP (and initiates an XGA drawing operation), the 8514/A driver never initiates a drawing operation by way of the TransferShadowRegisters function. Indeed, most of the time TransferShadowRegisters is called with an argument of TSR_COLOUR_MIX. This argument sets the 8514/A foreground and background color, the foreground and background raster operation, color compare, hardware pattern, and bit-map format for the source. It also determines whether the source is VRAM, a color register, or data from the pel port. The 8514/A driver also uses the TSR_COORDINATES argument to the TransferShadowRegisters function. This sets up the starting X-Y coordinate for the drawing operation, as well as the extents of the operation.
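The caller-driven update scheme can be sketched in C as follows. The flag names come from the text, but their bit values, the reduced REGS structure, and the in-memory "hardware" stand-in are assumptions made for illustration; the real routine is written in assembler and writes to I/O ports.

```c
#include <assert.h>

/* Flag names from the text; the bit values are hypothetical. */
#define TSR_COLOUR_MIX   0x0001u
#define TSR_COORDINATES  0x0002u

/* Reduced register set, for illustration only. */
typedef struct {
    unsigned short Color_0, Color_1;    /* background and foreground color */
    unsigned short X0, Y0;              /* starting coordinate             */
} REGS;

static REGS shadow;     /* written freely by the C code           */
static REGS hardware;   /* stand-in for the real device registers */

/* Copies only the register groups that the caller says have changed. */
static void transfer_shadow_registers(unsigned flags)
{
    assert((flags & ~(TSR_COLOUR_MIX | TSR_COORDINATES)) == 0);

    if (flags & TSR_COLOUR_MIX) {
        hardware.Color_0 = shadow.Color_0;
        hardware.Color_1 = shadow.Color_1;
    }
    if (flags & TSR_COORDINATES) {
        hardware.X0 = shadow.X0;
        hardware.Y0 = shadow.Y0;
    }
}
```

Leaving the bookkeeping to the caller keeps the routine simple and avoids unnecessary port writes, but it also means that a forgotten flag leaves the hardware holding stale values.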

Bit-Map Addresses and the VRAM Marker

One fundamental difference between the XGA and 8514/A drivers is that the XGA refers to objects in VRAM by their address as a pixmap and the 8514/A refers to them by an X-Y coordinate. Further, monochrome data on the 8514/A can be stored not only by an X-Y coordinate but by planes as well. For instance, at 8 bits-per-pel, a 20 x 20 area of the display could hold one 20 x 20 8-bit-per-pel bit map, or eight 20 x 20 1-bit-per-pel bit maps.

Another difference between the XGA and 8514/A drivers is that on the XGA display driver, a pixmap may have any pitch at all, while on the 8514/A, the pitch of a rectangle in VRAM is always the same as the length of a scanline. In effect, the 8514/A has only one pixmap that occupies all of VRAM.

Because the code inherited from the XGA driver uses memory addresses to refer to objects in VRAM, the 8514/A driver refers to them by an artificial VRAM address that is created from the X-Y coordinates. The address is then marked so that it can be distinguished from the address of a system-memory bit map. The address of a bit map in VRAM for the original 8514/A driver is as follows:

address = 0f0000000h or (y-coord * 1024 + x-coord)

The 0f0000000h is a special VRAM marker. Any lower-level routine that can work with either system-memory or VRAM bit maps looks at the bit-map address; if the high-order nibble is 0fh, the address refers to VRAM. To convert the address back to X-Y coordinates, the calc8514xy macro is called. This macro logically ANDs the address with 0fffffffh and divides by 1024; the quotient is the Y coordinate and the remainder is the X coordinate.

The S3 driver creates addresses in VRAM out of X-Y coordinates as follows:

address = 0f0000000h or (y-coord * 65536 + x-coord)

This is a more efficient computation for the assembler-language portions of the driver to perform.
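Both encodings can be written out in C to show why the S3 form is cheaper to decode. The formulas and the 0f0000000h marker come from the text; the function names and the range checks are hypothetical and serve only this example.

```c
#include <assert.h>
#include <stdint.h>

#define VRAM_MARKER 0xF0000000u     /* high-order nibble 0fh marks VRAM */
#define VRAM_MASK   0x0FFFFFFFu     /* strips the marker                */

/* Nonzero if the address carries the VRAM marker. */
static int is_vram_address(uint32_t addr)
{
    return (addr & 0xF0000000u) == VRAM_MARKER;
}

/* Original 8514/A scheme: address = marker | (y * 1024 + x). */
static uint32_t vram_addr_8514(uint32_t x, uint32_t y)
{
    assert(x < 1024u);
    return VRAM_MARKER | (y * 1024u + x);
}

static void vram_xy_8514(uint32_t addr, uint32_t *x, uint32_t *y)
{
    addr &= VRAM_MASK;
    *y = addr / 1024u;              /* quotient is the Y coordinate  */
    *x = addr % 1024u;              /* remainder is the X coordinate */
}

/* S3 scheme: address = marker | (y * 65536 + x). */
static uint32_t vram_addr_s3(uint32_t x, uint32_t y)
{
    assert(x < 65536u && y < 4096u);    /* y must stay clear of the marker */
    return VRAM_MARKER | (y * 65536u + x);
}

static void vram_xy_s3(uint32_t addr, uint32_t *x, uint32_t *y)
{
    *x = addr & 0xFFFFu;                /* low word is X                   */
    *y = (addr & VRAM_MASK) >> 16;      /* high word, less marker, is Y    */
}
```

With a multiplier of 65536, X and Y occupy separate 16-bit halves of the address, so decoding needs only a mask and a shift rather than a division.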

The Phunk

When the XGA driver has to perform a BitBlt from system memory to VRAM, it copies up to 64KB of the source into the Phunk, and then completes a BitBlt from the Phunk to the screen by way of a bus-master. The 8514/A does not have the ability to perform bus-master transfers. The 8514/A driver still uses the Phunk, however. For the 8514/A driver, the Phunk is a 64KB memory buffer that is not locked down. The 8514/A driver copies up to 64KB of source data into the Phunk, by way of eddf_MESS(), and then copies from the Phunk to VRAM in software through the 8514/A pel port. For the 8514/A, the major impact of this process on the low-level routines of the driver is that these routines have to deal with a source or destination that is in system memory, whereas the XGA low-level routines do not. (Refer to BitBlt, which describes the uses of the Phunk in more detail.)

The S3 Driver

The S3 driver differs from the 8514/A driver in its support of 16- and 24-bit-per-pel modes, multimedia escapes, and other changes that are required because of the design of S3 chips.

Driver Initialization

The driver DLL loadproc is located in the DYNA32.ASM file. The loadproc is called once, when the driver module is loaded, and calls only XGA_DLLInit. Also located in the DYNA32.ASM file is haltproc(), which is a routine that consists of an INT 3. It can be called from C code and may be useful for debugging (although _asm {INT 3} is capable of performing the same function).

XGA_DLLInit saves the driver module handle, and then creates the driver semaphore. The creation of the semaphore occurs in the SEAMLESS.C file, in the SeamlessInitSem() function. The semaphore is a fast-safe RAM semaphore that is used by the driver to serialize access to the display hardware. This semaphore prevents contention between the seamless windows driver and the presentation display driver, and between the various threads of GRE that may attempt to enter the driver simultaneously.

Initialization

The first hardware-dependent task in a PM driver is to initialize the video hardware and set it into a graphics mode. This is slightly more complicated with the S3 driver than with the SVGA driver (as an example) because the S3 driver supports multiple resolutions and color depths. The following is the logical sequence of functions in the driver that perform initialization, followed by commentary about them. There is more to initialization of the driver than what is shown below, but most of what is not covered is not hardware-dependent. (Refer to the OS/2 Presentation Device Driver Reference for further information.)

OS2_PM_DRV_ENABLE() - eddenabl.c EXPORT @200
FillLdb() - Fill Logical Device Block (eddefldb.c)
        if (first_time) {
        FillPdb() - Fill Physical Device Block (eddefpdb.c)
                QueryAndSelectNativeMode() - (eddesres.c)
                        Determine adapter memory size
                        SetObtainableModes() - (modeinfo.c)
                        Get_User_Mode()
                        DMS32CallBack((PFN) SwitchToChosenMode)
                        KlugeReset() -  (hwaccess.asm)
                        CacheManager() - (cacheman.c)
                SwitchToExtendedGraphicsMode() -  (setmode.c)
                        DMS32CallBack((PFN) SwitchToChosenMode)
                        KlugeReset() -  (hwaccess.asm)
                InitialiseVRAM() - (eddevram.c)
                CreateFontCache() - (eddncach.c)
                InitialiseSeamless() - (seamless.c)
        initialise_bm_cache() - (eddncach.c)
        }
        GetVRAMPointer()
        Pass back pointers to driver functions

OS2_PM_DRV_ENABLE() Exported entry, ordinal 200

This function is the main entry point for the driver. It calls one of several functions based on the subfunction that is passed. Two of the most important are subfunctions 1 (Fill Logical Device Block) and 2 (Fill Physical Device Block).
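The dispatch on the subfunction number, together with the first_time test shown in the call sequence above, can be sketched as follows. This is a simplified illustration under stated assumptions: drv_enable, fill_ldb, fill_pdb, and the return codes are hypothetical stand-ins, and only subfunctions 1 and 2 (the numbers given in the text) are modeled.

```c
/* Subfunction numbers taken from the text; everything else is illustrative. */
enum { SUBFN_FILL_LDB = 1, SUBFN_FILL_PDB = 2 };

static int first_time = 1;     /* set until the hardware has been set up */
static int pdb_fill_count;     /* counts calls, for demonstration only   */

static long fill_pdb(void)
{
    pdb_fill_count++;          /* mode selection and mode set would go here */
    return 0;
}

static long fill_ldb(void)
{
    if (first_time) {          /* one-time, hardware-dependent setup */
        first_time = 0;
        fill_pdb();
    }
    /* Per-process work (aperture pointer, table of hooked functions)
     * would follow here on every call. */
    return 0;
}

/* Hypothetical stand-in for the exported entry point. */
long drv_enable(unsigned long subfunction)
{
    switch (subfunction) {
    case SUBFN_FILL_LDB:
        return fill_ldb();
    case SUBFN_FILL_PDB:
        return fill_pdb();
    default:
        return -1;             /* not modeled in this sketch */
    }
}
```

Calling drv_enable(1) twice runs the one-time setup only once, while a direct call with subfunction 2 always reaches fill_pdb.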

FillLdb() - Fill Logical Device Block

Fill Logical Device Block is called by every process that attaches to the display driver's .DLL. This subfunction's primary responsibility is to perform any per-process initialization, and pass back a list of hooked functions to the GRE.

The first time FillLdb() is called, it calls FillPdb(), which selects and sets the video mode, and initializes the bit-map cache. After completion of this process, FillLdb() calls GetVRAMPointer() to obtain a flat pointer to the aperture. GetVRAMPointer calls PMDD.SYS to obtain the pointer. The pointer is valid only in the context of the process that called GetVRAMPointer. This is a problem: GRE is multithreaded, so multiple processes will attach to the driver's .DLL. If a process other than the one that called GetVRAMPointer() attempts to use the pointer to the aperture, the driver will generate an invalid address exception. To solve this problem, each process that attaches to the driver must have the pointer to the aperture made valid in its context. This is achieved via VMGlobalToProcess. Because FillLdb() is called each time a process attaches to the driver, FillLdb() is a convenient place to call GetVRAMPointer.

The first time the driver is entered, GetVRAMPointer calls PMDD.SYS to allocate a flat pointer to the aperture. On subsequent calls, GetVRAMPointer attaches to the existing pointer, making it valid in that particular process address space.

Finally, FillLdb() passes back a table of hooked functions to the GRE.

FillPdb() - Fill Physical Device Block

FillPdb can be called in two different ways. Initially, it is called by FillLdb(). It is also a subfunction of OS2_PM_DRV_ENABLE, and as such, it is called at other times by the GRE. The device contexts for the display driver do not require separate physical device blocks. The initial call, however, calls QueryAndSelectNativeMode() to determine which video mode to set, and then switches to that mode by way of SwitchToExtendedGraphicsMode(). After setting the mode, it calls InitialiseVRAM(), which sets up locations in off-screen VRAM for the hardware and software cursor, dithered patterns, the bit-map cache, and the font cache. Next, it calls CreateFontCache(), which sets up the font cache. Finally, it calls InitialiseSeamless to enable the driver's seamless windows capability.

QueryAndSelectNativeMode() - Find the graphics mode selected by user

This function reads the mode that has been selected from OS2.INI, and determines which modes from the asr array are valid. It then matches the user-selected mode to the modes available in the asr array. The asr array is used by QueryAndSelectNativeMode, as well as by OS2_PM_DRV_QUERYSCREENRESOLUTIONS(), which is the driver entry point used by static mode set. (See Multiple Resolution Support for more details.) The asr is an array of structures of type:

typedef struct _SCREENRESOLUTION {
   ULONG width;
   ULONG height;
   ULONG colors;
   ULONG planes;
   ULONG floptions;
} SCREENRESOLUTION;

QueryAndSelectNativeMode() initially determines the memory size of the S3 adapter. Having done this, it calls SetObtainableModes(), which reads the SVGADATA.PMI file for the S3 adapter. Then, it sets the DSP_RESOLUTION_OBTAINABLE flag (bit 6) of floptions for each resolution in the SCREENRESOLUTION table that is also found in the .PMI file. Next, QueryAndSelectNativeMode() calls Get_User_Mode, which reads OS2.INI and looks for a user-selected resolution. If it is unable to find one, QueryAndSelectNativeMode() searches the asr array for a mode marked as the default. In the case of the S3 driver, the default video mode is 640 x 480 x 8 bits-per-pel.
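The selection logic described above can be sketched as follows. This is a simplified illustration, not the driver's code. It assumes that bit 6 is counted from zero, that the default mode is marked by a flag (SKETCH_RESOLUTION_DEFAULT is a hypothetical name and bit position), and that a user request that matches no obtainable mode falls back to the default; select_mode and sample_asr are also hypothetical.

```c
#include <stddef.h>

typedef unsigned long ULONG;

typedef struct _SCREENRESOLUTION {
    ULONG width;
    ULONG height;
    ULONG colors;
    ULONG planes;
    ULONG floptions;
} SCREENRESOLUTION;

#define DSP_RESOLUTION_OBTAINABLE (1UL << 6)  /* bit 6, per the text         */
#define SKETCH_RESOLUTION_DEFAULT (1UL << 0)  /* hypothetical "default" flag */

/* Return the index of the mode to set, or -1 if the table has no default.
 * user may be NULL when no user-selected resolution was found. */
int select_mode(const SCREENRESOLUTION *asr, size_t count,
                const SCREENRESOLUTION *user)
{
    size_t i;

    if (user != NULL) {
        for (i = 0; i < count; i++) {
            if ((asr[i].floptions & DSP_RESOLUTION_OBTAINABLE) &&
                asr[i].width == user->width &&
                asr[i].height == user->height &&
                asr[i].colors == user->colors)
                return (int)i;
        }
    }
    for (i = 0; i < count; i++) {
        if (asr[i].floptions & SKETCH_RESOLUTION_DEFAULT)
            return (int)i;
    }
    return -1;
}

/* Sample table: the third entry is not marked obtainable. */
static const SCREENRESOLUTION sample_asr[] = {
    {  640, 480,   256, 1, DSP_RESOLUTION_OBTAINABLE | SKETCH_RESOLUTION_DEFAULT },
    {  800, 600,   256, 1, DSP_RESOLUTION_OBTAINABLE },
    { 1024, 768, 65536, 1, 0 },
};
```

With this table, a request for 800 x 600 x 256 selects entry 1, while a request for the unobtainable 1024 x 768 x 65536 mode, or no request at all, selects the default entry 0.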

At this point, the S3 driver calls the base video handler to set the mode and then calls KlugeReset(), which sets up the drawing engine and clears the display. (The mode is set again in FillPdb.) Next, QueryAndSelectNativeMode() searches the aResTable for a mode that matches the resolution and color depth requested by the user, and the amount of memory available on the card. When it finds a match, it sets the global variable HWResolution, and then calls CacheManager(), which then sets up pointers to hardware and cache maps. (Further information on CacheManager and the hardware and cache maps can be found in Cache Management.)

Multiple Resolution Support

The S3 driver supports multiple resolutions and color depths in a single driver. To change resolutions, the user selects the resolution from a page in the Settings notebook for the System setup object. (Making a selection is easier than installing a new driver for each resolution or color depth.) The key to this capability is a documented driver entry point at ordinal 202. In 8514_32.def, ordinal 202 is assigned to OS2_PM_DRV_QUERYSCREENRESOLUTIONS. This function is implemented in the EDDQSRES.C file.

To obtain the resolutions supported by the video driver, exported entry 202 is called by the system's Settings notebook page. This function copies back to the caller the asr[] array created in the EDDESRES.C file. Those entries that have the DSP_RESOLUTION_OBTAINABLE flag set in asr[].floptions are available for the user to select from the Settings notebook. The asr[] array is initialized in SetObtainableModes(), in the MODEINFO.C file, as part of the driver initialization process. After this array is copied back to the caller, the user chooses one of the resolutions given, and upon restarting OS/2, the driver brings up the new resolution.

The exported entry point at ordinal 202 gives a mechanism for the driver to tell the Settings notebook which resolutions it supports. By way of OS2.INI, the Settings notebook can tell the display driver which resolution was selected. Installing the S3 driver adds and modifies a number of keys in OS2.INI. The following keys are added under PM_DISPLAYDRIVERS:

Key Value
IBMS332 IBMS332
CURRENTDRIVER IBMS332
DEFAULTDRIVER IBMS332

These keys set the S3 presentation driver DLL as the current driver; this is the same procedure used for installing the SVGA driver. However, the system's Settings notebook adds an additional key called DEFAULTSYSTEMRESOLUTION. The value of this key is a data structure of type SCREENRESOLUTION, which corresponds to an entry in the asr[] array; it is set according to the resolution that the user chooses. Upon initialization, in QueryAndSelectNativeMode, the display driver calls the function Get_User_Mode() (EDDESRES.C). Get_User_Mode reads the DEFAULTSYSTEMRESOLUTION key from OS2.INI, and returns the structure. The DEFAULTSYSTEMRESOLUTION key value gives the driver the display height, width, and color depth that the user has chosen.

Changing the resolution by way of the Settings notebook also requires that the correct WINOS/2 drivers be installed (namely, by editing SYSTEM.INI). This is accomplished by some keys in OS2.INI. The following is a portion of S3.DSP (the dspinstl installation script for the S3 driver):

OS2.INI
PM_DISPLAYDRIVERS RESOLUTION_CHANGED 1
WIN_RES_640x480x16     WIN_RES_SET    WIN_RES_S3_0
WIN_RES_640x480x256    WIN_RES_SET    WIN_RES_S3_1
WIN_RES_640x480x65536  WIN_RES_SET    WIN_RES_S3_2
WIN_RES_800x600x256    WIN_RES_SET    WIN_RES_S3_3
WIN_RES_800x600x65536  WIN_RES_SET    WIN_RES_S3_4
WIN_RES_1024x768x256   WIN_RES_SET    WIN_RES_S3_5
WIN_RES_1024x768x65536 WIN_RES_SET    WIN_RES_S3_6
WIN_RES_1280x1024x256  WIN_RES_SET    WIN_RES_S3_7
WIN_RES_640x480x16777216  WIN_RES_SET    WIN_RES_S3_8
WIN_RES_S3_0   1  "system.ini boot sdisplay.drv swinvga.drv"
WIN_RES_S3_0   2  "system.ini boot display.drv vga.drv"
WIN_RES_S3_0   3  "system.ini boot fonts.fon vgasys.fon"
WIN_RES_S3_0   4  "system.ini boot fixedfon.fon vgafix.fon"
WIN_RES_S3_0   5  "system.ini boot oemfonts.fon vgaoem.fon"
WIN_RES_S3_0   6  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_0   7  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_0   8  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_0   9  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_0   10 "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_0   11 "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_0   12 "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_0   13 "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (VGA res)\"
WIN_RES_S3_0   14 "win.ini fonts \"Courier 10,12,15 (VGA res)\"
WIN_RES_S3_0   15 "win.ini fonts \"MS Serif 8,10,12,14,18,24 (VGA res)\"
WIN_RES_S3_0   16 "win.ini fonts \"Symbol 8,10,12,14,18,24 (VGA res)\"
WIN_RES_S3_0   17 "win.ini fonts \"Small Fonts (VGA res)\"

When the resolution is changed, the Settings notebook adds the RESOLUTION_CHANGED key to OS2.INI. When the presentation display driver is restarted, this flag is checked. If it is 1, OS/2 examines the WIN_RES_xxx string (for example, WIN_RES_640x480x16) that corresponds to the value of DEFAULTSYSTEMRESOLUTION. Each of these strings in OS2.INI has a single key called WIN_RES_SET, with a value that gives the name of another string in OS2.INI. For example, WIN_RES_640x480x16 names WIN_RES_S3_0, which has 17 keys. Key number 1 installs the line "sdisplay.drv=swinvga.drv" in the "boot" section of SYSTEM.INI. OS/2 iterates through these keys, and edits the appropriate sections of SYSTEM.INI or WIN.INI, based on the value of each key.
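How one of these key values maps to an .INI edit can be sketched as follows. This is an illustration only: the parsing that OS/2 actually performs is not shown in this document, INI_EDIT, parse_ini_edit, and format_ini_line are hypothetical names, and the sketch handles only the simple four-word form used by the system.ini entries (not the quoted win.ini font entries).

```c
#include <stdio.h>

/* One edit request: which file, which section, and the key=value to write. */
typedef struct {
    char file[32];
    char section[32];
    char key[64];
    char value[64];
} INI_EDIT;

/* Split a value such as "system.ini boot sdisplay.drv swinvga.drv"
 * into its four fields. Returns 1 on success, 0 if a field is missing. */
int parse_ini_edit(const char *text, INI_EDIT *out)
{
    return sscanf(text, "%31s %31s %63s %63s",
                  out->file, out->section, out->key, out->value) == 4;
}

/* Build the line that would be placed in the named section. */
int format_ini_line(const INI_EDIT *edit, char *buf, size_t len)
{
    return snprintf(buf, len, "%s=%s", edit->key, edit->value);
}
```

Applied to key number 1 of WIN_RES_S3_0, the sketch yields the file system.ini, the section boot, and the line sdisplay.drv=swinvga.drv.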

Although the S3 driver does not look at the WINOS/2 .INI files, it is possible to use the WIN_RES_XXX values to create the entries for the WIN.INI or SYSTEM.INI that are queried by the WINOS/2 driver during its initialization. For example:

WIN_RES_S3_0   18 "system.ini s3 width 640"
WIN_RES_S3_0   19 "system.ini s3 height 480"
WIN_RES_S3_0   20 "system.ini s3 bpp 16"
WIN_RES_S3_0   21 "system.ini s3 dpi 96"

Adding the above entries to the fragment of S3.DSP (S3.DSP (Sample File for Installation and Configuration)) creates the following entry in SYSTEM.INI when 640 x 480 16-color mode is selected:

[s3]
width=640
height=480
bpp=16
dpi=96

When changing the dspinstl script, refer to S3.DSP (Sample File for Installation and Configuration) for information.

Obtaining Pointers to Video Memory

The PMDD.SYS module provides an IOCtl that takes a physical address and returns a linear address in the address space of the calling process. The presentation display driver obtains a linear address from a physical address by calling GetVRAMPointer, which is implemented in the S3 driver in the EDDEFLDB.C file. The following is the source code:

ULONG   flVRAMFirst = 0;
ULONG   pVRAM = 0xffffffff;
HFILE    hSVGA;
ULONG    ulAction;
CHAR     szSQ[32] = "\\DEV\\SINGLEQ$";

/***********************************************************************
*
* FUNCTION NAME = GetVRAMPointer
*
* DESCRIPTION = Uses an IOCtl to get a pointer to VRAM.
*
* INPUT       =
* OUTPUT      =
*
* RETURN-NORMAL =
* RETURN-ERROR  =
*
***********************************************************************/

ULONG GetVRAMPointer ( ULONG ulAddr , ULONG ulSize , PULONG pulFirst ,
                       PULONG ppVRAM )
{

SCRNTX   scrnTx;
SCRNRX   scrnRx;
BOOL     fAttach;
USHORT   usAction;

   if (DosOpen( szSQ , &hSVGA , &ulAction , 0, 0, 1, 0x00c0, 0))
   {
      goto   getvram_exit;
   }

   scrnTx.stx_Address = ulAddr;
   scrnTx.stx_Size = ulSize;
   scrnTx.stx_flFlag = 0;

   if ( !*pulFirst ) {
      fAttach = 0;
      usAction = 0x007e;
      *pulFirst = 1;
   }
   else
   {
      usAction = 0x007f;
      scrnTx.stx_Address = *ppVRAM;
      fAttach = 1;
   }

   if ( !DosDevIOCtl( hSVGA, 3, usAction, &scrnTx, sizeof(SCRNTX),
                     0, &scrnRx, sizeof(SCRNRX), 0) )
   {
      if ( fAttach == 0)
      {
         *ppVRAM = scrnRx.srx_ScrnPtr;
      }
   }

   DosClose( hSVGA );

getvram_exit:
   return( 1L );

}

The preceding code is designed to be called once upon initialization of the driver, and then once by each subsequent process that needs the linear address. The linear address returned on the first invocation of GetVRAMPointer is valid only in the process that called it. GRE is multithreaded, so multiple processes use the S3 driver DLL. Each of these processes must have the linear address added to its page tables.

The PMDD.SYS module uses the DevHlp_VMALLOC to obtain the linear address to the physical memory. The pointer returned by the above code is usable anywhere in the driver except in MoveCursor32. (The pointer is unusable in MoveCursor32 because it is called at interrupt time. The pointer returned by the PMDD.SYS module exists only within the context of any process that has attached to it. The ring 0 code that calls MoveCursor32 does not have the pointer in its address space.)

To get a pointer to the aperture or memory-mapped registers that is usable from MoveCursor32, use DevHlp_PhysToGDTSel. The function GetInstanceGDT in the xgaring0.asm file of the XGA ring 0 driver gives an example of using PhysToGDTSel to get a selector:offset pointer to the XGA driver's memory-mapped registers. If you need a flat pointer that can be used by MoveCursor32, implement your own ring 0 physical device driver that uses VMALLOC to create a linear mapping in the global, system linear space. To use it, you would have to guarantee that MoveCursor32 ran at ring 2, because ring 3 processes cannot access memory in the system space. (OS/2 reserves linear addresses above 512MB for use by the operating system.)

It is also possible to get a flat pointer to the video aperture by way of an IOCtl to the SCREEN01.SYS file. The source code to the SCREEN01.SYS file is in the IBM Device Driver Source Kit for OS/2. The code to get the flat pointer is GetLinearAccess, in \DDK\SCR_S3\DEV\SCREENDD\SVGAROUT.ASM. The device name used to access SCREEN01.SYS is "SCREEN$". The IOCtl number is 0xb. The packet used for this IOCtl is the following:

GetLinear_Packet        STRUC
        PacketLength    DD     0H       ; total size of data packet
        PhysicalAddress DD     0H       ; Physical address of aperture
        ApertureSize    DD     0H       ; Size of aperture
        LinearAddress   DD     0H       ; Linear address of aperture
GetLinear_Packet        ENDS

PhysicalAddress is the physical location of the video aperture. ApertureSize is the size (in bytes) of the video aperture. The LinearAddress field returns the linear address of the aperture. The address returned is valid only within the context of the process that called the IOCtl. Also, it is not guaranteed that this IOCtl will return the same linear address for each process. Consequently, if you use this IOCtl, you must keep track of which pointer belongs to which process that is using the driver. (MATROX.C in the S3 driver has an example of how to do this.)
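One way to keep track of which pointer belongs to which process is a small table keyed by process ID. The following is a minimal sketch of that idea and is not taken from MATROX.C: aperture_for_process, map_fn, fake_map, and the table size are hypothetical, and the IOCtl itself is replaced by a callback so that the sketch is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PROCS 16               /* arbitrary limit for this sketch */

typedef struct {
    unsigned long pid;             /* process that owns the mapping      */
    void *linear;                  /* linear address valid in that process */
} APERTURE_ENTRY;

static APERTURE_ENTRY table[MAX_PROCS];
static size_t table_used;

/* Hypothetical hook that would issue the IOCtl in the calling process. */
typedef void *(*map_fn)(unsigned long pid);

/* Return the aperture pointer for pid, mapping it on first use. */
void *aperture_for_process(unsigned long pid, map_fn map_aperture)
{
    size_t i;
    void *p;

    for (i = 0; i < table_used; i++) {
        if (table[i].pid == pid)
            return table[i].linear;       /* already mapped for this process */
    }
    if (table_used == MAX_PROCS)
        return NULL;                      /* table full */
    p = map_aperture(pid);
    if (p == NULL)
        return NULL;                      /* mapping failed; record nothing */
    table[table_used].pid = pid;
    table[table_used].linear = p;
    table_used++;
    return p;
}

/* Simulated mapping: each process gets a different made-up address. */
static int map_calls;
void *fake_map(unsigned long pid)
{
    map_calls++;
    return (void *)(uintptr_t)(0x10000000UL + pid * 0x1000UL);
}
```

Each process is mapped once, and a later lookup for the same process returns the stored pointer without issuing another request.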

The XGA ring 0 driver also has an IOCtl for obtaining a flat pointer to video memory (in this case, the XGA 1MB aperture). The function flat_access can be found in the file \..\..\XGASYS20\XGARING0.ASM. This function calls VMALLOC to obtain a flat pointer to the XGA aperture. The IOCtl function code for flat_access is 0x14. An example of how to call this particular IOCtl can be found in the S3 driver in MATROX.C.

For accessing flat data at ring 3, use selector 53 hex. (This always applies to code that is not running as part of the system at ring 0.) If a protection fault (Trap d) occurs in the driver, check the segment register being used to access the data. A segment register value other than 0x53 could be part of the cause of the trap.

BitBlt

Unlike the SVGA driver, the S3 driver does not compile BitBlts onto the stack. Instead, eddb_BitBlt analyzes the raster operator and various source and destination options, and breaks the blt down into simpler cases. Each of these cases can take advantage of the graphics accelerator hardware. All BitBlts are broken down into combinations of the following four cases:

eddh_SrcDestBlt
eddh_PatDestBlt
eddh_DestOnlyBlt
Software BitBlt by way of eddf_MESS

Any BitBlt to the display ultimately ends up in one of these cases, and the routines that implement them are among the first pieces of code that must be modified when this driver is ported to a different chip set. The first three functions are located in the EDDHBBLT.ASM file. They handle the three cases that an S3 chip can accelerate in hardware: a source, destination, and a raster operation; a pattern, destination, and a raster operation; or a destination-only raster operation.
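Which of the hardware cases a raster operation falls into can be decided from the rop byte alone. The following is a minimal sketch, not the driver's code. It assumes the usual rop encoding in which pattern, source, and destination correspond to the bit masks 0xF0, 0xCC, and 0xAA (so that 0xCC is a source copy and 0xF0 is a pattern copy); BLT_CASE and classify_rop are hypothetical names. A rop that needs both a source and a pattern is a three-way blt, which is broken down into a sequence of the simpler cases.

```c
#include <stdint.h>

typedef enum {
    BLT_DEST_ONLY,     /* eddh_DestOnlyBlt case                   */
    BLT_PAT_DEST,      /* eddh_PatDestBlt case                    */
    BLT_SRC_DEST,      /* eddh_SrcDestBlt case                    */
    BLT_THREE_WAY      /* decomposed into several simpler blts    */
} BLT_CASE;

/* The result depends on the pattern if the half of the truth table with
 * P=1 (high nibble) differs from the half with P=0 (low nibble). */
static int rop_uses_pattern(uint8_t rop)
{
    return ((rop >> 4) & 0x0F) != (rop & 0x0F);
}

/* The result depends on the source if the entries with S=1 differ from
 * the corresponding entries with S=0. */
static int rop_uses_source(uint8_t rop)
{
    return ((rop >> 2) & 0x33) != (rop & 0x33);
}

BLT_CASE classify_rop(uint8_t rop)
{
    int pat = rop_uses_pattern(rop);
    int src = rop_uses_source(rop);

    if (src && pat)
        return BLT_THREE_WAY;
    if (src)
        return BLT_SRC_DEST;
    if (pat)
        return BLT_PAT_DEST;
    return BLT_DEST_ONLY;
}
```

For example, 0xCC (source copy) classifies as a source/destination blt, 0xF0 (pattern copy) as a pattern/destination blt, 0x55 (invert destination) as destination-only, and 0xC0 (pattern AND source) as a three-way blt. Whether the work is then done in hardware or by the software path depends on where the destination lives, not on the rop.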

eddh_SrcDestBlt (EDDHBBLT.ASM)

The eddh_SrcDestBlt function handles BitBlts involving some source with the display as a destination. It iterates through multiple clip rectangles, copying from the source to the display. The source may be either monochrome or a bit map of the same color depth as the display. The source may be a system-memory bit map, a portion of the visible display (for example, a scroll box), or in VRAM in the bit-map cache. The following is a pseudocode version of eddh_SrcDestBlt:

eddh_SrcDestBlt:
//this first part is mostly for the software drawing version of this
//routine, eddf_SrcDestBlt, although some of the hardware-drawing code
//eventually reads details from the various shadows of the pixmaps.

get the destination address
set pixmapA = destination bit map parameters
if expanding from monochrome source to color
        set pixmapC = source bit map parameters
else
        set pixmapB = source bit map parameters

do
        dec     CountOfClipRects
        if the blt intersects the clipping rectangle
                set pattern map
                set source map
                set destination map
                set pixel op
                if source has VRAM marker
                        //source is in VRAM
                        call SrcSpecialist
                else
                        //source is in system memory
                        call SoftSrcSpecialist
        get next clipping rectangle
while CountOfClipRects > 0
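The clip-rectangle loop in the pseudocode above can be sketched in C. This is an illustration under stated assumptions: rectangles are taken to be exclusive on the right and bottom edges, the two specialist routines are replaced by counters, and BLT_RECT, BLT_STATS, intersect_rect, and src_dest_blt are hypothetical names.

```c
typedef struct { int left, top, right, bottom; } BLT_RECT; /* right/bottom exclusive */

typedef enum { SRC_IN_VRAM, SRC_IN_SYSMEM } SRC_KIND;

typedef struct {
    int vram_blts;     /* times the VRAM-to-VRAM path would run   */
    int soft_blts;     /* times the system-memory path would run  */
    long pels;         /* total pels covered after clipping       */
} BLT_STATS;

/* Compute the overlap of a and b. Returns 1 if the overlap is not empty. */
int intersect_rect(const BLT_RECT *a, const BLT_RECT *b, BLT_RECT *out)
{
    out->left   = a->left   > b->left   ? a->left   : b->left;
    out->top    = a->top    > b->top    ? a->top    : b->top;
    out->right  = a->right  < b->right  ? a->right  : b->right;
    out->bottom = a->bottom < b->bottom ? a->bottom : b->bottom;
    return out->left < out->right && out->top < out->bottom;
}

/* Walk the clip rectangles and dispatch each visible piece of the blt. */
void src_dest_blt(const BLT_RECT *blt, const BLT_RECT *clips, int count,
                  SRC_KIND src, BLT_STATS *stats)
{
    int i;

    for (i = 0; i < count; i++) {
        BLT_RECT piece;
        if (!intersect_rect(blt, &clips[i], &piece))
            continue;                      /* blt misses this rectangle */
        if (src == SRC_IN_VRAM)
            stats->vram_blts++;            /* stands in for SrcSpecialist     */
        else
            stats->soft_blts++;            /* stands in for SoftSrcSpecialist */
        stats->pels += (long)(piece.right - piece.left) *
                       (piece.bottom - piece.top);
    }
}
```

Only the clip rectangles that the blt actually touches cause any drawing, and each visible piece is handed to the routine that matches the location of the source.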

The eddh_SrcDestBlt function rolls through the clipping rectangles passed in from the GRE, and for each rectangle (if the blt intersects it), the eddh_SrcDestBlt function calls either SrcSpecialist, if the source is the display, or SoftSrcSpecialist, if the source is a memory bit map. SrcSpecialist takes the arguments set up by eddh_SrcDestBlt, and performs a VRAM-to-VRAM blt. The arguments it takes are passed in XGA-style in the Shadow8514Regs. The following is an example:

; use ebx as a pointer to the destination bit-map header
        mov     ebx, AIxfer.pbmhDest

; convert destination bit map address to x and y for 8514 hardware
        memregread      eax,dest_map
        mov     x_dst,ax
        ror     eax,16
        mov     y_dst,ax
        calc8514xy  [ebx].bitmap_address
        add     x_dst,dx
        add     y_dst,ax

In this case, the X-Y coordinates of the destination are passed in dest_map. Because the destination is the display, the bitmap_address field of the bit-map header referenced by AIxfer.pbmhDest is set up with a dummy address in VRAM (this is why the calc8514xy macro is used). Likewise, the source X-Y coordinates are passed in source_map and AIxfer.pbmhSrc. AIxfer is a large union used throughout the driver to pass parameters from the high-level portions of the driver down to the various worker routines. For BitBlt, it is defined as type BITBLTPB, which is defined in the EDDHTYPE.H module. AIxfer holds pointers to the source and destination bit-map headers, the area of display affected by the blt (stored as rectangles), and parameters used by stretch blt.

Essentially, everything gets set up just as if the operation were going to occur on an XGA, and then the lowest level routines translate the parameters into a format appropriate to the display hardware.
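The translation performed by the assembler fragment above can be sketched in C. The unpacking of the map value (x in the low word, y in the high word) follows the fragment directly. The conversion of a VRAM address into x and y is an assumption: the internals of the calc8514xy macro are not shown in this document, so the sketch assumes a simple linear layout with a fixed pitch. XY, unpack_map, offset_to_xy, and dest_origin are hypothetical names.

```c
#include <stdint.h>

typedef struct { uint16_t x, y; } XY;

/* x is in the low 16 bits and y in the high 16 bits, mirroring
 * "mov x_dst,ax / ror eax,16 / mov y_dst,ax". */
XY unpack_map(uint32_t packed)
{
    XY p;
    p.x = (uint16_t)(packed & 0xFFFFu);
    p.y = (uint16_t)(packed >> 16);
    return p;
}

/* ASSUMPTION: VRAM is addressed linearly, pitch_bytes per scan line. */
XY offset_to_xy(uint32_t offset, uint32_t pitch_bytes, uint32_t bytes_per_pel)
{
    XY p;
    p.y = (uint16_t)(offset / pitch_bytes);
    p.x = (uint16_t)((offset % pitch_bytes) / bytes_per_pel);
    return p;
}

/* Final hardware coordinates: position within the bit map plus the
 * position of the bit map itself in VRAM. */
XY dest_origin(uint32_t dest_map, uint32_t bitmap_offset,
               uint32_t pitch_bytes, uint32_t bytes_per_pel)
{
    XY p = unpack_map(dest_map);
    XY base = offset_to_xy(bitmap_offset, pitch_bytes, bytes_per_pel);
    p.x = (uint16_t)(p.x + base.x);
    p.y = (uint16_t)(p.y + base.y);
    return p;
}
```

With a 1024-byte pitch at 8 bits-per-pel, a bit map that starts 40 bytes into scan line 768 and a map value of (3, 5) give hardware coordinates of (43, 773).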

SrcSpecialist handles operations such as scrolling, copying the contents of a window from one portion of the display to another, and copying bit maps out of the off-screen VRAM cache onto the display. It controls both color and monochrome bit maps, although this is not obvious from reading the code, because of idiosyncrasies of the 8514/A adapter. The 8514/A, and the S3 chip, treat monochrome data in VRAM as a rectangle of pels that occupies only a single plane. Only a few registers distinguish between blting a color or monochrome bit map from VRAM-to-VRAM. Most of these operations are hidden in TransferShadowRegisters. Many graphics accelerator chip sets do not support monochrome expansion from VRAM-to-VRAM at all. One of the most common uses of monochrome data is a mask for icons that are put on the display. When an icon is displayed on the desktop, the first operation that occurs is a monochrome mask that is ANDed onto the display. This operation creates a black hole on the desktop in the shape of the icon. The color portion of the icon bit map is then ORed in over this hole.
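The two-step icon draw described above can be sketched for one row of an 8-bit-per-pel display. This is a software illustration of what the hardware does, under stated assumptions: mask bits are stored most-significant bit first, a mask bit of 1 keeps the desktop pel and a mask bit of 0 clears it, and the color bit map is zero outside the icon's shape. draw_icon_row is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Draw one row of an icon: AND the expanded mask, then OR the color data. */
void draw_icon_row(uint8_t *dest, const uint8_t *mono_mask,
                   const uint8_t *color, size_t width)
{
    size_t i;

    for (i = 0; i < width; i++) {
        unsigned bit = (mono_mask[i / 8] >> (7 - (i % 8))) & 1u;
        uint8_t expanded = bit ? 0xFF : 0x00;  /* monochrome expansion        */

        dest[i] &= expanded;                   /* step 1: punch the hole      */
        dest[i] |= color[i];                   /* step 2: OR in the icon pels */
    }
}
```

With a mask byte of 0xC3, the four middle pels of an 8-pel row are replaced by the icon's colors and the two pels at each end keep the desktop contents.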

In the S3 driver, both the mask and the bit map will be in the cache. If your hardware does not support monochrome expansion from VRAM-to-VRAM, then make the following change to PixBltThroughClipsViaPhunk() in pixblt.c:

original, about line 306:

#ifdef BPP24
if ( !pHWMap->vertical_bmaps ) {
#endif
 if ( AIxfer.pbmhSrc->Info.Width <= VRAM_BM_CACHE_HOR_SIZE &&
      AIxfer.pbmhSrc->Info.Height <= VRAM_BM_CACHE_VERT_SIZE )
 #endif
 {
   if ( cache_bitmap(AIxfer.pbmhSrc) )
   {
      // Caching the bit map may have corrupted some of the Color and Mix
      // Registers.  Restore them before calling PixBltThroughClips().
           TransferShadowRegisters( TSR_COLOUR_MIX );
           PixBltThroughClips();
           return;
   }
 }

modified, for no cached monochrome maps:

#ifdef BPP24
if ( !pHWMap->vertical_bmaps ) {
#endif
 if ( AIxfer.pbmhSrc->Info.Width <= VRAM_BM_CACHE_HOR_SIZE &&
      AIxfer.pbmhSrc->Info.Height <= VRAM_BM_CACHE_VERT_SIZE &&
      AIxfer.pbmhSrc->Info.BitCount != 1)
 #endif
 {
   if ( cache_bitmap(AIxfer.pbmhSrc) )
   {
      // Caching the bit map may have corrupted some of the Color and Mix
      // Registers.  Restore them before calling PixBltThroughClips().
           TransferShadowRegisters( TSR_COLOUR_MIX );
           PixBltThroughClips();
           return;
   }
 }

This change to pixblt.c prevents monochrome bit maps from being cached; they are passed to SoftSrcSpecialist instead. SoftSrcSpecialist copies bit maps to the display in a variety of formats. When transferring color bit maps to the display, it uses the S3 pel-transfer register. At 8 bits-per-pel, it uses a memory-mapped version of the register for better performance. At 16 bits-per-pel, it swaps the bytes in each pel before writing the pel to the pel-transfer register. This software swap is unnecessary work, because the order of bytes written through the pel-transfer register can be swapped in hardware by setting a bit in one of the S3 registers. For further information, see 16-Bit-Per-Pel Support.

For monochrome data at 8 or 16 bits-per-pel, SoftSrcSpecialist copies the monochrome bit map through the pel-transfer register. This is easier on the S3 chip than on the 8514/A, because the S3 chip requires no special alignment of monochrome data. The drawing engine of the 86C80X and S3 Vision chips does not directly support packed 24-bit color. Therefore, when expanding a monochrome bit map at 24 bits-per-pel, SoftSrcSpecialist calls Copy24MonoMixToVRAM, in HWACCESS.ASM. This routine performs the monochrome-to-24-bit-color expansion in software by way of the S3 chip's aperture into video memory.
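The two software operations mentioned above, the 16-bit byte swap and the monochrome-to-24-bit expansion, can be sketched in C. This is not the code of Copy24MonoMixToVRAM. The sketch assumes that monochrome bits are stored most-significant bit first, that a packed 24-bit pel is written low byte first, and that a plain copy mix is wanted; swap_pel16 and expand_mono_to_24 are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Exchange the two bytes of a 16-bit pel. */
uint16_t swap_pel16(uint16_t pel)
{
    return (uint16_t)((pel << 8) | (pel >> 8));
}

/* Expand width monochrome pels into packed 24-bit pels.
 * A 1 bit becomes the foreground color, a 0 bit the background color.
 * out must have room for 3 * width bytes. */
void expand_mono_to_24(const uint8_t *mono, size_t width,
                       uint32_t fg, uint32_t bg, uint8_t *out)
{
    size_t i;

    for (i = 0; i < width; i++) {
        unsigned bit = (mono[i / 8] >> (7 - (i % 8))) & 1u;
        uint32_t color = bit ? fg : bg;

        out[3 * i]     = (uint8_t)(color & 0xFF);          /* low byte first */
        out[3 * i + 1] = (uint8_t)((color >> 8) & 0xFF);
        out[3 * i + 2] = (uint8_t)((color >> 16) & 0xFF);
    }
}
```

In the driver, the expanded pels would be written through the aperture into video memory; here they are written to an ordinary buffer.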

eddh_PatDestBlt (EDDHBBLT.ASM)

The eddh_PatDestBlt function handles a BitBlt of a pattern onto the display. Structurally, it is very similar to eddh_SrcDestBlt in that it enumerates through the clipping regions passed to it, and for each rectangle that the operation intersects, it calls one of two routines: PatDestBltColor or PatDestBlt1To8. PatDestBltColor performs patblts where the pattern is taken from a color bit map. The most common color pattern blt is the 2x2 dither used to paint the desktop in 8-bit-per-pel modes. Even portions of windows that appear to be white are, in fact, dithered. The pattern bit map may need to be rotated to align with the destination. If this is the case, PatDestBltColor builds a rotated copy of the pattern in the Phunk. When built, the pattern is copied into off-screen VRAM. The pattern is then tiled across the destination with successive hardware blts. In the case of a pattern copy, the first row of the destination is built, and then used as the source for the successive blts. This operation could be performed more efficiently on S3 chips by detecting patterns that are up to 2x2, 4x4, or 8x8, putting them in off-screen VRAM, and performing blts with them with a single S3 pattern-blt command.

PatDestBlt1To8 handles monochrome patterns. It calls either PattBlt24, for the 24 bpp case, PattBlt1x8, or PattBlt8x8. PattBlt1x8 is a special case routine that handles 1 high by 8 wide monochrome patterns. This special case routine is one of the few pattern blts that the original 8514/A could perform in a single operation. The S3 driver performs this operation as a monochrome expansion through the pel port. If your hardware does not handle 1x8 monochrome blts, do not implement this special case routine.

PattBlt8x8 handles the more general case of 8x8 monochrome patterns. The S3 chip can perform this as a single-blt operation. For the S3 driver, PattBlt8x8 aligns the pattern, copies it into off-screen VRAM, and then issues a command to the S3 chip to do the pattern blt.
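Aligning an 8x8 monochrome pattern amounts to rotating it horizontally and vertically so that it lines up with the destination. The following is a minimal sketch of that step, not the driver's code. It assumes one byte per pattern row with the leftmost pel in the most-significant bit, and a pattern origin at (0, 0), so that the shifts are the destination's left and top coordinates modulo 8. align_pattern8x8 is a hypothetical name.

```c
#include <stdint.h>

/* Rotate an 8x8 monochrome pattern. After the call, pel (c, r) of out
 * is pel ((c + x_shift) mod 8, (r + y_shift) mod 8) of src. */
void align_pattern8x8(const uint8_t src[8], unsigned x_shift,
                      unsigned y_shift, uint8_t out[8])
{
    unsigned row;
    unsigned s = x_shift & 7u;

    for (row = 0; row < 8; row++) {
        unsigned bits = src[(row + y_shift) & 7u];

        /* Rotate left by s within 8 bits; the cast discards overflow. */
        out[row] = (uint8_t)((bits << s) | (bits >> (8 - s)));
    }
}
```

For a destination whose left edge is at x = 9 and whose top edge is at y = 17, the pattern would be rotated with x_shift = 1 and y_shift = 1 before it is copied to off-screen VRAM.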

Because the various S3 chips do not directly support packed 24-bit-per-pel modes, PattBlt24 does more work than PattBlt8x8. It expands the pattern into off-screen memory and then repeatedly blts this off-screen pattern onto the display using a VRAM-to-VRAM copy blt. After it has filled a single row of the pattern, it repeatedly blts that entire row, doubling the height of the source each time. For example, if the pattern is 1 pel high by 8 wide, PattBlt24 first builds a row of the pattern 1 pel high. It then copies that row to the second row. Next, it copies both the first and second rows to the third and fourth rows. It then copies the first four rows to the next four. It continues to double until it has filled the destination rectangle.
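The doubling strategy can be sketched with memcpy standing in for the VRAM-to-VRAM copy blt. This is an illustration only: fill_by_doubling is a hypothetical name, and the sketch assumes a pattern that is 1 pel high, so that the first row of the destination already holds the tiled pattern.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill rows 1..rows-1 of dest from row 0 by repeatedly copying the rows
 * that are already filled. pitch is the row length in bytes.
 * Returns the number of copy operations that were issued. */
size_t fill_by_doubling(uint8_t *dest, size_t pitch, size_t rows)
{
    size_t filled = 1;     /* row 0 is assumed to be built already */
    size_t copies = 0;

    while (filled < rows) {
        size_t n = filled;                 /* try to double ...             */
        if (n > rows - filled)
            n = rows - filled;             /* ... but stop at the last row  */
        memcpy(dest + filled * pitch, dest, n * pitch);
        filled += n;
        copies++;
    }
    return copies;
}
```

Five rows are filled with three copies (1 row, then 2 rows, then 1 row), and in general the number of copies grows with the logarithm of the height rather than with the height itself.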

eddh_DestOnlyBlt (EDDHBBLT.ASM)

This routine follows the pattern set by eddh_SrcDestBlt and eddh_PatDestBlt. Destination-invert raster operations are translated into source-invert raster operations in EDDNBBLT.C: the raster operation should be "not destination," but it is mapped to "not source," and the source and destination are set to be the same. This translation might have to be changed for your hardware. Almost all hardware blt engines support inversion of the destination, but many will not invert the source if the source and destination are identical.
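That the two forms are equivalent can be checked with a small raster-operation evaluator. This is a minimal sketch, not the driver's code. It assumes the usual rop encoding in which the result bit is looked up in the rop byte at index (P<<2)|(S<<1)|D, so that "not destination" is 0x55 and "not source" is 0x33; apply_rop3 is a hypothetical name.

```c
#include <stdint.h>

/* Apply a three-operand raster operation to one byte of pel data. */
uint8_t apply_rop3(uint8_t rop, uint8_t pat, uint8_t src, uint8_t dst)
{
    uint8_t result = 0;
    unsigned bit;

    for (bit = 0; bit < 8; bit++) {
        unsigned p = (pat >> bit) & 1u;
        unsigned s = (src >> bit) & 1u;
        unsigned d = (dst >> bit) & 1u;
        unsigned index = (p << 2) | (s << 1) | d;   /* truth-table index */

        result |= (uint8_t)(((rop >> index) & 1u) << bit);
    }
    return result;
}
```

For a destination byte of 0x3C, "not destination" and "not source" with the source set equal to the destination both produce 0xC3. On hardware that cannot read the source and destination from the same location in one operation, the first form is the one to issue.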

BitBlt - The Top Level Routines

There is a certain amount of interaction and hardware dependence in the bit-map cache and the three blt routines in EDDHBBLT.ASM. This interaction is controlled by the various routines in BitBlt.

eddb_BitBlt (EDDNBBLT.C)

This is the entry point in the driver for BitBlt. It handles the device-independent portions of BitBlt. It validates parameters, transforms coordinates, and handles error logging. When it has completed some setup, eddb_BitBlt looks at the raster operation and splits into one of several cases, such as a three-way raster operation, a raster operation involving the source and destination, a raster operation involving a pattern and the destination, or a raster operation involving the destination-only. The following is a simplified pseudocode for eddb_BitBlt:

eddb_BitBlt()
// check for stretch blt
if not COM_DEVICE
    if ArgCount == 4 and src_coords != dst_coords
        stretch_return_code = IsValidStretchRequest()

        //can the driver handle the stretch blt?
        if stretch_return_code != OK
            Perform call-back to the engine to do stretchblt

// at this point, eddb_BitBlt checks for a valid rop, and checks to see
// if the request is for correlation or drawing, etc.

if COM_CORRELATE
    if not eddg_ClipPickWindow()        // get correlation rectangles
        goto BITBLT_EXIT                //error!

//if the PM desktop is in the background and the destination is the
//display, disable drawing
if fXGADead and Destination is display
    COMMANDBITS = COMMANDBITS and not(COM_DRAW)

//convert target coordinates to device coordinates
TransformCoordinates()

//Examine the rop and determine which case it falls into

if source_is_required(rop)
    goto SOURCE_3WAY_COMMON
else
    goto PAT_DSTONLY_COMMON

SOURCE_3WAY_COMMON:
//handle all blts with a source, including blts involving source,
//pattern, and destination
if not(Stretching) //no need to setup if this is a stretchblt
    if source is a bit map
        convert handle to pointer and assign to AIxfer.pbmhSrc
    else //source is a device context handle
        convert handle to pointer to DC and assign to pdcSrc
        //get a pointer to the bit map defining the device
        AIxfer.pbmhSrc = pdcSrc->DCISelListEntry

Build rectangles defining the source and destination

//correlation involves determining if a pick is inside the area
//of the drawing operation.
if COM_CORRELATE
        perform correlation

//see if we need to draw
if not(COM_DRAW)
        goto BITBLT_EXIT

Calculate BitBlt dimensions and store in AIxfer.rcsTrg and
AIxfer.rcsSrc

if software_cursor
    calculate exclusion rectangle
    eddm_ExcludeCursor()

if pattern(rop)
    goto THREE_WAY_BLT    //there is a pattern involved
else
    goto SRC_DST_BLT      //no pattern involved

SRC_DST_BLT:
if palette mapping is required
    create palette mapping
    if the source is cached
        evict source bit map from the cache

//is the destination the display?
if AIxfer.pbmhDest == DirectListEntry
    SetDrawModeHard    //drawing to the display
else
    SetDrawModeSoft    //drawing to a bit map

//check for monochrome expansion
if mono(source) and not(mono(destination))
    //source is 1 bpp and destination isn't, so this is a monochrome
    //expansion blt

    setup foreground and background colors in Shadow8514Regs
    if destination is cached
        evict destination from the cache

else
    //both source and destination are in the same format
    //or source is color and destination is monochrome
    if source and destination are the same bit map
        if source and destination overlap
            determine direction in which to blt
            note that clipping rectangles need to be re-ordered
            store blt directions in pixelop
    else if source is color and destination is monochrome
        AIxfer.pbmhSrc = convert_colour_to_mono

Set source and mix in Shadow8514Regs
TransferShadowRegisters
SPad.BltFunction = either eddf_SrcDestBlt or eddh_SrcDestBlt

if src and dest are system memory or src and dest are display
    PixBltThroughClips
else
    PixBltThroughClipsViaPhunk

if destination is cached
    evict destination from the cache

goto BITBLT_EXIT

THREE_WAY_BLT:
//source, pattern, and destination are involved in this rop

Save a copy of AIxfer.pbmhDest
Save a copy of AIxfer.pbmhSrc
Setup the pattern
Save a copy of the source & target rectangles

if source bit map == destination bit map
    if source and destination overlap
        determine order for clipping regions
        determine blt directions
else if source is monochrome and destination is color
    get foreground and background colors for mono bit map
else if destination is monochrome and source is color
    source = convert_colour_to_mono(source)

if bit 7 of the rop is set
    rop = 0ffh - rop
    calculate starting blt instruction
    add trailing "not" to last blt instruction
else
    get pointer to starting instruction

if Blt3WayBuffer[rop].CreateNew == TRUE
    allocate a temporary bit map to hold intermediate results

instruction_count = Blt3WayBuffer[rop].Size

while (instruction_count > 0)
    if Instruction->Destination == BT_DEST
        fThreeWayWorkBlt = False    //clip the destination
        set up destination map in AIxfer
    else if Instruction->Destination == BT_WORK
        fThreeWayWorkBlt = True     //don't clip the destination
        set up work map in AIxfer

    if Instruction->Source == BT_PAT
        set up the pattern in AIxfer
        if destination is the display
            SetDrawModeHard()
        else
            SetDrawModeSoft()
    else
        if Instruction->Source == BT_DEST
            set AIxfer source to destination bit map
        else if Instruction->Source == BT_WORK
            set AIxfer source to temporary bit map
        else if Instruction->Source == BT_SRC
            set AIxfer source to actual source bit map
        else
            no source, so do nothing!

    if destination is the display
        SetDrawModeHard()
    else
        SetDrawModeSoft() 

    set up palette mapping if needed

    set up foreground and background colors and mixes based on if
    the blt is expanding from monochrome to color.

    TransferShadowRegisters()

    if src and dest are system memory or src and dest are display
        PixBltThroughClips
    else
        PixBltThroughClipsViaPhunk

    Instruction = next(Instruction)
    instruction_count = instruction_count - 1

fThreeWayWorkBlt = False

if the original rop >= 0x80
    remove trailing "not" from last Instruction

goto BITBLT_EXIT

PAT_DSTONLY_COMMON:
 //more of the same....

PAT_DST_BLT:
//not shown lest the pseudocode become as complicated as the original

DST_ONLY_BLT:
//ditto

BITBLT_EXIT:
//end of BitBlt

Blting from source to destination involves several cases. First, a palette mapping may be needed. (Palette mapping is necessary when the source and destination use differing palettes. Refer to eddb_CreatePalMapping in the EDDBPHNK.C file for details.) Second, if the source is monochrome and the destination is not, the Shadow8514Regs.color_0 and Shadow8514Regs.color_1 fields must be set up. Finally, if both the source and destination are in system memory, or both are in VRAM, PixBltThroughClips is called to perform the blt. Otherwise, PixBltThroughClipsViaPhunk is called. The Phunk is always used when copying between system memory and VRAM. Using bus mastering via the Phunk was an optimization in the XGA driver, because copying between the Phunk and VRAM was fast. For the S3 driver, the Phunk is still useful if the source must be changed to a different format or if it must be stretched.
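The routing decision at the end of SRC_DST_BLT can be summarized in a few lines of C. This is only a sketch: the MemKind and BltPath enumerations, the BitmapSketch structure, and the choose_blt_path helper are hypothetical names introduced for illustration and do not appear in the driver source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for where a bit map lives; not driver identifiers. */
typedef enum { MEM_SYSTEM, MEM_VRAM } MemKind;

/* Hypothetical result codes standing in for the two driver routines. */
typedef enum { PATH_THROUGH_CLIPS, PATH_VIA_PHUNK } BltPath;

/* Hypothetical, minimal stand-in for a bit-map header. */
typedef struct { MemKind location; } BitmapSketch;

/* Same kind of memory on both sides: blt directly (PixBltThroughClips).
 * One side in system memory and the other in VRAM: stage the data
 * through the Phunk (PixBltThroughClipsViaPhunk). */
static BltPath choose_blt_path(const BitmapSketch *src, const BitmapSketch *dst)
{
    assert(src != NULL && dst != NULL);
    return (src->location == dst->location) ? PATH_THROUGH_CLIPS
                                            : PATH_VIA_PHUNK;
}
```

The comparison is symmetric, so a VRAM-to-system copy and a system-to-VRAM copy take the same path.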

THREE_WAY_BLT involves a raster operation, a pattern, a source, and a destination. The key to understanding three-way blts is that they are interpreted. The blt interpreter breaks the raster operation down into a series of two-way blt operations that can be performed in either hardware or software. The following are examples of data structures:

// eddbtypt.h

typedef struct
{
    BYTE    Source;
    BYTE    Destination;
    BYTE    Mix;
    SHORT   BltFunction ;
} BltInst;
typedef BltInst NEAR * pBltInst;

typedef struct
{
    BYTE       CreateNew;
    BYTE       Size;
    pBltInst   Instruction;
} BltBuffEntry;
typedef BltBuffEntry NEAR * pBltBuffEntry;

In the EDDBDATA.C file, there is a large array called Blt3WayInsts, which is an array of BltInst structures. Each BltInst in the array is one step in the construction of a three-way BitBlt. BltInst.Source lists the source for this stage of the blt. Likewise, BltInst.Destination is the destination. BltInst.Source can have the following values:

BT_DEST    0    Use the destination
BT_WORK    1    Use the temporary work buffer
BT_SRC     2    Use the source
BT_PAT     3    Use the pattern
BT_NONE    4    Use none or the same as last time

BltInst.Destination can have a value of BT_DEST, BT_WORK, or BT_SRC. (BT_SRC implies no destination or specifies the use of the same destination as last time.)

The Mix field holds the hardware mix to use (XOR, OR, AND, COPY, and so forth), and the BltFunction field holds an index into the pDrawFunctionsTable. BltFunction selects which of three types of blt this instruction is to perform: a blt with a source and destination, a pattern blt, or a destination-only blt. For example, the following entry uses the source as the source for this step, uses the destination as the destination, uses a raster operation of (not source) and (not destination), and uses either eddf_SrcDestBlt or eddh_SrcDestBlt to perform this step. In effect, each entry in Blt3WayInsts is an instruction, with BltFunction and Mix forming the operation code.

BT_SRC, BT_DEST, HWMIX_NOTSOURCE_AND_NOTDEST, index_SrcDestBlt,

Blt3WayBuffer is composed of structures of type BltBuffEntry. The CreateNew field of this structure is a Boolean value. When it is set to TRUE, a temporary buffer must be created for use during the blt. The Size field is the number of instructions it will take to perform this blt (each instruction is a single entry in Blt3WayInsts), and the Instruction field is a pointer to the first entry in Blt3WayInsts used for this blt.

Only raster operations from 0 to 7F hex are encoded this way. If the raster operation for the blt is 80 hex or greater, it is changed to the corresponding raster operation below 80 hex (FF hex minus the original value), and a "not" operation is added to the final instruction for the blt. Performing each instruction sets up the source and destination bit maps in AIxfer, and then calls either PixBltThroughClipsViaPhunk or PixBltThroughClips. When this has been completed, the next instruction in the blt is taken, and the cycle is repeated until no instructions remain.
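The folding of raster operations and the interpreter cycle can be illustrated on single bytes. The following is a minimal sketch, not the driver's code: the MIX_* values, the simplified Inst structure, and the fold_rop, apply_mix, and run_program helpers are hypothetical, and the trailing "not" is modeled by inverting the final result instead of patching the last instruction. It assumes the usual ROP encoding in which the pattern, source, and destination truth-table bytes are F0, CC, and AA hex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BT_DEST, BT_WORK, BT_SRC, BT_PAT, BT_NONE };

/* Hypothetical mixes for this sketch; the driver uses HWMIX_* values. */
enum { MIX_COPY, MIX_NOT_SRC, MIX_AND };

/* Simplified instruction: BltFunction is omitted. */
typedef struct { uint8_t Source, Destination, Mix; } Inst;

/* Fold a rop of 0x80 or above onto its complement below 0x80. */
static uint8_t fold_rop(uint8_t rop, int *trailing_not)
{
    assert(trailing_not != NULL);
    *trailing_not = (rop & 0x80) != 0;
    return *trailing_not ? (uint8_t)(0xFF - rop) : rop;
}

static uint8_t apply_mix(uint8_t mix, uint8_t src, uint8_t dst)
{
    switch (mix) {
    case MIX_COPY:    return src;
    case MIX_NOT_SRC: return (uint8_t)~src;
    default:          return (uint8_t)(src & dst);   /* MIX_AND */
    }
}

/* Interpret a program on one byte each of source, pattern, destination. */
static uint8_t run_program(const Inst *prog, size_t count, int trailing_not,
                           uint8_t s, uint8_t p, uint8_t d)
{
    uint8_t work = 0;   /* stands in for the temporary work bit map */
    for (size_t i = 0; i < count; i++) {
        uint8_t src = prog[i].Source == BT_SRC  ? s
                    : prog[i].Source == BT_PAT  ? p
                    : prog[i].Source == BT_WORK ? work : d;
        uint8_t *dst = prog[i].Destination == BT_WORK ? &work : &d;
        *dst = apply_mix(prog[i].Mix, src, *dst);
    }
    return trailing_not ? (uint8_t)~d : d;
}

/* Hypothetical program for rop 0x3F: NOT (pattern AND source). */
static const Inst rop_3f[] = {
    { BT_SRC,  BT_WORK, MIX_COPY    },
    { BT_PAT,  BT_WORK, MIX_AND     },
    { BT_WORK, BT_DEST, MIX_NOT_SRC },
};
```

Feeding the truth-table bytes through a program returns the rop code it implements, which gives a convenient check: the program above yields 3F hex, and the same program run with the trailing "not" yields C0 hex (pattern AND source).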

The fThreeWayWorkBlt flag is true when the temporary bit map is being used as the destination for the blt. This flag exists so the PixBlt routines will not clip the work bit map against any clipping rectangles. It is necessary to avoid clipping the work bit map because it is an intermediate result, while the clipping rectangles refer to the destination bit map.

The pseudocode for pattern blts and destination-only blts generally follows a similar pattern to blts that involve a source and destination.

PixBltThroughClipsViaPhunk (PIXBLT.C) and PixBltThroughClips (PIXBLT.C)

PixBltThroughClipsViaPhunk and PixBltThroughClips have similar functions. Both enumerate through the clipping rectangles passed in from the graphics engine (GRE), batch up a number of them, and then call the blt function specified in SPad.BltFunction with those rectangles. This process is repeated until all clipping rectangles have been exhausted. The two differ in the memory they handle. PixBltThroughClips deals with bit maps that are either both in VRAM or both in system memory. PixBltThroughClipsViaPhunk is used when the source is in system memory and the destination is in video memory, or when the source is in video memory and the destination is in system memory. There is one special case. In PixBltThroughClipsViaPhunk, if the source is in system memory and the destination is in video memory, the source might fit into the bit-map cache. If the source will fit, PixBltThroughClipsViaPhunk copies the source into the cache. Once the source has been cached, both the source and the destination are in video memory, so PixBltThroughClipsViaPhunk can call PixBltThroughClips to perform the blt.
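The batching of clipping rectangles can be sketched as a loop that hands the worker function a few rectangles at a time. Everything here is illustrative: the ClipRect layout, the BltFn signature, the batch size of 4, and the helper names are assumptions, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x0, y0, x1, y1; } ClipRect;     /* hypothetical layout */

/* Hypothetical worker signature, standing in for SPad.BltFunction. */
typedef void (*BltFn)(const ClipRect *rects, size_t count, void *ctx);

enum { CLIP_BATCH = 4 };                              /* assumed batch size */

/* Hand the clip rectangles to blt_fn in batches of at most CLIP_BATCH.
 * Returns the number of calls made to blt_fn. */
static size_t blt_through_clips(const ClipRect *rects, size_t total,
                                BltFn blt_fn, void *ctx)
{
    size_t calls = 0;
    assert(blt_fn != NULL);
    for (size_t done = 0; done < total; done += CLIP_BATCH) {
        size_t n = total - done;
        if (n > CLIP_BATCH)
            n = CLIP_BATCH;
        blt_fn(rects + done, n, ctx);
        calls++;
    }
    return calls;
}

/* Example worker: counts the rectangles it has been given. */
static void count_rects(const ClipRect *rects, size_t count, void *ctx)
{
    (void)rects;
    *(size_t *)ctx += count;
}
```

With ten rectangles and a batch size of 4, the worker is called three times, with 4, 4, and 2 rectangles.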

If the source bit map will not fit in the cache, PixBltThroughClipsViaPhunk copies the source bit map into the Phunk in 64KB chunks. It calls CopyChunkToPhunkForPixBltSource to put the source into the Phunk. CopyChunkToPhunkForPixBltSource uses eddf_MESS to copy a portion of the source bit map into the Phunk. For the XGA, this allowed the bit map to be blted to VRAM by way of the bus-master. For the S3 chip, the Phunk is just another 64KB of system memory.
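The arithmetic behind copying in 64KB chunks can be sketched as follows. The sketch assumes that each chunk holds a whole number of scanlines, which the text does not state; lines_per_chunk and chunk_count are hypothetical helpers.

```c
#include <assert.h>

enum { PHUNK_SIZE = 64 * 1024 };   /* the Phunk is 64KB */

/* Whole scanlines of bytes_per_line bytes that fit in one Phunk load.
 * Assumption: a chunk never splits a scanline. */
static unsigned lines_per_chunk(unsigned bytes_per_line)
{
    assert(bytes_per_line > 0 && bytes_per_line <= PHUNK_SIZE);
    return PHUNK_SIZE / bytes_per_line;
}

/* Number of Phunk loads needed to move height scanlines. */
static unsigned chunk_count(unsigned bytes_per_line, unsigned height)
{
    unsigned per = lines_per_chunk(bytes_per_line);
    return (height + per - 1) / per;   /* round up */
}
```

For example, a 1024 x 768 bit map at 8 bits per pel has 1024 bytes per scanline, so each load carries 64 scanlines and the whole bit map needs 12 loads.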

eddb_BltThroughClips and eddb_DrawThroughClips (EDDNBBLT.C)

These two routines are in EDDNBBLT.C, but they are not actually a part of BitBlt. Instead, they are called by other parts of the driver that need to enumerate through the clip regions, while performing a drawing operation. Both take as an argument a pointer to a function. This function is one of the worker functions in the pDrawFunctions table. These are simplified, mini-versions of PixBltThroughClips.

Text Output

The S3 presentation driver supports two different types of character output. One type is GRECharStringPos, which is implemented in the S3 driver by eddt_CharStringPos, in the EDDNGCHS.C file. The other type supports AVIO (Advanced Video Input/Output) text, which is used in OS/2 and DOS window sessions.

The eddt_CharStringPos function uses two strategies for displaying text on the screen. For most strings, it caches glyphs from the font. Once the glyphs are cached, it draws text by expanding the monochrome bit maps in the font cache to the display. There are, however, some cases where it cannot do this. One case is when the character to be cached is taller than the font cache. Very large fonts will overflow the font cache, as the font cache is typically 127 scan lines tall in the S3 driver. Another case is when the driver is in 24-bit-per-pel mode and the requested text is not some shade of gray. A third case is when the requested raster operation is one that has problems on certain S3 chips. A fourth case is when the font contains only a single character, making it undesirable to put it in the font cache. (Doing so might evict a heavily used font so that a single character may be cached.) In any of these cases, eddt_CharStringPos draws characters using BitBlt.

The following is an example of pseudocode for eddt_CharStringPos:

eddt_CharStringPos:
pFocaFont = Current Font
Check that request is valid
if simulation is required
    call back to engine to perform simulation

if display is disabled
    remove COM_DRAW bit from command flags
    if correlation required
        get correlation rectangles

if opaque rectangle is present
    if coordinates must be transformed
        set up opaque rect coordinates for transformation
    else
        get opaque rectangle coordinates

Get starting coordinates for the string

Transform any coordinates that need to be transformed to
device space

if increment vector is present
    convert increment vectors to device space

if current font has only a single character
    BltOneChar = TRUE
    Set up character spacing
    set PerCharDefs to a pointer to the character
    get character width and height
    get character A, B, C space
    if increment vector is present
        add in increment vector
    else
        add in character width
else
    BltOneChar = FALSE
    if character is too tall to fit in the cache
        CharTooBig = TRUE

    if font cache is disabled
        CharTooBig = TRUE

    if requested raster operation is defective on the S3 chip
        CharTooBig = TRUE

    if 24 bits per pixel
        get foreground and background colors
        if NOTGREY(foreground) and NOTGREY(background)
            CharTooBig = TRUE

    Compute starting position for the string

    if cursor is software
        Disable cursor

    if CharTooBig == FALSE
        if font is not in the font cache
            eddt_LocateCachedFont()    //Get a font cache entry

        if eddh_update_cache() == FALSE    //unable to fit font in cache
            eddt_LocateCachedFont()    //Get a font cache entry
            if eddh_update_cache() == FALSE
                bail out of string operation, the font is too big

    reenable cursor
    do one last bit of boundary setup (eddt_GetTextBox)

if there is an increment vector
    add in increment to x position

if BltOneChar == TRUE
    adjust starting position to be compatible with BitBlt
else
    adjust starting position to be compatible with string blt

if opaqueing rectangle required
    build opaque rectangle

if clipping rectangle
    clip bounding rectangle against clip rectangle
    if CharTooBig == FALSE and string completely clipped
        goto UPDATE_POS

    clip opaque rect against clipping rectangle

    if opaque rect entirely clipped
        disable COM_DRAW bit in command flags
else
    handle multiple clip rects if necessary

if we need to accumulate bounds
    accumulate bounds

if we need to perform correlation
    perform correlation

if COM_DRAW is not set in command flags
    goto CHARSTRPOS_OK_EXIT

if destination is the display
    SetDrawModeHard
else
    SetDrawModeSoft

if software cursor
    exclude cursor

setup foreground and background colors

if opaque rectangle must be drawn
    set pixelop
    set mix, blt source, no color compare
    TransferShadowRegisters()
    //draw the opaque rect via pattern blt
    eddb_DrawThroughClips(index_PatDestBlt)

if number of characters to draw is 0
    goto CHARSTRPOS_OK_EXIT

//special case #1 -- we have a font with a single character
if BltOneChar == TRUE
    calculate size of character in bytes
    copy character into the Phunk, putting it in row major order
    Create a bit map header for the character
    set up pixelop, mix, source, colors etc.
    TransferShadowRegisters()
    Setup AIxfer for a BitBlt
    while count of characters > 0
        fix up the character for BitBlt
        eddb_DrawThroughClips(index_SrcDestBlt)
        update position of next character
        decrement count of characters
    goto UPDATE_POS

//special case #2 -- uncacheable font
if CharTooBig == TRUE
    set up colors, mixes, source, etc.
    TransferShadowRegisters()
    while count of characters > 0
        get character width and height
        set pCharDefn to point to the character bit map
        get A, B, and C space
        copy the character into the Phunk, putting it in row major order
        setup a bit map header for the character bit map
        setup AIxfer for a BitBlt
        fixup source and destination values so they are non-negative
        eddb_DrawThroughClips(index_SrcDestBlt)
        advance to next character position
        decrement count of characters
    goto UPDATE_POS

//finally we get to draw text through the code in eddhgchs.asm
set up pointer to the font in AIxfer
store a pointer to the code points and position of first char in AIxfer
set up mixes, source, colors
TransferShadowRegisters()

if position vector present
    //draw them one at a time -- we have a position vector
    set number of characters in AIxfer to 1
    while count of characters > 0
        store increment from vector in AIxfer.bCharWidth
        eddb_DrawThroughClips(index_DrawText)
        update position of next character from position vector
        point to next character
        decrement count of characters
else
    Set count of character in AIxfer to number of characters in string
    eddb_DrawThroughClips(index_DrawText)

UPDATE_POS:
update position if required

if destination cached
    evict cached bit map

CHARSTRPOS_OK_EXIT:
reenable cursor

In eddt_CharStringPos and in the character-caching code, notice the following loop:

for (i = 0; i < CharHeight; i++)
{
    source_index = i;
    for (j = CharBytesPerRow; j--; )
    {
        *pCharBuffer++ = pCharDefn[source_index];
        source_index += CharHeight;
    }
}

This loop is not a straightforward copy operation because presentation stores the bit maps for glyphs in column-major order, rather than in row-major order. That is, all of the bytes that make up the first column of a character are stored one after another, followed by all of the bytes of the second column, and so forth. The bits in each byte of monochrome data are in Motorola order; that is, bit 7 is the left-most bit, and bit 0 is the right-most bit. Also, after the inner loop rearranges the glyph, the byte ordering is in Motorola order. For other devices, it is possible that either or both the bit and byte ordering may have to be altered.
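The same rearrangement can be written as a small self-contained function, which makes the indexing easier to check. This is a sketch with hypothetical names (glyph_to_row_major, glyph_pel); it follows the layout described above, with bit 7 as the left-most pel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rearrange a glyph from column-major byte order (all bytes of byte-column
 * 0, then all bytes of byte-column 1, and so on) into row-major order. */
static void glyph_to_row_major(const uint8_t *col_major, uint8_t *row_major,
                               size_t height, size_t bytes_per_row)
{
    assert(col_major != NULL && row_major != NULL);
    for (size_t row = 0; row < height; row++)
        for (size_t col = 0; col < bytes_per_row; col++)
            *row_major++ = col_major[col * height + row];
}

/* Test pel (x, y) of a row-major glyph; bit 7 is the left-most pel. */
static int glyph_pel(const uint8_t *row_major, size_t bytes_per_row,
                     size_t x, size_t y)
{
    assert(row_major != NULL && x / 8 < bytes_per_row);
    return (row_major[y * bytes_per_row + x / 8] >> (7 - x % 8)) & 1;
}
```

For a glyph that is three rows tall and two bytes wide, the source bytes c0r0, c0r1, c0r2, c1r0, c1r1, c1r2 come out as c0r0, c1r0, c0r1, c1r1, c0r2, c1r2.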

The eddt_CharStringPos function caches characters in off-screen VRAM. The first step is to determine whether the font has already been cached. The eddt_LocateCachedFont() function, in EDDNCACH.C, searches the array of FontCacheInfo structures. If it finds a matching font, it returns. If not, it evicts a font from the cache, and sets the usCachedFontIndex field of the passed-in FONTDETAILS structure. The FONTDETAILS structure is defined in the EDDTYPET.H file, and it stores a pointer to the font and a cache slot, among other things. The S3 driver supports the caching of up to 8 fonts; therefore, the array of pointers to FontCacheInfo structures (pFontCacheInfo) consists of 8 entries.
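An eight-slot lookup with eviction can be sketched as shown below. The text does not say how the driver chooses which font to evict, so the round-robin policy here is purely an assumption, as are the FontCache structure and the locate_cached_font helper.

```c
#include <assert.h>
#include <stddef.h>

enum { FONT_CACHE_SLOTS = 8 };   /* the S3 driver caches up to 8 fonts */

typedef struct {
    const void *font[FONT_CACHE_SLOTS];  /* NULL marks an empty slot */
    unsigned next_victim;                /* round-robin cursor (assumption) */
} FontCache;

/* Return the slot holding font, claiming an empty slot or evicting one if
 * the font is not present. *was_cached reports whether it was found. */
static unsigned locate_cached_font(FontCache *cache, const void *font,
                                   int *was_cached)
{
    unsigned i;
    assert(cache != NULL && font != NULL && was_cached != NULL);

    for (i = 0; i < FONT_CACHE_SLOTS; i++) {
        if (cache->font[i] == font) {
            *was_cached = 1;
            return i;
        }
    }

    *was_cached = 0;
    for (i = 0; i < FONT_CACHE_SLOTS; i++)      /* prefer an empty slot */
        if (cache->font[i] == NULL)
            break;
    if (i == FONT_CACHE_SLOTS) {                /* cache full: evict */
        i = cache->next_victim;
        cache->next_victim = (cache->next_victim + 1) % FONT_CACHE_SLOTS;
    }
    cache->font[i] = font;
    return i;
}
```

A caller would store the returned slot in the equivalent of usCachedFontIndex and, when the font was not already cached, treat every glyph in that slot as uncached.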

When an entry has been found for the font, it is necessary to actually cache the characters in the string. This caching is handled by eddh_update_cache, which is in the EDDHGCHS.ASM file. The eddh_update_cache function loops through the string, and for each character that is not already cached, it calls eddt_CacheCharacter (in EDDNCACH.C). The code in eddt_CacheCharacter is similar to the code in eddt_CharStringPos that uses BitBlt to display a single character. First, it attempts to find a position in the cache for the character. The 8514/A and S3 drivers store a single font in a single plane of VRAM. In 16-bit-per-pel modes, the S3 driver allows a font to wrap to another plane. Because the 8514/A interacts with bit maps as rectangles, eddt_CacheCharacter calculates an X-Y coordinate for the character in the font cache, and calls Cache8514Char (in EDDHGCHS.ASM) to copy the glyph into off-screen memory.

In the cases where it is not using BitBlt, eddt_CharStringPos uses either eddh_DrawText (if the destination is the display), or eddf_DrawText (if the destination is not the display). The eddh_DrawText function loops through an array of clipping rectangles, drawing the string (or portion thereof) into each clipping rectangle. The complexity of eddh_DrawText results from the fact that it does not use the S3 chip hardware-scissoring capabilities to clip the text. Instead, if part of a character is clipped, it calculates offsets into the character, and into the destination so that only the portions of the character that are inside the clipping rectangle are actually drawn. When all of that has been set up, the character is BitBlted by way of a hardware monochrome expansion blt from the character cache to the display.
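Clipping a character in software amounts to intersecting its cell with the clipping rectangle and converting the result into an offset into the glyph plus a visible size. The following sketch shows that calculation; the Rect and ClippedGlyph structures and the clip_glyph helper are hypothetical, and the rectangle is assumed to have exclusive right and bottom edges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle; right and bottom are exclusive (assumption). */
typedef struct { int left, top, right, bottom; } Rect;

typedef struct {
    int src_x, src_y;    /* offset into the cached glyph            */
    int dst_x, dst_y;    /* where the visible part lands            */
    int width, height;   /* visible size; both 0 if fully clipped   */
} ClippedGlyph;

/* Clip a w x h glyph placed at (x, y) against one clipping rectangle. */
static ClippedGlyph clip_glyph(int x, int y, int w, int h, const Rect *clip)
{
    ClippedGlyph out = { 0, 0, 0, 0, 0, 0 };
    int l, t, r, b;
    assert(clip != NULL && w >= 0 && h >= 0);

    l = (x > clip->left) ? x : clip->left;
    t = (y > clip->top) ? y : clip->top;
    r = (x + w < clip->right) ? x + w : clip->right;
    b = (y + h < clip->bottom) ? y + h : clip->bottom;
    if (l >= r || t >= b)
        return out;                 /* nothing visible */

    out.src_x = l - x;
    out.src_y = t - y;
    out.dst_x = l;
    out.dst_y = t;
    out.width = r - l;
    out.height = b - t;
    return out;
}
```

Only the visible sub-rectangle is then expanded from the cache to the destination, so no hardware scissoring is needed.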

Advanced Video Input/Output (AVIO) Text

AVIO text is used by GRE to draw characters in DOS or OS/2 window sessions. The S3 driver implements this functionality with eddv_CharStr (in EDDVCSTR.C), eddv_CharRect (in EDDVCREC.C), eddv_ScrollRect (in EDDVSREC.C), and eddv_UpdateCursor (in EDDVUPDC.C). The four basic functions that comprise AVIO text are: drawing a string of characters at a location in the window, drawing a rectangle of characters, scrolling a rectangle, and displaying a text cursor in the window.

Apart from the cursor, the fundamental operation employed in AVIO text is drawing a rectangle of characters. The GRE passes the AVIO functions a string of characters and attributes, and a rectangle that describes the shape that those characters are to be drawn into. For instance, the string "The Quick Brown Fox" could be rendered as:

The Quick Brown Fox

or:

T
h
e
Q
u
i
c
k
B
r
o
w
n
F
o
x

or even:

The Qui
ck Brow
n Fox

This approach allows portions of the DOS or OS/2 window session to be refreshed efficiently, even if they are partially occluded by other windows. The presentation display driver has to support both horizontal strings of characters and rectangular blocks of characters. The S3 driver calls the same functions to actually render the characters in either case. These functions are CGATextBlock and MFITextBlock. The following is the calling hierarchy for eddv_CharString and eddv_CharRect:

eddv_CharString (eddvcstr.c)
    CheckAVIOFontsCached (eddvsubr.c)
    if cell_size == 2
        CGAText
    else
        MFIText

eddv_CharRect (eddvcrec.c)
    CheckAVIOFontsCached (eddvsubr.c)
    if cell_size == 2
        CGAText
    else
        MFIText

CGAText (eddvsubr.c)
    update_cache_char_rect2 (eddhavio.asm)
    CGATextBlock (eddhavio.asm)

MFIText (eddvsubr.c)
    update_cache_char_rect4 (eddhavio.asm)
    MFITextBlock (eddhavio.asm)

There are similarities in the actual source for eddv_CharRect and eddv_CharString. The eddv_CharString function is a special case of eddv_CharRect. AVIO fonts are cached in off-screen VRAM in the S3 driver. CheckAVIOFontsCached, update_cache_char_rect2, and update_cache_char_rect4 call the same code in EDDNCACH.C as eddt_CharStringPos does. When fonts are cached correctly for the presentation desktop text, they will also be cached correctly for AVIO.

CGAText and MFIText ensure the font is cached, and then call either CGATextBlock or MFITextBlock. These latter two routines perform the work. CGATextBlock handles text that is formatted as two-byte character-attribute pairs, which is exactly how color text is stored on a real CGA: one byte of character and one byte of attribute (foreground and background color). (MFI, Main Frame Interactive, is used to support main frame terminal emulation.) MFI text uses four-byte characters, and includes extended attributes such as underlined characters and inverse video. Otherwise, MFITextBlock and CGATextBlock are similar to each other.
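A two-byte cell can be decoded as shown in the following sketch. The attribute layout is the one used by CGA text mode (low nibble foreground, bits 4 to 6 background); how the driver treats bit 7 is not covered in the text, so the sketch ignores it. The CgaCell structure and decode_cga_cell helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t ch;   /* character code point          */
    uint8_t fg;   /* foreground color index, 0..15 */
    uint8_t bg;   /* background color index, 0..7  */
} CgaCell;

/* Decode one two-byte cell: character byte first, attribute byte second.
 * Attribute bit 7 (blink or intensity on a real CGA) is ignored here. */
static CgaCell decode_cga_cell(const uint8_t *cell)
{
    CgaCell out;
    assert(cell != NULL);
    out.ch = cell[0];
    out.fg = (uint8_t)(cell[1] & 0x0F);
    out.bg = (uint8_t)((cell[1] >> 4) & 0x07);
    return out;
}
```

The foreground and background indexes are what a monochrome-expansion blt needs as its two colors when it draws the glyph for the code point.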

CGATextBlock takes two arguments, a pointer to a FONTDETAILS structure, and a pointer to an AVIOINFO structure, which is taken from the device context. In addition, CGATextBlock takes parameters from the AIxfer block, and even a few things from the Scratch Pad (SPad). The AVIOINFO structure holds two useful pieces of information: the X and Y starting coordinates of the window and the size of each character cell. Many of the other parameters are given in units of character cells. The AIxfer block holds several useful pieces of information. AIxfer.bRow and AIxfer.bColumn are the starting row and column for this operation. Multiplying them by pAvioInfo.bCellHeight and pAvioInfo.bCellWidth and adding in pAvioInfo.sXoffset and pAvioInfo.sYoffset yields the starting location for the first character on the screen. AIxfer.bDown is the height of the character rectangle, in characters. AIxfer.bAcross is the width of the character rectangle in units of character cells.
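The starting-location calculation described above can be written out directly. The AvioInfoSketch structure below is a hypothetical subset of AVIOINFO that reuses the field names from the text; the field types are assumptions, and the sketch ignores any coordinate-system flipping the driver may apply.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of AVIOINFO; field types are assumptions. */
typedef struct {
    int sXoffset, sYoffset;                 /* window starting coordinates */
    unsigned char bCellWidth, bCellHeight;  /* character cell size in pels */
} AvioInfoSketch;

typedef struct { int x, y; } Point;

/* Pel position of the first character cell of the operation. */
static Point first_cell_origin(const AvioInfoSketch *info,
                               unsigned char bRow, unsigned char bColumn)
{
    Point p;
    assert(info != NULL);
    p.x = info->sXoffset + bColumn * info->bCellWidth;
    p.y = info->sYoffset + bRow * info->bCellHeight;
    return p;
}
```

With 8 x 16 cells and a window offset of (100, 50), row 2 and column 10 give a starting location of (180, 82).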

Changing CGATextBlock is difficult because the upper and lower 16 bits of various registers are used to hold different loop counters. The code uses a macro called swap when it needs to access something in the high-order 16 bits of a register. As new code is added, a register can easily be corrupted if it is set up and used some distance away from the new code. The high 16 bits of the "ecx" register hold the count of clipping rectangles that must be iterated through. Register cl holds AIxfer.bDown. Register ch holds AIxfer.bAcross. Later in the code, the "ecx" register is pushed on the stack, and cx holds the current character and attribute pair. The "ebx" register holds a pointer to the string of attributes and characters; it is also pushed onto the stack and then holds the code point (which is an index into the cache structures so the character's location in the VRAM cache can be obtained). The "eax" register holds the base address of the glyph definition and, after being pushed, serves as a repeat count. (CGATextBlock detects multiple copies of the same character, and performs less setup on subsequent renderings of the character. Much of what is drawn into an AVIO window is space characters.) The usage of registers changes depending on where you are in the code, and it is often difficult to determine which registers are actually available for use. Tracking these changes is probably the single most difficult aspect of changing this code. The actual operation being performed is a monochrome-expansion BitBlt.

CGATextBlock performs clipping a bit differently. Unlike eddh_DrawText, it uses the 8514/A scissoring registers. (In fact, few of the hardware-dependent routines in the S3 driver use the hardware scissoring registers.)

AVIOScroll, also in EDDHAVIO.ASM, is used as part of the process of scrolling an AVIO window. It is a simple screen-to-screen hardware BitBlt. AVIOCursor, in the same file, draws a solid block on the display using whatever colors and raster operations have been set up in the Shadow8514Regs structure.

Line Drawing

The OS/2 Presentation Device Driver Reference mentions three types of line drawing that must be supported by the driver: polylines, poly-short lines, and draw lines in path. Of these, poly-short lines are no longer used by the GRE when rendering curves. Consequently, there is little need to pay attention to short lines.

Draw lines in path is implemented in the EDDLDRAW.C file, in the eddl_DrawLinesInPath function. Polylines are implemented in the EDDLPOLY.C file, in the eddl_Polyline function. Neither of these functions has any real dependency on the hardware. In fact, both perform only a certain amount of setup, and then call eddl_PolyLineWork (in EDDLPOLY.C), which performs even more setup. The eddl_PolyLineWork function performs coordinate transformations, if they are required. It accumulates bounds and performs correlation, if these are required. It sets the Shadow8514Regs with the colors and mixes required for the polyline operation. It enumerates through the clipping rectangles, if required. It does not draw lines. Instead, it calls (*(*pDrawFunctions)[index_PMLINES]) (), which is really a call to either eddf_PMLINES or eddh_PMLINES.

The eddh_PMLINES function takes an array of lines and an array of clipping rectangles, and renders the lines to the display clipped to the clipping rectangles. The eddh_PMLINES function uses the hardware to render lines, but it clips each line against the clipping rectangles in software. During this clipping, it calculates the octant in which the line falls, and the Bresenham error terms for the line. These error-term calculations are geared toward the 8514/A, although with minor modifications they could be adapted to other devices. The octant in which the line falls is calculated as follows:

OCT_DX_BIT    Delta x is negative (x is decreasing)
OCT_DY_BIT    Delta y is negative (y is decreasing)
OCT_DZ_BIT    Line is y major
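The octant flags and the error terms can be computed as in the following sketch. The numeric values of the OCT_* bits are assumptions (the text names the flags but not their values), and the error terms are the textbook Bresenham values, without any 8514/A-specific adjustments the driver may apply. LineSetup and setup_line are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Bit values are assumptions; only the flag names come from the text. */
enum { OCT_DX_BIT = 1, OCT_DY_BIT = 2, OCT_DZ_BIT = 4 };

typedef struct {
    int octant;   /* OR of the OCT_* flags                        */
    int major;    /* length along the major axis                  */
    int minor;    /* length along the minor axis                  */
    int err;      /* initial error term: 2*minor - major          */
    int k1;       /* added on a major-axis-only step: 2*minor     */
    int k2;       /* added on a diagonal step: 2*(minor - major)  */
} LineSetup;

static LineSetup setup_line(int x0, int y0, int x1, int y1)
{
    LineSetup s;
    int dx = x1 - x0, dy = y1 - y0;
    int adx = abs(dx), ady = abs(dy);

    s.octant = 0;
    if (dx < 0)    s.octant |= OCT_DX_BIT;   /* x is decreasing */
    if (dy < 0)    s.octant |= OCT_DY_BIT;   /* y is decreasing */
    if (ady > adx) s.octant |= OCT_DZ_BIT;   /* line is y major */

    s.major = (ady > adx) ? ady : adx;
    s.minor = (ady > adx) ? adx : ady;
    assert(s.major >= s.minor);

    s.err = 2 * s.minor - s.major;
    s.k1  = 2 * s.minor;
    s.k2  = 2 * (s.minor - s.major);
    return s;
}
```

A line from (0, 0) to (10, 3) is x major with both deltas positive, so no flags are set; a line from (5, 5) to (2, -5) sets all three.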

Aside from the octant and error-term calculations, eddh_PMLINES must also determine whether to draw the first pel of each line. It does so by checking AIxfer.usCallType after the first line has been drawn in its entirety. If this is equal to POLYLINE_CALL, then the first pel of all subsequent lines must be omitted.

The following code example shows how to draw styled lines when using an S3 chip set. This code is located in eddh_PMLINES.

ifdef HARD_DRAW
ifdef _8514
; Is there a pattern which we need to set up?
; If so, the 8514/A can't handle it.  We must use the Bresenham line
; drawing simulation code.
        cmp     _AIxfer.usPatType, LINETYPE_SOLID
        jnz     softdraw

ifdef   BPP24
        test    [_DDT],USE_24BPP
        jnz     softdraw
endif
endif
endif

.
.
.

softdraw:
; Right, this spells T.r.o.u.b.l.e.
; We are now going to call a 'C' function in the driver which emulates
; the hardware line draw. To make things simple we will save some
; important registers here.
; These registers should be saved: xga, edi, esi, ecx, ebx
; For safety we save all the necessary registers
        pushxga
        push    edi
        push    esi
        push    ebx
        push    ecx

        call    _eddl_BresenhamOne

        pop     ecx
        pop     ebx
        pop     esi
        pop     edi
        popxga
        jmp     vloopend

The eddl_BresenhamOne function is located in EDDLPOLY.C. It is a software line-rendering routine that draws to the hardware by calling Draw8514_Pel, in EDDHLINE.ASM. Draw8514_Pel plots a single pel at the given X-Y coordinate using the mix, color, and so forth, set up by eddl_BresenhamOne. For chips that support an aperture into the frame buffer, it is almost always more efficient to draw the pel through the frame buffer than to use the drawing engine. Changing one of the conditional jumps in the listing above to "jmp softdraw" forces all lines to be drawn in software, which is useful for chips that do not support hardware-line drawing.
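A software line routine of this kind can be sketched as a standard integer Bresenham loop that plots through a callback. This is not the code of eddl_BresenhamOne: the PlotFn callback stands in for Draw8514_Pel, the skip_first argument models omitting the first pel of a polyline segment, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pel-plotting callback, standing in for Draw8514_Pel. */
typedef void (*PlotFn)(int x, int y, void *ctx);

/* Draw a line from (x0, y0) to (x1, y1), end points included.
 * If skip_first is nonzero, the first pel is not plotted. */
static void bresenham_line(int x0, int y0, int x1, int y1,
                           int skip_first, PlotFn plot, void *ctx)
{
    int dx = abs(x1 - x0), sx = (x0 < x1) ? 1 : -1;
    int dy = -abs(y1 - y0), sy = (y0 < y1) ? 1 : -1;
    int err = dx + dy;
    int first = 1;
    assert(plot != NULL);

    for (;;) {
        if (!(first && skip_first))
            plot(x0, y0, ctx);
        first = 0;
        if (x0 == x1 && y0 == y1)
            break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }   /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }   /* step in y */
    }
}

/* Example callback: counts pels and remembers the last one. */
typedef struct { int count, last_x, last_y; } PelLog;

static void log_pel(int x, int y, void *ctx)
{
    PelLog *log = ctx;
    log->count++;
    log->last_x = x;
    log->last_y = y;
}
```

On a chip with a frame-buffer aperture, the callback would write the pel directly into the frame buffer instead of driving the drawing engine.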

The PMChart and Pulse programs in the OS/2 Productivity folder provide excellent test cases for the various types of line-drawing that the driver is required to do. In addition, the highlighted icons on the presentation desktop are a very good test for styled lines.

Scanlines

Scanlines are processed by eddh_PMSCANLINE, in EDDHSCAN.ASM, which is the work routine for eddl_PolyScanLine, in EDDSPOLY.C. Scanlines are used internally by the fill code in GRE. They are used any time an area needs to be filled. The eddh_PMSCANLINE function is given an array of scanline endpoints and an array of clipping rectangles. It takes the array of endpoints and draws horizontal lines between them with the given fill pattern.

There are three cases for fills: a monochrome-dithered pattern, a color (2x2) dithered pattern, and a user-supplied monochrome bit map that is used as a pattern. The monochrome and color-dithered cases are handled by the main body of eddh_PMSCANLINE. In either case, a monochrome pattern of alternating 1's and 0's is aligned to the destination. In the case of a monochrome pattern, the foreground and background colors are set up in Shadow8514Regs. In the case of a color-dithered pattern, the colors are pulled from the pattern bit map that is passed in. Note that color-dithered patterns exist only at 8 bits-per-pel and less. There is no reason to dither at 16 bits-per-pel and more. Bit-map patterns are handled by the routine UserBmapScan, which is also in EDDHSCAN.ASM. This takes an arbitrary bit map and uses it as the pattern source for the fill.
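Aligning a monochrome pattern to the destination means that the pattern bit used for a pel depends only on the pel's destination coordinates, not on where the scanline starts. The following sketch fills one scanline of an 8-bit-per-pel bit map that way. The 8 x 8 pattern size, the anchoring at the destination origin, and the fill_scanline helper are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill pels x0..x1 (inclusive) of scanline y from an 8 x 8 monochrome
 * pattern anchored at the destination origin. Bit 7 of each pattern
 * byte is the left-most pel. A 1 bit selects fg, a 0 bit selects bg. */
static void fill_scanline(uint8_t *line, int x0, int x1, int y,
                          const uint8_t pattern[8], uint8_t fg, uint8_t bg)
{
    assert(line != NULL && pattern != NULL);
    assert(x0 >= 0 && x0 <= x1 && y >= 0);

    uint8_t row = pattern[y & 7];
    for (int x = x0; x <= x1; x++)
        line[x] = ((row >> (7 - (x & 7))) & 1) ? fg : bg;
}
```

Because the bit index is derived from x and y themselves, two scanline segments that meet at any x coordinate continue the same checkerboard without a seam.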

Image Data

The image data function is an obscure driver entry point. It takes a single scanline of monochrome data, and expands it to the display using the current color and mix. This is supported by eddb_ImageData, in EDDBIMAG.C, which in turn calls eddh_PMImageData, in EDDHIMAG.ASM. This function takes a supplied monochrome bit map, and blt's a single scanline to the display. This function is rarely used. (The OS/2 Tune Editor makes extensive use of image data.)

Cursor

Presentation display drivers must support both monochrome and color cursors. Support for monochrome and color cursors is handled in three files in the S3 driver: EDDMCURS.C, EDDMCCRS.C, and EDDCURSR.ASM. EDDMCURS.C contains the eddm_DeviceSetCursor function. (The eddm_SetColourCursor routine in the EDDMCCRS.C file is similar to eddm_DeviceSetCursor.) The eddm_DeviceSetCursor function sets the shape of a monochrome cursor, and then enables the cursor. The bulk of the function is involved in converting the input "AND" and "XOR" masks into a format suitable for an S3 chip. The eddm_DeviceSetCursor function is passed a bit map. The AND mask is located first in the bit map. The XOR mask immediately follows the AND mask. When the masks are formatted for the hardware, they must be copied to off-screen memory, or in the case of an S3 chip with a Brooktree DAC**, copied to the DAC. When this has been completed, SetSpriteShape in EDDCURSR.ASM is called. This routine will either enable or disable the hardware cursor, depending on the state of cursor_data.cd_draw_sprite_flags.

The data structure for the cursor_data is located in CURSOR.H. It is a large structure, and it holds information such as the color depth the driver is currently running, the dimensions of the display, and so forth. Both of these can be found in the OS/2 Presentation Device Driver Reference. Every presentation display driver is required to export a MoveCursorBlock table of the format:

typedef struct _MCDESCRIPTION {
        PVOID pMoveCursor;      //pointer to the move cursor routine
        ULONG ulCodeLength;     //length of move cursor code
        PVOID pCursorData;      //pointer to data used by move cursor
        ULONG ulDataLength;     //length of move cursor data
} MCDESCRIPTION;

This structure is export ordinal 103. The following exports are located in 8514_32.def:

EXPORTS
        movecursorblock                   @103
        OS2_PM_DRV_ENABLE                 @200
        OS2_PM_DRV_QUERYSCREENRESOLUTIONS @202
        SEAMLESSTERMINATE                 @350
        SEAMLESSINITIALIZE                @351
        OS2_PM_DRV_RING_LEVELS            @999
        OS2_PM_DRV_ENABLE_LEVELS          @998

movecursorblock is located at the beginning of EDDCURSR.ASM:

movecursorblock equ this byte
        dd      OFFSET FLAT:_start_of_cursor_code
        dd      _end_of_cursor_code - _start_of_cursor_code
        dd      OFFSET FLAT:_start_of_cursor_data
        dd      _end_of_cursor_data - _start_of_cursor_data
_DATA           ends


_CURSOR         segment dword use32 public 'CODE'
                assume  cs:FLAT, ds:FLAT, es:FLAT

public  _start_of_cursor_code
_start_of_cursor_code:


;***********************************************************************
; This is the start of the 32-bit move cursor routine
; The interface is:
;       The stack holds 3 dword parameters:
;               The x and y coordinates of the cursor
;               and a pointer to locked global memory containing the
;               cursor data
;
; The internal entry point is used when calling from within the driver.
; The real entry point is the one that will be called at interrupt time.
;***********************************************************************

MoveCursor32:

In this code example, movecursorblock is the MoveCursorBlock structure. MoveCursor32 is exported in this manner because both its code and the data it uses will be accessed from contexts in which the linear addresses for the rest of the code and data in the driver are not defined. The MoveCursorBlock allows the caller of MoveCursor32 to create a linear address for the cursor code and data. For this reason, MoveCursor32, or any of the routines called by it, cannot access data outside the range of _start_of_cursor_data to _end_of_cursor_data. Between these labels is the cursor_data structure. To access some data specific to your hardware in MoveCursor32, add it to the cursor_data structure.

MoveCursor32 receives either the new coordinates of the cursor, in which case it moves the cursor, or, if the coordinates are 0x8000, a check-cursor call. Check cursor is a way of periodically checking to determine if the cursor needs to be updated. For example, a MoveCursor request might fail because the hardware is currently in use. In that case, CheckCursor allows the driver to update the cursor position as soon as the hardware is no longer in use.
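The deferred-update behavior described above can be sketched in C. This is only an illustration under stated assumptions, not the driver's code: the names CURSOR_CHECK_COORD, CursorState, and cursor_call are hypothetical, the sentinel 0x8000 is assumed to arrive in the X coordinate, and the hw_busy argument stands in for whatever test the driver uses to decide that the hardware is in use.

```c
#include <stdint.h>

/* Sentinel value described in the text; the macro name is hypothetical. */
#define CURSOR_CHECK_COORD 0x8000u

/* Hypothetical bookkeeping for a deferred cursor move. */
typedef struct {
    int cur_x, cur_y;    /* position currently shown on the display  */
    int pend_x, pend_y;  /* most recently requested position         */
    int pending;         /* nonzero while a requested move is unmet  */
} CursorState;

/*
 * Handle one call. A normal call records the requested position;
 * a check-cursor call (x == CURSOR_CHECK_COORD, an assumption) only
 * retries a pending move. Returns 1 if the displayed position changed.
 */
int cursor_call(CursorState *cs, uint32_t x, uint32_t y, int hw_busy)
{
    if (x != CURSOR_CHECK_COORD) {
        cs->pend_x  = (int)x;
        cs->pend_y  = (int)y;
        cs->pending = 1;
    }
    if (!cs->pending || hw_busy) {
        return 0;            /* nothing to do, or must wait for a later check */
    }
    cs->cur_x   = cs->pend_x;
    cs->cur_y   = cs->pend_y;
    cs->pending = 0;
    return 1;
}
```

A move issued while hw_busy is nonzero leaves pending set, and a later check-cursor call with hw_busy clear completes it.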

There are two possibilities for the S3 driver, either a hardware cursor is in use, or a software cursor is in use. In the case of a hardware cursor, the function DrawSprite is called. DrawSprite calls either OutputSprite or BTOutputSprite depending on whether the S3 driver's internal hardware cursor is used, or the Brooktree DAC's hardware cursor is used.

Software cursors are slightly more complex. First, MoveCursor32 calls CheckXRegion, which checks to see if the cursor falls in the current exclusion region. If it does, then the cursor is not drawn. Otherwise, the routine software_cursor is called. The software_cursor routine calls internal_remove_software_cursor, and then draw_pointer. Moving a software cursor is accomplished as follows:

  1. Copy the save area to the position the cursor currently occupies. This erases the cursor. _internal_remove_software_cursor performs this function.
  2. Copy the new pointer position to the save area.
  3. XOR the XOR mask from off-screen VRAM to the cursor position.
  4. AND the AND mask from off-screen VRAM to the cursor position.
  5. OR in the color bit map from off-screen VRAM to the cursor position.

If a monochrome cursor must be handled in software for some reason, then the AND mask would be applied first, followed by the XOR mask. (In a monochrome cursor there is no color bit map.) The remainder of the complexity lies in handling cases including:

Hot spot moves to a negative coordinate.
Cursor is clipped on top.
Cursor is clipped on the bottom.
Cursor is clipped on the left or right side.
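The mask ordering described above for color and monochrome software cursors can be sketched per pel in C. This is a sketch under assumptions: the function names are hypothetical, the pels are 8 bits wide, and each mask value has already been expanded to a full pel (all ones or all zeros). The real driver performs these steps as blts from off-screen VRAM rather than per-pel arithmetic in system memory.

```c
#include <stdint.h>

/*
 * Color cursor, in the order listed in the text:
 * XOR the XOR mask, AND the AND mask, then OR in the color bit map.
 */
uint8_t apply_color_cursor(uint8_t screen, uint8_t xor_mask,
                           uint8_t and_mask, uint8_t color)
{
    uint8_t pel = screen;
    pel ^= xor_mask;   /* step 3: XOR mask      */
    pel &= and_mask;   /* step 4: AND mask      */
    pel |= color;      /* step 5: color bit map */
    return pel;
}

/*
 * Monochrome cursor handled in software:
 * the AND mask is applied first, followed by the XOR mask.
 */
uint8_t apply_mono_cursor(uint8_t screen, uint8_t and_mask, uint8_t xor_mask)
{
    return (uint8_t)((screen & and_mask) ^ xor_mask);
}
```

With the monochrome ordering, an AND of all ones and an XOR of zero leaves the screen pel untouched, and an AND of all ones with an XOR of all ones inverts it.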

Hardware Scissoring

Most of the routines in the driver do not use the S3 hardware scissoring capability. All of the routines assume, however, that the scissors are initially set so that all memory is accessible. Therefore, any routine that uses hardware scissoring should re-open the scissors when it has finished with them.

Multimedia hooks

eddq_Escape - s3qesc.c (EDDQESC.C for XGA)

The eddq_Escape() function processes various escape functions supported by the S3 driver. One of the main uses for these escape functions is the support for full-motion video by OS/2 Multimedia. Full-motion video works by drawing directly into the frame buffer of the video device. In order to do this, MMPM/2 must obtain a pointer to the aperture of the video adapter. It then informs the driver that it is going to draw directly into a rectangular region on the display. If the aperture is smaller than the size of the memory on the video adapter, then MMPM/2 might have to switch banks while it is drawing. Finally, when it is finished, it must inform the presentation display driver that it has finished drawing onto the display. The escape functions that are required to support multimedia are:

DEVESC_GETAPERTURE 33000l

DEVESC_GETAPERTURE is handled by the function GetAperture(PULONG pcOutCount, PAPERTURE pAperture).

This function returns the following structure (located in eddtypet.h):

typedef struct _APERTURE {  /* aperture */
   ULONG ulPhysAddr;
   ULONG ulApertureSize;
   ULONG ulScanLineSize;
   RECTL rctlScreen;
} APERTURE;

The ulPhysAddr data field must be set to the physical address of the aperture in memory. In the case of the S3 driver, the aperture can be the 64KB aperture at A0000, a 1MB aperture located in the 24-bit AT-bus address space, or a 4MB aperture located in the 32-bit VL-bus address space. GetAperture() calls FindS3Aperture(). FindS3Aperture searches for the aperture in a number of ways.

First, it searches the environment to see if it can find a string matching "VIDEO_APERTURE". If so, it uses the value obtained as the base address of the aperture.

If it does not find an environment variable, it looks at the size of system memory, and several S3 registers in an attempt to find the address of the aperture. It then tests the aperture, and if the test fails, it falls back to the 64KB A0000 aperture.

The ulApertureSize field must be set to the size of the aperture, in bytes. The ulScanLineSize field must be set to the width of a scanline, in bytes. The rctlScreen field must be set to the minimum and maximum coordinates of the visible display. For example, at 1024 x 768:

rctlScreen.xLeft = 0;
rctlScreen.yTop = 0;
rctlScreen.xRight = 1023;
rctlScreen.yBottom = 767;
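Filling in the whole APERTURE structure can be sketched as follows. This is a minimal sketch, not the driver's GetAperture(): the ULONG, LONG, and RECTL typedefs are stand-ins for the OS/2 definitions so that the fragment compiles on its own, and fill_aperture with its parameters is a hypothetical helper.

```c
#include <stdint.h>

/* Stand-ins for the OS/2 types (assumptions for a self-contained sketch). */
typedef uint32_t ULONG;
typedef int32_t  LONG;
typedef struct { LONG xLeft, yBottom, xRight, yTop; } RECTL;

typedef struct _APERTURE {
    ULONG ulPhysAddr;
    ULONG ulApertureSize;
    ULONG ulScanLineSize;
    RECTL rctlScreen;
} APERTURE;

/* Hypothetical helper: describe an aperture for a given visible mode. */
void fill_aperture(APERTURE *ap, ULONG phys_addr, ULONG aperture_bytes,
                   ULONG scanline_bytes, ULONG vis_width, ULONG vis_height)
{
    ap->ulPhysAddr     = phys_addr;       /* physical address of aperture */
    ap->ulApertureSize = aperture_bytes;  /* size in bytes                */
    ap->ulScanLineSize = scanline_bytes;  /* scanline width in bytes      */

    /* Inclusive minimum and maximum visible coordinates. */
    ap->rctlScreen.xLeft   = 0;
    ap->rctlScreen.yTop    = 0;
    ap->rctlScreen.xRight  = (LONG)vis_width  - 1;
    ap->rctlScreen.yBottom = (LONG)vis_height - 1;
}
```

For the 64KB aperture at A0000 in a 1024 x 768 mode with 1024-byte scanlines, the call would be fill_aperture(&ap, 0xA0000, 64 * 1024, 1024, 1024, 768).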

DEVESC_AQUIREFB 33010l

This function is used by MMPM/2 to obtain exclusive access to the frame buffer so that it may draw directly upon its surface. It is handled by the function AquireFB(ULONG cInCount, PAQUIREFB pAquireFB).

The input structure PAQUIREFB is defined as follows:

typedef struct _ACQUIREFB {  /* acquirefb */
   ULONG fAFBFlags;
   ULONG ulBankNumber;
   RECTL rctlXRegion;
} ACQUIREFB;
typedef ACQUIREFB *PACQUIREFB;

The fAFBFlags field is equal to 1 if the driver is to switch banks prior to giving access to the frame buffer. It equals 0 if no bank switch is needed. The ulBankNumber field is the bank number to switch to, if bit 0 of fAFBFlags is equal to 1. The rctlXRegion field is the area of the display being touched by MMPM/2, so that the cursor can be excluded from it by the presentation display driver.
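How a driver might interpret these fields can be sketched in C. The sketch makes several assumptions: the typedefs are stand-ins for the OS/2 types, AFB_SWITCH_BANK is a hypothetical name for bit 0 of fAFBFlags, both helper functions are hypothetical, and the exclusion test assumes screen coordinates with yTop less than or equal to yBottom, as in the rctlScreen example earlier in this section.

```c
#include <stdint.h>

/* Stand-ins for the OS/2 types (assumptions for a self-contained sketch). */
typedef uint32_t ULONG;
typedef int32_t  LONG;
typedef struct { LONG xLeft, yBottom, xRight, yTop; } RECTL;

typedef struct _ACQUIREFB {
    ULONG fAFBFlags;
    ULONG ulBankNumber;
    RECTL rctlXRegion;
} ACQUIREFB;

#define AFB_SWITCH_BANK 0x1u   /* hypothetical name for bit 0 */

/* Returns 1 and stores the bank number if a bank switch is requested. */
int acquire_wants_bank(const ACQUIREFB *req, ULONG *bank)
{
    if (req->fAFBFlags & AFB_SWITCH_BANK) {
        *bank = req->ulBankNumber;
        return 1;
    }
    return 0;
}

/* Returns 1 if a w-by-h cursor at (cx, cy) overlaps the exclusion region. */
int cursor_in_xregion(const RECTL *xr, LONG cx, LONG cy, LONG w, LONG h)
{
    return cx <= xr->xRight && cx + w - 1 >= xr->xLeft &&
           cy <= xr->yBottom && cy + h - 1 >= xr->yTop;
}
```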

First, AquireFB() gets the driver semaphore. Then, it saves the address of the aperture being used by the presentation display driver. Next, it sets the bank, if that is needed. It then performs cursor exclusion if a software cursor is in use. Finally, it sets the location of the aperture, and enables it. (Some S3 chips, notably the older 86C801 and 86C805 chips, allow either the drawing engine or the frame buffer to be active, but not both simultaneously. Therefore, code that uses the frame buffer must first enable it, and then disable it when finished.) The aperture may be moved because the S3 driver attempts to use a 1MB, 2MB, or 4MB aperture for full-motion video to avoid the need for bank selects, thus improving performance.

DEVESC_DEAQUIREFB 33020l

This escape is handled by DeaquireFB(), which disables the frame buffer used by MMPM/2, restores the old frame buffer location, and then de-excludes the software cursor, if needed. Finally, the driver semaphore is freed.

DEVESC_SWITCHBANK 33030l

This escape is handled by SwitchBank(ULONG cInCount, PULONG pInData). The pInData parameter is a pointer to a bank number. SwitchBank validates that the requested bank number is reasonable, saves it in the cursor_data structure to be restored by the cursor interrupt handler, and then sets the bank.
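The bank arithmetic behind this escape can be sketched in C. This is an illustration only: it assumes that banks are aperture-sized windows laid end to end over video memory, and the helper names are hypothetical. The driver's own validity test is not shown in this document.

```c
#include <stdint.h>

/* Number of banks needed to cover video memory (assumes aperture-sized banks). */
uint32_t bank_count(uint32_t vram_bytes, uint32_t aperture_bytes)
{
    if (aperture_bytes == 0) {
        return 0;
    }
    return (vram_bytes + aperture_bytes - 1) / aperture_bytes;
}

/* Bank that contains the first byte of the pel at (x, y). */
uint32_t bank_for_pel(uint32_t x, uint32_t y, uint32_t scanline_bytes,
                      uint32_t bytes_per_pel, uint32_t aperture_bytes)
{
    uint32_t offset = y * scanline_bytes + x * bytes_per_pel;
    return offset / aperture_bytes;
}

/* A "reasonable" bank number is one that falls inside video memory. */
int bank_is_valid(uint32_t bank, uint32_t vram_bytes, uint32_t aperture_bytes)
{
    return bank < bank_count(vram_bytes, aperture_bytes);
}
```

With 1MB of video memory and the 64KB aperture there are 16 banks, and at 1024 bytes per scanline each bank covers 64 scanlines, which is why a larger aperture avoids bank selects during full-motion video.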

Death and Resurrection

The EDDMDEAD.C file contains the functions eddm_Death and eddm_Resurrection, which are used when switching to and from full-screen Windows**, DOS, or OS/2 sessions. The eddm_Death function handles the switch of the presentation display driver into the background. It switches the display to text mode, sets the drawing mode to software-drawing mode, and disables the bit map cache. The eddm_Resurrection function performs the inverse task. It switches the driver back into graphics mode, reloads the palette if needed, and invalidates the font cache.

Off-Screen VRAM Allocation - CACHEMAN.C

Unlike the XGA driver, which allocated off-screen VRAM in the EDDEVRAM.C file, or the 8514/A driver, which used a fixed-allocation strategy, the S3 driver allocates off-screen VRAM in CACHEMAN.C. The CACHEMAN.C file contains two key tables, the HWMAP and the CACHEMAP. The following are the data structures for each:

typedef struct _HWMAP { /* hwm */
        ULONG   vis_width;
        ULONG   vis_height;
        ULONG   phys_width;
        ULONG   phys_height;
        ULONG   phys_memory;
        ULONG   BitCount;
        ULONG   hw_cursor;
        ULONG   hw_cursor_Y1024;
        ULONG   and_mask;
        ULONG   xor_mask;
        ULONG   save_area;
        ULONG   color_cursor;
        ULONG   start_bm_cache;
        ULONG   num_font_planes;
        ULONG   vertical_bmaps;
} HWMAP;

typedef HWMAP  *PHWMAP;

typedef struct _CACHEMAP { /* cm */
        ULONG   max_bitmaps;
        ULONG   font_cache_start;
        ULONG   font_cache_left;
        ULONG   font_cache_top;
        ULONG   font_cache_right;
        ULONG   font_cache_bottom;
        ULONG   color_dither;
        ULONG   mono_dither;
} CACHEMAP;

The HWMAP contains information about the resolution, including the width of the display in bytes, locations of the hardware and software cursors, location of the bit-map cache, and the amount of memory required for the video mode. The CACHEMAP table has two entries for every entry in the HWMAP: one for normal driver operation, and one for seamless windows. When seamless windows is active, the size of the font cache is reduced. During initialization of the driver, in the function QueryAndSelectNativeMode, the resolution the user has requested is compared against the entries in the HWMAP. If it is obtainable, then the function CacheManager (in cacheman.c) is called. CacheManager takes the index into the HWMAP, as set up by QueryAndSelectNativeMode, and sets the global pointer pHWMap to point to the corresponding entry in the HWMAP. Likewise, this is done for pCacheMap, taking into account whether seamless windows is currently active. The entries in CACHEMAP and HWMAP that begin with 0xF0 are X-Y coordinates in VRAM concatenated together. (Concatenation is the usual method of addressing VRAM in this driver.)
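The concatenated X-Y form of a VRAM address can be sketched in C. The marker test below mirrors the check on the top four bits (0f0000000h) that appears in the EDDHBBLT.ASM excerpt later in this chapter. Everything else is an assumption made for illustration: the function names are hypothetical, and the layout chosen here (Y in bits 16 through 27, X in bits 0 through 15) is a guess, not the driver's documented format.

```c
#include <stdint.h>

/* Top-nibble marker that flags a value as a VRAM X-Y address. */
#define VRAM_MARKER 0xF0000000u

/* Returns 1 if the value carries the VRAM marker. */
int is_vram_address(uint32_t addr)
{
    return (addr & VRAM_MARKER) == VRAM_MARKER;
}

/* Assumed layout: marker | Y (12 bits) | X (16 bits). */
uint32_t make_vram_xy(uint32_t x, uint32_t y)
{
    return VRAM_MARKER | ((y & 0x0FFFu) << 16) | (x & 0xFFFFu);
}

uint32_t vram_x(uint32_t addr) { return addr & 0xFFFFu; }
uint32_t vram_y(uint32_t addr) { return (addr >> 16) & 0x0FFFu; }
```

Under this assumed layout, a cache slot at X = 100 on scanline 768 (the first off-screen line of a 1024 x 768 mode) would be stored as 0xF3000064.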

HWACCESS.ASM

The routines in HWACCESS.ASM are among the most heavily used code in the driver. They are responsible for setting up key hardware values, copying data to or from off-screen VRAM, and resetting the hardware. The purpose of this section is to describe each routine, and where and how it is used. When adapting the S3 driver to another chip set, it is likely that one or more of these routines will have to be modified. Entirely new routines might need to be created to augment or even replace some of the ones mentioned.

TransferShadowRegisters

TransferShadowRegisters takes a single argument (which is a set of flags indicating the registers to copy) and is used throughout the driver to copy various values from the Shadow8514Regs to the graphics hardware. It can initiate drawing commands in the XGA driver; in the S3 driver, it cannot. In the S3 driver, the only function flag used is TSR_COLOUR_MIX, which sets the colors, hardware mix, monochrome expansion, and the color compare.

CopyMemoryToVRAM

CopyMemoryToVRAM takes the following parameters:

pVOID pSystemMemory - Pointer to the system-memory bit map to copy.
ULONG pVRAMAddress - Pointer to the destination in VRAM.
ULONG width - Width in pels.
ULONG height - Height in scanlines.
ULONG format - Flags indicating the pel depth of pSystemMemory, can be 1, 8, 16, or 24 bits-per-pel.

This function copies a bit map from system memory to VRAM. It is used by EDDMCURS.C and EDDMCCRS.C to copy the cursor image to off-screen VRAM. It is also used by EDDNCACH.C, to copy bit maps into the cache.

CopyVRAMToMemory

This function copies a bit map from VRAM to system memory and performs the exact opposite function of CopyMemoryToVRAM. It is used by PIXBLT.C, as well as CORVBITM.C, EDDSCNLR.C, and EDDBSETP.C, when it is necessary to read memory.

pVOID pSystemMemory - Pointer to the system-memory bit map to copy.
ULONG pVRAMAddress - Pointer to the source in VRAM.
ULONG width - Width in pels.
ULONG height - Height in scanlines.
ULONG format - Flags indicating the pel depth of pSystemMemory, can be 1, 8, 16, or 24 bits-per-pel.

CopyDestToMemory

Performs exactly the same function as CopyVRAMToMemory, and consists of a jump into CopyVRAMToMemory. CopyDestToMemory is used in PIXBLT.C.

pVOID pSystemMemory - Pointer to the system-memory bit map to copy.
ULONG pVRAMAddress - Pointer to the source in VRAM.
ULONG width - Width in pels.
ULONG height - Height in scanlines.
ULONG format - Flags indicating the pel depth of pSystemMemory, can be 1, 8, 16, or 24 bits-per-pel.

CopyMaskToVRAM

CopyMaskToVRAM is used by the cursor code to copy a 1-bit-per-pel mask into VRAM and uses the following parameters:

pVOID pSystemMemory - Pointer to the system-memory bit map to copy.
ULONG pVRAMAddress - Pointer to the destination in VRAM.
ULONG width - Width in pels.
ULONG height - Height in scanlines.
ULONG format - Flags indicating the pel depth of pSystemMemory. The pel depth must be 1 bit-per-pel.

KlugeReset

KlugeReset takes no parameters. It is used to reset the drawing engine and to clear the screen.

eddf_MESS

The following code from EDDHBBLT.ASM is similar to code found throughout the driver.

; ss:esp now points to the destination coordinates: pop these straight into
; the hardware
        pop     eax
        memregwrite     dest_map, eax

; write the pixel op to kick off the blt
        memregwrite     pixel_op, edx

IFDEF HARD_DRAW

; see if the source is vram or not by checking vram marker
        IFDEF   _8514
        push    ebx
        mov     ebx, AIxfer.pbmhSrc
        mov     eax, [ebx].bitmap_address
        and     eax, 0f0000000h
        cmp     eax, 0f0000000h
        pop     ebx
        jz      short @f
        call    SoftSrcSpecialist
        jmp     short soft_done

@@:     call    SrcSpecialist

soft_done:
        ENDIF   ;_8514

ELSE

        saveregs
        call    _eddf_MESS
        restoreregs

ENDIF

The following code from EDDHLNE.ASM shows the procedure for Soft-Draw mode. This code sets up the line parameters and then calls eddf_MESS.

ELSE ;SOFT_DRAW

IFDEF _8514

        mov             eax,_SPad.ptsNewStart
        memregwrite     dest_map,eax

        mov             ax,_SPad.sK1
        memregwrite     sr_K1,ax

        mov             ax,_SPad.sK2
        memregwrite     sr_K2,ax

        mov             ax,_SPad.usErrorTerm
        memregwrite     Err_Term,ax

        mov             ax,_SPad.usMajorAxis
        memregwrite     dim1,ax

ENDIF ;_8514

        saveregs
        call    _eddf_MESS
        restoreregs
ENDIF

The S3 driver draws to bit maps by having eddf_MESS emulate certain XGA operations in software. The eddf_MESS routine interprets the Shadow8514Regs and, based on the parameters therein and the pel operation, calls one of a number of routines to perform the requested drawing operation. For the most part, eddf_MESS is a collection of jump tables to routines that do the work. Each case and each pel depth is broken out into a special-cased routine. To some extent, this explains the size of the code. Individually, the routines are not complex, although many of them are implemented as macros. The following is a list of the files that comprise the MESS, and the functions they perform.

Line Drawing:

The bulk of the code in the following functions is implemented as macro calls.

FFLINES.ASM

 eddf_MESSDoBresenham
 eddf_MESSDoShortLines

Destination-Only BitBlts and Rectangular Fill Operations:

The following routines fill the destination with a solid color:

FFBLTD.ASM

 eddf_MESSBlockFill24
 eddf_MESSBlockFill16
 eddf_MESSBlockFill81
 eddf_MESSBlockFill4111

The following routines fill with a solid color, and also a mix:

 eddf_MESSFixedSrc16
 eddf_MESSFixedSrc81
 eddf_MESSFixedSrc24
 eddf_MESSFixedSrc4111

Pattern Blts:

The following routines copy a general pattern to the destination with a mix. This is a pattern with a monochrome bit map. (This is a different operation from the monochrome expansions used for text output.)

FFBLTPD.ASM

 eddf_MESSPatDest16
 eddf_MESSPatDest24
 eddf_MESSPatDest81
 eddf_MESSPatDest41
 eddf_MESSPatDest11
 eddf_MESSPatDestMixes11

The following routines handle pattern copy of general patterns:

 eddf_MESSPatCopy16
 eddf_MESSPatCopy24
 eddf_MESSPatCopy81
 eddf_MESSPatCopy41
 eddf_MESSPatCopy11

The following routines handle 8-bit-wide patterns with a mix:

 eddf_MESSPatDestByte24
 eddf_MESSPatDestByte16
 eddf_MESSPatDestByte81
 eddf_MESSPatDestByte41

The following routines handle pattern copy of 8-bit-wide patterns:

 eddf_MESSPatCopyByte16
 eddf_MESSPatCopyByte24
 eddf_MESSPatCopyByte81
 eddf_MESSPatCopyByte41
 eddf_MESSPatCopyByte11

Monochrome Expansion Blts

These BitBlts are used when drawing text in the software-drawing code. The principal difference between software-drawing mode and hardware-drawing mode routines in FFBLTPD.ASM is that text blts are from bottom-left to top-right, while pattern blts are from top-left to bottom-right. These routines are also optimized for text.

The following routines handle monochrome expansion with both a foreground and background mix:

FFBLTPX.ASM

 eddf_MESSPatExp16
 eddf_MESSPatExp81
 eddf_MESSPatExp24
 eddf_MESSPatExp41
 eddf_MESSPatExp11

The following routines handle monochrome expansion with a foreground mix and transparent background:

 eddf_MESSPatExpFg16
 eddf_MESSPatExpFg81
 eddf_MESSPatExpFg24
 eddf_MESSPatExpFg41
 eddf_MESSPatExpFg11

Setup:

The following routines handle setup for source and destination software BitBlts:

FFBLTSET.ASM

 eddf_MESSSetUpSrcDest16
 eddf_MESSSetUpSrcDest81
 eddf_MESSSetUpSrcDest41
 eddf_MESSSetUpSrcDest11

The following routine handles setup for non-wrapping pattern setup (such as text):

eddf_MESSSetUpPatDest11

The following routines handle setup for destination-only blts:

 eddf_MESSSetUpDest16
 eddf_MESSSetUpDest81
 eddf_MESSSetUpDest24
 eddf_MESSSetUpDest41
 eddf_MESSSetUpDest11

The following routine handles common setup for 4-bit-per-pel pattern blts:

eddf_MESSSetUpDestForPat41

Source and Destination Blts:

The following routines make a copy of the source to the destination:

FFBLTSD.ASM

eddf_MESSBlockCopy4111
eddf_MESSBlockCopy81
eddf_MESSBlockCopy24
eddf_MESSBlockCopy16

The following routines copy the source to the destination while applying a mix:

eddf_MESSSrcDestMix4111
eddf_MESSSrcDestMix81
eddf_MESSSrcDestMix24
eddf_MESSSrcDestMix16

The following routines copy the source to the destination, tiling it to fill the destination:

eddf_MESSSrcDestWrapCopy41
eddf_MESSSrcDestWrapCopy81
eddf_MESSSrcDestWrapCopy24
eddf_MESSSrcDestWrapCopy16

The following routines copy the source to the destination with a mix, tiling it to fill the destination:

eddf_MESSSrcDestWrapMix41
eddf_MESSSrcDestWrapMix81
eddf_MESSSrcDestWrapMix24
eddf_MESSSrcDestWrapMix16
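The jump-table dispatch that eddf_MESS performs over routines such as those listed above can be sketched in C with a table of function pointers. This is a sketch under assumptions: the real dispatcher is assembler code in EDDFFAST.ASM, and every name below (PelOp, Depth, mess_dispatch, and the stand-in routines, which only return an identifying number) is hypothetical.

```c
/* Hypothetical operation and pel-depth indexes for the table. */
typedef enum { OP_BLOCK_FILL, OP_BLOCK_COPY, OP_COUNT } PelOp;
typedef enum { DEPTH_8, DEPTH_16, DEPTH_24, DEPTH_COUNT } Depth;

typedef int (*MessRoutine)(void);

/* Stand-ins for the special-cased work routines. */
static int fill8(void)  { return 108; }
static int fill16(void) { return 116; }
static int fill24(void) { return 124; }
static int copy8(void)  { return 208; }
static int copy16(void) { return 216; }
static int copy24(void) { return 224; }

/* One row per operation, one column per pel depth. */
static const MessRoutine mess_table[OP_COUNT][DEPTH_COUNT] = {
    { fill8, fill16, fill24 },   /* OP_BLOCK_FILL */
    { copy8, copy16, copy24 },   /* OP_BLOCK_COPY */
};

/* Select and call the routine for this operation and depth; -1 if none. */
int mess_dispatch(PelOp op, Depth depth)
{
    if ((unsigned)op >= OP_COUNT || (unsigned)depth >= DEPTH_COUNT) {
        return -1;
    }
    return mess_table[op][depth]();
}
```

Breaking each combination out into its own routine keeps the individual routines simple, at the cost of a large number of them.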

16-Bit-Per-Pel Support

With the exception of the S3 911 and 924 chips, the S3 chips directly support 16-bit-per-pel drawing operations in hardware. As a result, there is little that needs to be changed in the driver code to support 16-bit-per-pel modes on the various S3 chips. This will most likely be true on most graphics chips. There are, however, two issues that can arise: byte ordering and the location of the red, green, and blue components of the pel. The S3 driver uses a 5,6,5 pel for 16 bits-per-pel. (Green gets the extra bit.) This is exactly the same mode as the XGA. If your adapter uses a different byte ordering, or runs 5,5,6 or 5,5,5, you will have to modify the functions that convert bit maps from one format to another, and also change the code that represents 16-bit color in eddf_MESS. The files that handle bit-map conversions are the following:

CONVBITM.C
CONVFUNS.C
CONVFUNS.H
CONVINT.C
CONVEXT.C
CONVERT.ASM

Of these, the bulk of the changes will be made in CONVFUNS.H, which is a package of macros for setting pels in memory bit maps. To swap the pels in CONVFUNS.H, the following macros need to be altered (the changes are in the ifdef SWAP16BPP sections):

   /**********************************************************************/
   /* Macro to convert to a 16 bit value in Intel format.                */
   /*                                                                    */
   /* Because XGA bit maps in system memory are stored in Motorola       */
   /* format, this requires swapping the low and high bytes.             */
   /**********************************************************************/

#ifdef SWAP16BPP
#define XGA_EnsureIntel16bpp(word)  (word)
#else
#define XGA_EnsureIntel16bpp(word)                                     \
    ((((word) & 0xFF00) >> 8) | (((word) & 0x00FF) << 8))
#endif


   /**********************************************************************/
   /* Macro to write a 16bpp value into the internal bit map, and to     */
   /* increment the internal bit map pointer to the next 16bpp pel.      */
   /* This must be done in Motorola format (high byte, low byte).        */
   /**********************************************************************/

#ifdef SWAP16BPP
#define XGA_SetInternal16BppPel(value)                                 \
    *((PBYTE)SPad.pbTrgPointer)++ = ((value) & 0xff);                  \
    *((PBYTE)SPad.pbTrgPointer)++ = ((value) >> 8);
#else
#define XGA_SetInternal16BppPel(value)                                 \
    *((PRGB16)SPad.pbTrgPointer)++ = value;
#endif

In addition, to change the R, G, and B components of the 16-bit pel, alter the macro XGA_RGB16FromPRGB2, which is also in CONVFUNS.H. Next, alter the function ConvertExt8ToInt16, which is in CONVERT.ASM. In that function, there are several xchg al, ah instructions. Remove them. Likewise, in ConvertExt24ToInt16, you will see a ror ax, 8 instruction. Remove it as well.

Finally, make a change to EDDFFAST.ASM, which holds the eddf_MESS function. In it, near the top of the file, you will see code similar to the following:

        mov     ax, word ptr Shadow8514Regs.Color_1
        xchg    ah, al
        mov     word ptr Shadow8514Regs.Color_1, ax
        mov     ax, word ptr Shadow8514Regs.Color_0
        xchg    ah, al
        mov     word ptr Shadow8514Regs.Color_0, ax
        mov     ax, word ptr Shadow8514Regs.Color_Comp
        xchg    ah, al
        mov     word ptr Shadow8514Regs.Color_Comp, ax

Disable this code. It appears in two places, near the top of eddf_MESS and immediately after the label exitMESS, and it must be disabled in both.

The modules listed above, along with EDDBCREA.C, are useful to study if you are interested in the creation and conversion of bit maps in the driver.
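The pel layouts and the byte swap discussed in this section can be sketched in C. This is a sketch under assumptions, not the contents of CONVFUNS.H: the function names are hypothetical, and red is assumed to occupy the most significant bits of the pel in both layouts.

```c
#include <stdint.h>

/* 5,6,5 pel: green gets the extra bit (red assumed in the high bits). */
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* 5,5,5 pel: the top bit is unused (red assumed in the high bits). */
uint16_t pack_rgb555(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3));
}

/* Exchange the high and low bytes, as xchg al, ah does in the driver. */
uint16_t swap16(uint16_t w)
{
    return (uint16_t)((w >> 8) | (w << 8));
}
```

An adapter that differs from the XGA only in byte ordering needs the swap16 step changed. An adapter that differs in component layout needs the packing step changed.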

24-Bit-Per-Pel Support

The various S3 chips do not directly support hardware drawing of packed 24-bit-per-pel modes. Some of them include drawing engine support for 32-bit-per-pel modes, which can be found in the IBM Device Driver Source Kit for OS/2, Version 1.2; but the S3 driver does not support this mode. Instead, the S3 driver attempts to use the drawing engine where it can, and falls back to the frame buffer in situations where it is unable to use the drawing engine. All line drawing is done in software at 24-bits-per-pel. Screen-to-screen copy bitblts can be processed, but fills of any sort, including the monochrome expansion operations, can be performed only if the red, green, and blue components of the color are identical. The drawing engine treats 24-bit-per-pel modes as if they are 8-bit-per-pel modes with extremely wide scanlines. Because the drawing engine is addressing bytes rather than 24-bit pels, it is necessary to triple all coordinates before writing them to the hardware. Note that throughout the driver you will find the following construction for tripling coordinates:

fast3x: ;The fast way to triple ax
        mov     bx, ax
        shl     ax, 1
        add     ax, bx

The following is a faster way to triple eax on a 486 microprocessor:

;       Triple eax
fast3x:
        lea     eax, [eax+2*eax]
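The two consequences of treating a 24-bit-per-pel mode as a very wide 8-bit-per-pel mode can be sketched in C. The function names are hypothetical. The first function is the same shift-and-add tripling as the assembler above, and the second expresses the condition under which a fill can be handed to the drawing engine.

```c
#include <stdint.h>

/* Convert a pel X coordinate to a byte X coordinate: x * 3 as x + 2x. */
uint32_t pel_to_byte_x(uint32_t x)
{
    return x + (x << 1);
}

/*
 * The engine writes one byte value repeatedly, so a solid fill is
 * correct only when all three color components are identical.
 */
int can_fill_in_hardware_24bpp(uint8_t r, uint8_t g, uint8_t b)
{
    return r == g && g == b;
}
```

A mid-gray fill qualifies for the drawing engine, while any color whose components differ has to go through the frame buffer.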

In terms of changing the byte ordering of 24-bit-per-pel modes, the S3 driver supports two pel orderings, RGB and BGR. To change between the two, set or clear the bit USE_ATTDAC in DDT.fScreenFlags. The bit-map conversion code looks at this flag and performs the correct function, dependent upon its setting.

The RGB pel ordering is supported by many RAM DACs and is the 'normal' ordering of pels in 24-bit modes. In RGB pel ordering, the blue component is first in memory, followed by the green component, followed by the red component. The RGB2 structure is defined with the same ordering:

 typedef struct _RGB2 {  /* rgb2 */
        BYTE bBlue;      /* Blue component of the color definition  */
        BYTE bGreen;     /* Green component of the color definition */
        BYTE bRed;       /* Red component of the color definition   */
        BYTE fcOptions;  /* Reserved, must be zero                  */
     } RGB2;
     typedef RGB2 Far *PRGB2;

RGB pel ordering is chosen by setting the USE_ATTDAC bit in DDT.fScreenFlags. (Although the name of this bit implies that an AT&T** DAC is present, this is not so.) Clearing the USE_ATTDAC bit in DDT.fScreenFlags forces BGR byte ordering, which stores the red component first, followed by the green component, followed by the blue component. The USE_ATTDAC bit forces all bit maps to use RGB byte ordering. Following is an example from CONVEXT.C:

/*********************************************************************/
/* ConvertExt24ToInt24                                               */
/*                                                                   */
/* Converts a single scanline from external 24bpp to internal 24bpp. */
/* (No convert table is used at 24 bpp.)                             */
/*********************************************************************/
VOID ConvertExt24ToInt24 (VOID)
{
    ULONG       j;
    RGB         TrgPel;

    for ( j = SPad.cx; j--;)
    {
        TrgPel = *((PRGB)SPad.pbSrcPointer);
        if ( !(DDT.fScreenFlags & USE_ATTDAC) ) {
            *((PBYTE)SPad.pbTrgPointer)++ = TrgPel.bBlue;
            *((PBYTE)SPad.pbTrgPointer)++ = TrgPel.bGreen;
            *((PBYTE)SPad.pbTrgPointer)++ = TrgPel.bRed;
        }
        else
        {
            *((PBYTE)SPad.pbTrgPointer)++ = TrgPel.bRed;
            *((PBYTE)SPad.pbTrgPointer)++ = TrgPel.bGreen;
            *((PBYTE)SPad.pbTrgPointer)++ = TrgPel.bBlue;
        }
        SPad.pbSrcPointer += sizeof(RGB);
    }
} /* ConvertExt24ToInt24 */

When the USE_ATTDAC bit is set, the blue pel is first in memory, and when the USE_ATTDAC bit is not set, the red pel is first in memory. This is the code that translates an external bit map to an internal bit map. However, 24-bit color values are not translated in the same way. This is shown in LogToPhyIndex in EDDCSUBR.C:

#ifdef    BPP24
     else if (DDT.BitCount == 24)
     {
     /*************************************************************/
     /* Map the RGB2 value into an RGB24 value. (24bpp pel value) */
     /*************************************************************/
        if ( !(DDT.fScreenFlags & USE_ATTDAC) ) {
          PhyIndex = RGB24FromPRGB2(&RegRGB);
        }
        else
        {
          PhyIndex = RGB24FromPRGB2ATT(&RegRGB);
        }
     }
#endif

The following RGB24From macros are located in CONVFUNS.H:

#define RGB24FromPRGB2(prgb2)         XGA_RGB24FromPRGB2(prgb2)
#define RGB24FromPRGB2ATT(prgb2)      XGA_RGB24FromPRGB2ATT(prgb2)

#define XGA_RGB24FromPRGB2(prgb2)                      \
      (*(PULONG)(prgb2))

#define XGA_RGB24FromPRGB2ATT(prgb2)                   \
    ((((ULONG)((prgb2)->bRed))) |                      \
     (((ULONG)((prgb2)->bGreen)) <<8) |                \
     (((ULONG)((prgb2)->bBlue)) <<16) )
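
The difference between the two packings can be tried outside the driver with a small stand-alone sketch. DemoRGB2 and both demo_ functions are hypothetical stand-ins written for this illustration. They are not part of CONVFUNS.H. The first function also masks off the options byte, which the direct copy in the driver macro does not do.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the RGB2 layout used above: blue is the first byte in memory. */
typedef struct {
    uint8_t bBlue;
    uint8_t bGreen;
    uint8_t bRed;
    uint8_t fcOptions;
} DemoRGB2;

/* Blue in the low byte: what a direct copy of the structure yields on a
   little-endian machine (options byte dropped in this sketch). */
uint32_t demo_pack_blue_first(const DemoRGB2 *p)
{
    assert(p != 0);
    return (uint32_t)p->bBlue |
           ((uint32_t)p->bGreen << 8) |
           ((uint32_t)p->bRed << 16);
}

/* Red in the low byte: the same arithmetic as XGA_RGB24FromPRGB2ATT. */
uint32_t demo_pack_red_first(const DemoRGB2 *p)
{
    assert(p != 0);
    return (uint32_t)p->bRed |
           ((uint32_t)p->bGreen << 8) |
           ((uint32_t)p->bBlue << 16);
}
```

For a color with blue 0x11, green 0x22, and red 0x33, the first function returns 0x00332211 and the second returns 0x00112233.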

LogToPhyIndex is the routine responsible for translating the logical RGB2 values passed in from PM to colors that are realizable in the driver. It translates almost every color value used in the driver. When the USE_ATTDAC bit is not set, the color passed in is copied over directly, even though PM passes in RGB format colors. When the USE_ATTDAC bit is set, the color value is byte-swapped to BGR order. The following code in Copy24MonoToVRAM (HWACCESS.ASM) explains why this occurs:

;set the colors to expand to and put them in the correct format
;to write dwords

    mov     edx,Shadow8514Regs.Color_1
    mov     ebx,Shadow8514Regs.Color_0
    ror     dx,8
    ror     edx,16
    ror     dx,8
    ror     edx,8
    ror     bx,8
    ror     ebx,8
    ror     dx,8
    ror     ebx,8

Copy24MonoToVRAM performs monochrome-to-color expansion in 24-bit-per-pel modes in the S3 driver. Notice that it performs a byte-swap on the colors that were passed to it in the Shadow8514Regs. The reason for this reversed color storage has to do with the implementation of eddf_MESS. In eddf_MESS, 24-bit colors are written in reverse order. Consequently, if your device supports 24-bit-per-pel operations in hardware, you must reverse the bytes of Shadow8514Regs.Color_0 and Shadow8514Regs.Color_1 prior to writing them to the color registers of your hardware.
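
In C, the reversal amounts to exchanging the outer two bytes of the 24-bit value. The following is a minimal sketch. The function name is hypothetical, and it assumes the color sits in the low 24 bits of a 32-bit value with the top byte clear.

```c
#include <assert.h>
#include <stdint.h>

/* Exchange the low and high bytes of a 24-bit color; the middle (green)
   byte stays in place. Assumes the top byte of the input is zero. */
uint32_t demo_reverse_rgb24(uint32_t color)
{
    assert((color & 0xFF000000u) == 0);
    return ((color & 0x0000FFu) << 16) |
            (color & 0x00FF00u) |
           ((color & 0xFF0000u) >> 16);
}
```

Applying the function twice returns the original value, so the same helper can be used in either direction.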

Strategies for Adapting the Driver to Other Chip Sets

To date, the S3-8514-XGA driver has been ported to work on the various Western Digital** accelerators, the Tseng Labs W32 and W32p**, the ATI** Mach 64, and the ATI Mach 32. Of these, only the ATI Mach 32** is similar to an 8514/A**. The amount of code in the driver that is hardware-dependent is relatively small. Still, this is a large set of source modules, so it is helpful to know where to begin when changing them.

The first part of this section examines the various strategies that can be adopted for porting this driver, and some of the tradeoffs that they entail. The second part of this section consists of an outline for bringing up the driver on a new graphics chip set.

High-Level Design Decisions

There are three basic approaches that can be taken to adapt the S3 driver to another chip set. The approaches differ in how they handle the Shadow8514Regs. The Shadow8514Regs are shared by both the hardware-dependent code and the bit-map drawing code. Both the eddf_MESS() function and the eddh_SrcDestBlt function rely on the fields in Shadow8514Regs.

Strategy 1 - Change the Shadow8514Regs

Retain the pixmaps and other characteristics of the XGA driver and change the 8514 registers to match the new driver. This is the approach that was taken when the 8514/A driver was ported from the original XGA-code base. This is the most efficient solution, because the values in the redefined shadow registers can be written directly to the hardware. The tradeoff here is that the bit-map drawing code in EDDFFAST.ASM, FFLINES.ASM, FFBLTSET.ASM, and so forth, uses the shadow registers extensively; therefore, modifying it will be a significant portion of the work. The other tradeoff is that functions such as eddn_BitBlt and PixBltThroughClipsViaPhunk make extensive use of these registers, and changing the registers may mean making significant changes to the logic of these complex functions.

Strategy 2 - Add new definitions to the Shadow Registers

Another strategy is to add definitions to the shadow registers and bit-map header definition to support the new chip set, but leave the XGA and 8514 definitions in place. This has the advantage of allowing the values in the shadow registers to be written directly to hardware. The disadvantage is that much of the setup code for BitBlt, line drawing, and so forth, will have to be changed to accommodate these new register definitions. This strategy will be time-consuming and error-prone.

Strategy 3 - Leave the Shadow Registers As Is

In some ways, this strategy is the easiest approach. It is possible to translate the values written into the Shadow8514Regs to ones used by another chip. The bulk of the driver code never accesses the actual graphics-accelerator hardware. In fact, only the routines in HWACCESS.ASM, such as TransferShadowRegisters, and the routines in EDDHBBLT.ASM, EDDHGCHS.ASM, and so forth, actually write to the hardware. In many cases, it is easy to interpret the values written in the Shadow8514Regs structure and translate them in TransferShadowRegisters. TransferShadowRegisters sets up colors, hardware-raster operations, monochrome expansion, and color compare. It is called by the highest-level portions of the driver, and may get called only once for several operations. As a consequence, performing the translation there is efficient.

Typically, the lower-level routines, such as eddh_PatDestBlt, write the origin, direction, height, and width of the desired BitBlt. These are usually very straightforward to translate from 8514/A format. In effect, this approach treats the shadow registers as hardware-independent descriptions of an accelerator operation that needs to be performed. (This "hardware-independent" description happens to be closely modeled on the designs of the XGA and 8514, but it is still a workable solution.)
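
The translation idea behind Strategy 3 can be sketched in C as follows. Only Color_0 and Color_1 are names taken from this chapter. The demo_mix field, the DemoNewChipRegs layout, the mix-to-raster-operation table, and the assignment of Color_1 to the foreground are all assumptions made for illustration and do not describe any real chip or the actual Shadow8514Regs layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of an 8514-style shadow block. */
typedef struct {
    uint32_t Color_0;   /* assumed here to be the background color */
    uint32_t Color_1;   /* assumed here to be the foreground color */
    uint16_t demo_mix;  /* hypothetical mix code, 0..DEMO_MIX_COUNT-1 */
} DemoShadowRegs;

/* Hypothetical register image for the new chip. */
typedef struct {
    uint32_t fg_color;
    uint32_t bg_color;
    uint8_t  rop;
} DemoNewChipRegs;

enum { DEMO_MIX_COUNT = 4 };

/* Placeholder lookup from the hypothetical mix code to a hypothetical
   raster-operation code; a real table comes from both chips' manuals. */
static const uint8_t demo_mix_to_rop[DEMO_MIX_COUNT] = { 0x00, 0xCC, 0x66, 0xFF };

/* Translate once per setup, so the rest of the driver keeps writing
   8514-style values into the shadow block unchanged. */
void demo_translate_shadow(const DemoShadowRegs *shadow, DemoNewChipRegs *out)
{
    assert(shadow != 0 && out != 0);
    assert(shadow->demo_mix < DEMO_MIX_COUNT);
    out->fg_color = shadow->Color_1;
    out->bg_color = shadow->Color_0;
    out->rop      = demo_mix_to_rop[shadow->demo_mix];
}
```

Keeping the translation in one routine mirrors the role the text gives to TransferShadowRegisters: the table lookup is paid once per setup rather than once per drawing primitive.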

Other Considerations

The S3 driver caches fonts and other monochrome data in off-screen VRAM, and uses the hardware BitBlt engine to expand these to the display. Not all devices are capable of doing this. If your chip set does not have this capability, then the text code will need a considerable amount of work. Another consideration is that the S3 driver treats objects in VRAM as rectangles with an X-Y origin. Some devices work this way, others reference portions of VRAM by address. Some can deal only with off-screen bit maps that have the same pitch as the display. Others can work with off-screen bit maps with packed scan lines or with the scan lines padded to some boundary. These considerations will affect the design of the bit map and font-caching code.
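
For a device that references VRAM by address rather than by X-Y origin, the conversion is simple arithmetic on the pitch. The helpers below are a hedged sketch with hypothetical names. They assume a linear frame buffer and, for the padding helper, a power-of-two alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of pel (x, y) in a surface whose scan lines are
   pitch_bytes apart. */
size_t demo_xy_to_offset(size_t x, size_t y,
                         size_t pitch_bytes, size_t bytes_per_pel)
{
    assert(bytes_per_pel > 0);
    assert(x * bytes_per_pel < pitch_bytes);
    return y * pitch_bytes + x * bytes_per_pel;
}

/* Scan-line length rounded up to a multiple of align bytes.
   align must be a power of two. */
size_t demo_padded_pitch(size_t width_pels, size_t bytes_per_pel, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (width_pels * bytes_per_pel + align - 1) & ~(align - 1);
}
```

A cached off-screen bit map that shares the display pitch would pass the display pitch to demo_xy_to_offset, while a packed or padded one would pass its own pitch.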

Porting the Code

The following is an outline of a scheme to port the S3 driver to another chip set. Many of the actual working details will be dependent upon your device. The goal is to give you a place to start by showing the first modules to interact directly with the hardware. When some of the functions are working, the kind of design that is needed will become apparent, and you may want to design your own plan.

Initializing the Accelerator

The following steps describe how to set a mode and initialize the accelerator.

  1. Edit CACHEMAN.C to account for the resolutions supported by your device, and the memory requirements for those resolutions.
  2. Edit the asr[] array in EDDESRES.C to match the resolutions your device can support. (See Multiple Resolution Support, in the information on driver initialization, for a description of this array.)
  3. Edit the QueryAndSelectNativeMode routine in the EDDESRES.C file. Add code to determine the memory size of the adapter and other configuration information.
  4. Edit the SetObtainableModes routine in the MODEINFO.C file. In the S3 driver, this function reads the SVGADATA.PMI file and marks modes in the asrScreenResolution table as obtainable, based on what it finds there. It is also possible to directly query the hardware. Another possibility is to discard SetObtainableModes altogether, and modify CACHEMAN.C to validate the available modes based on the amount of VRAM and DAC configuration of the board.
  5. Edit QueryAndSelectNativeMode and remove the DMS32CallBack((PFN) SwitchToChosenMode) call. The S3 driver sets the mode twice during initialization, for no apparent reason, and this extra mode set is almost certainly not what you want at this point.
  6. Edit SwitchToExtendedGraphicsMode in SETMODE.C so that it will set the appropriate video mode on your device. This may be done by way of the base-video handler or in assembler code that you write as part of your driver. Process the mode set in the presentation display driver if the video handler is not yet working properly. Adding mode-set code to the display driver lets you keep working while the base video handler is being developed (unless one developer is writing both the base video handler and the presentation display driver).
  7. If your device's registers are memory-mapped rather than I/O-mapped, additional changes need to be made to FillPdb(). In the S3 driver, a flat pointer to the video aperture is obtained after the mode set and KlugeReset(). This will not work for a memory-mapped device. Move the first call to GetVRAMPointer() so it occurs before the first drawing-engine operation. (Otherwise, the order in which GetVRAMPointer is called does not matter.) Also, if your device supports an aperture, this is a good place to test whether the aperture-to-video-memory mapping has been properly performed. (See Obtaining Pointers to Video Memory for details on how to do this.) To test this, create a small routine that copies a fixed bit map to the screen through the aperture. If this routine generates a page fault exception (TRAP 0E), then you do not have a valid pointer to the aperture.
  8. Edit KlugeReset (in HWACCESS.ASM) so it can reset the accelerator and clear the screen. Assemble an INT 3 into the code immediately after the code that resets the accelerator. For debugging purposes, create some small routines that perform a few basic drawing operations on the display using the accelerator. This is a good test to ensure that everything is set up correctly. (If your drawing engine is not set up correctly, nothing in the driver is going to work.) Also, remove any code from the various pieces of initialization code that is specific to the S3 chip rather than your chip set.
  9. At this point, some code may need to be debugged. Put an _asm {INT 3} in FillPdb() (EDDEFPDB.C), immediately before the call to QueryAndSelectNativeMode(). (The _asm {INT 3} will cause an INT 3 instruction to be inserted in the code, which will cause the kernel debugger to break when it reaches that point.) Build the driver and run it under the debug kernel. If the INT 3 you previously added is reached, the debugging has been successful. If you are unable to reach the INT 3, try adding one to the beginning of FillLdb() in EDDEFLDB.C. This occurs early in the driver. If you are unable to get there, add one in the LOADPROC in DYNA32.ASM. This is the very first entry into the driver. If you are unable to get there, then you are not successfully building the driver. Look at your make file for any inconsistencies.
  10. Continue debugging through the routines until you successfully get to KlugeReset. At this point, you will be in a graphics mode. Step through the code that enables the drawing engine of your chip, and test it with the small drawing routines that were written in step 8. Continue until you get a good mode set, have the drawing engine enabled, and are able to do some simple drawing commands inside the KlugeReset routine.
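
The aperture test routine mentioned in step 7 can be as small as the following sketch. It assumes an 8-bit-per-pel mode and a linear aperture. The function name and parameters are hypothetical. In the driver, the pointer would come from GetVRAMPointer(), but here any writable buffer can stand in for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy a small 8 bpp test bit map through an aperture pointer, one pel
   at a time. In the driver, an invalid aperture pointer would fault on
   the first store (TRAP 0E). */
void demo_copy_to_aperture(volatile uint8_t *aperture, size_t screen_pitch,
                           const uint8_t *src, size_t width, size_t height)
{
    size_t x, y;

    assert(aperture != 0 && src != 0);
    assert(width <= screen_pitch);
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            /* Source rows are packed; destination rows are one pitch apart. */
            aperture[y * screen_pitch + x] = src[y * width + x];
        }
    }
}
```

Writing byte by byte through a volatile pointer is slow but keeps the compiler from combining or eliding the stores, which is convenient while single-stepping in the kernel debugger.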

The First BitBlt

The first operations performed by the driver are in BitBlt. Edit EDDHBBLT.ASM, modifying it to work with your hardware. The following are guidelines you can use:

  1. The very first BitBlt processed in the driver is a pattern blt performed by eddh_PatDestBlt (in EDDHBBLT.ASM). This paints the gray background screen in 8-bit-per-pel mode. (The seemingly solid gray is a 2x2 dither.) Consequently, eddh_PatDestBlt is the first part of the driver to modify.
  2. Modify TransferShadowRegisters in HWACCESS.ASM. This routine sets the color, raster operation, and other operation-specific parameters. If you do not want to perform this operation, you can temporarily get the color, pattern, and other information needed to complete the blt from the Shadow8514Regs, and the AIxfer parameter block, and put code in-line in eddh_PatDestBlt to set these parameters. Eventually, return and make TransferShadowRegisters work correctly.
  3. When eddh_PatDestBlt is working, modify eddh_SrcDestBlt and eddh_DestOnlyBlt, in that order. eddh_SrcDestBlt consists of several cases. The easiest portion to modify is the part that deals with VRAM-to-VRAM blts. Then bring up the portion that copies color bit maps in system memory to the display. (This is used early in the construction of the desktop.)
  4. Modify CopyMemoryToVRAM in HWACCESS.ASM at this time so that it uses the aperture to VRAM on your video device. For the most part, the S3 driver uses the drawing engine to copy system-memory bit maps to VRAM. This option may not be available on your device. (Or, it may be more efficient on your device to use the aperture.)
  5. Edit EDDHLINE.ASM and EDDHSCAN.ASM, and ifdef out any code that draws anything with the S3 hardware. This will allow these functions to be called without hanging the driver. Likewise, ifdef out the code in EDDHGCHS.ASM. This will temporarily disable text. (While doing this, carefully mark the places where you need to insert code for your device.)
  6. When eddh_PatDestBlt, eddh_SrcDestBlt, and eddh_DestOnlyBlt are working, the presentation desktop (with no text) appears.
  7. If caching either monochrome or color bit maps is undesirable during the development of eddh_SrcDestBlt, you can disable one or both of them in PIXBLT.C. (See BitBlt for further information.)
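
While bringing up eddh_PatDestBlt, a software reference for the expected background pattern can help when comparing what the hardware drew. The sketch below fills a buffer with a checkerboard of two colors. Both the checkerboard layout of the 2x2 dither and the function name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill a width x height area of an 8 bpp buffer with a two-color
   checkerboard, which is one possible 2x2 dither pattern. */
void demo_fill_dither_2x2(uint8_t *buf, size_t pitch, size_t width,
                          size_t height, uint8_t color_a, uint8_t color_b)
{
    size_t x, y;

    assert(buf != 0);
    assert(width <= pitch);
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            /* Alternate colors along both axes. */
            buf[y * pitch + x] = ((x ^ y) & 1u) ? color_b : color_a;
        }
    }
}
```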

The Cursor

  1. Edit CACHEMAN.C so off-screen memory is allocated for the hardware cursor. Remember that cursors can be larger than 32 x 32.
  2. Edit the eddm_DeviceSetCursor function in the EDDMCURS.C file so it copies the AND and XOR masks into off-screen VRAM. The primary change will be to reformat the incoming AND and XOR masks into a form suitable for use by the hardware cursor.
  3. Edit EDDCURSR.ASM, and add code to support the hardware cursor.
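
The mask reformatting in step 2 depends entirely on the hardware. As one hedged illustration, the sketch below assumes a hypothetical cursor format in which the AND and XOR masks are interleaved 16 bits at a time. The function name and the format are assumptions, so check the layout your chip actually expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave two monochrome masks 16 bits at a time: two AND bytes,
   then two XOR bytes. out must hold 2 * bytes_per_mask bytes. */
void demo_interleave_cursor_masks(const uint8_t *and_mask,
                                  const uint8_t *xor_mask,
                                  size_t bytes_per_mask, uint8_t *out)
{
    size_t i;

    assert(and_mask != 0 && xor_mask != 0 && out != 0);
    assert(bytes_per_mask % 2 == 0);
    for (i = 0; i < bytes_per_mask; i += 2) {
        out[2 * i + 0] = and_mask[i];
        out[2 * i + 1] = and_mask[i + 1];
        out[2 * i + 2] = xor_mask[i];
        out[2 * i + 3] = xor_mask[i + 1];
    }
}
```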

Text Output

After BitBlt is functioning, bring up text output. (If more than one developer is working on the driver, text and BitBlt can be developed simultaneously once the driver initializes correctly and some of the rudiments of BitBlt are functional.)

  1. The first task is to modify the caching code in EDDNCACH.C so it copies the character bit maps into off-screen VRAM in a format suitable for your hardware. If your hardware is incapable of VRAM-to-VRAM monochrome expansion, set CharTooBig to be TRUE in EDDNGCHS.ASM, and let BitBlt handle text output until a non-cached scheme is created.
  2. The routines that handle caching are eddt_CacheCharacter, in EDDNCACH.C, and Cache8514Char, in EDDHGCHS.ASM. Cache8514Char may be unnecessary for your device. A memcpy may work just as well.
  3. Modify eddh_DrawText in EDDHGCHS.ASM. Most of the modifications will probably be confined to the sections marked IFDEF HARD_DRAW ... IFDEF_8514.
  4. Debug the above code. It is difficult to debug the character-caching code without actually drawing any characters. Unless the character-caching code is working, it is impossible to draw any characters correctly.

Miscellaneous

  1. Modify EDDHLINE.ASM and bring up line drawing.
  2. Modify EDDHSCAN.ASM, and bring up scanlines. At this point, the presentation desktop should appear complete.
  3. Modify EDDHAVIO.ASM, and ensure the AVIO text is functioning. (Test with OS/2 window sessions.)
  4. Modify EDDMDEAD.C so that death and resurrection are supported. (Test with a full-screen OS/2 session.)
  5. Modify EDDHIMAG.ASM, and bring up image data. (Use the tune editor to test this.)
  6. Finish any routines in HWACCESS.ASM that have not already been ported.
  7. Write code for your device to support the color cursor in EDDCURSR.ASM. (Test with the Solitaire program using auto-play mode.)

Debugging Tips

Debugging the S3 driver is difficult because the kernel debugger does not support source-level debugging of C code. The bulk of the code that must be modified is in assembler, and the kernel debugger handles this adequately. However, there will be occasions when the C portions of the driver must be modified and debugged. Some of the C functions are quite large, and have no internal symbols that are public. Therefore, if you trap in the middle of a large C function, it is often difficult to determine the problem. Also, if you need to set a breakpoint in the middle of a C function, it is often difficult to know precisely where to put it. The way around this particular problem is to put INT 3 instructions near the portion of the code where you suspect the problem to occur. There are two ways to do this. One way is to insert a call to the function haltproc() into the C code. The other is to add the following statement:

_asm {int 3};

Usually it is more convenient to add the _asm {INT 3} statement, as this breaks into the debugger at the point in the code at which you are interested. The haltproc() routine will also break into the kernel debugger, but it is less convenient because you have to trace out of haltproc() to see the real breakpoint location.

Most of the data in this driver is passed around in data structures, such as AIxfer. It is possible to dump these structures easily, even if they are being manipulated by C code. This tends to be easier than dumping stack frames. The ability to recognize which assembler code corresponds to a particular C language statement is necessary when debugging the portions of the driver that are written in C. However, using the C compiler to generate an assembler language version of the module is also helpful. This is accomplished using the /Fa compiler option. Also, when debugging C code, use /Od, which disables optimizations. This makes the code easier to follow.

The first time you build and test your driver, there is a possibility that it might not load, or that it will load, trap, and then immediately unload. The first challenge you face is getting the driver to load, initialize, and set the desired video mode. The first place to set a breakpoint is either loadproc(), which is in DYNA32.ASM, or XGA_DLLInit, which is in INIT.C. Also, OS2_PM_DRV_ENABLE, in EDDENABL.C, is a good place to break, as it is called very early in the initialization process. FillLdb(), in EDDEFLDB.C, is another place to break. It is the first function in the driver that does any real work. FillPdb(), in EDDEFPDB.C, and QueryAndSelectNativeMode(), in EDDESRES.C, are also good candidates for breakpoints. You might want to trace through FillLdb(), FillPdb(), and QueryAndSelectNativeMode() until the driver initialization phase is stable. Any problems thus far are very likely to be in one of these three functions, or in a function that they call.

Some other good breakpoint locations include eddh_SrcDestBlt, eddh_PatDestBlt, eddh_DestOnlyBlt, eddh_DrawText, eddh_PMLINES, eddh_PMSCANLINE, and eddh_PMIMAGEDATA. Many early problems will be drawing related and will be confined to one of these functions.

If a trap occurs, such as trap d (general protection error) or trap e (page fault), use the "vcf*" debugger command to break on instructions that cause traps. When the debugger breaks on the offending instruction, use "ln" (list near) to find a nearby label, and use "ks" to get a stack trace, which gives a "map" detailing where you have been.

Often graphics chips must be polled to determine if they are ready to accept another command. Consider making the polling loop a macro. In the macro, for debugging mode only, add a drop-dead timer, such as counting down "ecx" from 1,000,000. If the "timer" expires, fall through to an INT 3, so the debugger breaks. This enables you to find situations where the driver will hang while waiting on the video chip. Typically, this is caused by a programming error such as an invalid command to the graphics chip.
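
The drop-dead timer can be prototyped in C before it is written as an assembler macro. In the sketch below, the status read is abstracted as a callback so the loop can be exercised without hardware. All names are hypothetical, and a debug build of the driver would execute an INT 3 where this sketch returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_POLL_LIMIT 1000000u  /* countdown suggested in the text */

/* Hypothetical status reader: returns true while the engine is busy. */
typedef bool (*demo_busy_fn)(void *ctx);

/* Poll until the engine is idle. Returns true on idle, false if the
   countdown expired (the point where a debug build would break). */
bool demo_wait_for_engine(demo_busy_fn is_busy, void *ctx)
{
    uint32_t countdown = DEMO_POLL_LIMIT;

    assert(is_busy != 0);
    while (is_busy(ctx)) {
        if (--countdown == 0) {
            return false;
        }
    }
    return true;
}

/* Stand-in status readers for trying the loop without hardware. */
bool demo_always_busy(void *ctx)
{
    (void)ctx;
    return true;
}

bool demo_busy_n_times(void *ctx)
{
    uint32_t *remaining = (uint32_t *)ctx;

    if (*remaining == 0) {
        return false;
    }
    --*remaining;
    return true;
}
```

With demo_busy_n_times and a small count, the wait returns true. With demo_always_busy, it gives up after DEMO_POLL_LIMIT polls and returns false instead of hanging.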

Consider using Debug32Output to print a trace of the functions called in the driver when you are unable to determine your present location. The following is an example that traces every major driver entry point.

pmtrace.asm:

.386p

_DATA   segment dword use32 public 'DATA'
_DATA   ends

trace   macro   fn
        _DATA   segment
        s_&fn   db      '&fn, ', 0
        _DATA   ends

        extrn   fn : near
        public  T&fn
        T&fn proc near
                push    offset ds:s_&fn
                call    Debug32Output
                pop     eax
                jmp     fn
        T&fn endp

        endm

_TEXT   segment use32 dword public 'CODE'
        assume  cs:FLAT, ds:FLAT, es:FLAT

extrn   Debug32Output : near

trace eddl_PolyLine
trace eddl_GetCurrentPosition
trace eddl_SetCurrentPosition
trace eddl_DrawLinesInPath
trace eddl_PolyShortLine
trace edds_PolyScanLine
trace DrawBits
trace eddb_DeviceCreateBitmap
trace eddb_DeviceDeleteBitmap
trace eddb_DeviceSelectBitmap
trace eddb_BitBlt
trace eddb_GetPel
trace eddb_SetPel
trace eddb_ImageData
trace ScanLR
trace eddm_SaveScreenBits
trace eddb_DrawBorder
trace eddm_DeviceSetCursor
trace GetBitmapBits
trace SetBitmapBits
trace eddm_SetColorCursor
trace eddt_CharString
trace eddt_CharStringPos
trace eddb_PolyMarker
trace eddv_CharRect
trace eddv_CharStr
trace eddv_ScrollRect
trace eddv_UpdateCursor
trace edda_DeviceGetAttributes
trace eddv_DeviceSetAVIOFont2
trace edda_GetPairKerningTable
trace edda_DeviceSetAttributes
trace edda_DeviceSetGlobalAttribute
trace edda_NotifyClipChange
trace eddm_NotifyTransformChange
trace edda_RealizeFont
trace eddm_ErasePS
trace eddl_SetStyleRatio
trace edda_DeviceQueryFontAttributes
trace edda_DeviceQueryFonts
trace edda_DeviceInvalidateVisRegion
trace eddg_GetPickWindow
trace eddg_SetPickWindow
trace eddg_ResetBounds
trace eddg_GetBoundsData
trace eddg_AccumulateBounds
trace edda_GetCodePage
trace edda_SetCodePage
trace eddm_LockDevice
trace eddm_UnlockDevice
trace eddm_Death
trace eddm_Resurrection
trace edda_GetDCOrigin
trace edda_DeviceSetDCOrigin
trace eddl_GetLineOrigin
trace eddl_SetLineOrigin
trace eddl_GetStyleRatio
trace eddc_QueryColorData
trace eddc_QueryLogColorTable
trace eddc_CreateLogColorTable
trace eddc_RealizeColorTable 
trace eddc_UnrealizeColorTable 
trace eddc_QueryRealColors 
trace eddc_QueryNearestColor 
trace eddc_QueryColorIndex 
trace eddc_QueryRGBColor 
trace eddq_QueryDeviceBitmaps
trace eddq_QueryDeviceCaps
trace eddq_Escape
trace eddq_QueryHardcopyCaps 
trace eddm_QueryDevResource 
trace DeviceCreatePalette 
trace DeviceDeletePalette 
trace DeviceSetPaletteEntries 
trace DeviceAnimatePalette 
trace DeviceResizePalette 
trace RealizePalette 
trace QueryHWPaletteInfo 
trace UpdateColors 
trace QueryPaletteRealization 
_TEXT   ends
         end

This file, along with the following changes to EDDEFLDB.C, will cause the name of every driver entry point that is called to be printed on the debug terminal:

//Changes to eddefldb.c - Replace real driver entry points with dummies
//that print a trace, and then call the real entry point.
DSPENTRY Teddl_PolyLine ();
DSPENTRY Teddl_GetCurrentPosition ();
DSPENTRY Teddl_SetCurrentPosition ();
DSPENTRY Teddl_DrawLinesInPath ();
DSPENTRY Teddl_PolyShortLine ();
DSPENTRY Tedds_PolyScanLine ();
DSPENTRY TGetScreenBits ();
DSPENTRY TSetScreenBits ();
DSPENTRY TDrawBits ();
DSPENTRY Teddb_DeviceCreateBitmap ();
DSPENTRY Teddb_DeviceDeleteBitmap ();
DSPENTRY Teddb_DeviceSelectBitmap ();
DSPENTRY Teddb_BitBlt ();
DSPENTRY Teddb_GetPel ();
DSPENTRY Teddb_SetPel ();
DSPENTRY Teddb_ImageData ();
DSPENTRY TScanLR ();
DSPENTRY Teddm_SaveScreenBits ();
DSPENTRY Teddb_DrawBorder ();
DSPENTRY Teddm_DeviceSetCursor ();
DSPENTRY TGetBitmapBits ();
DSPENTRY TSetBitmapBits ();
DSPENTRY Teddm_SetColorCursor ();
DSPENTRY Teddt_CharString ();
DSPENTRY Teddt_CharStringPos ();
DSPENTRY Teddb_PolyMarker ();
DSPENTRY Teddv_CharRect ();
DSPENTRY Teddv_CharStr ();
DSPENTRY Teddv_ScrollRect ();
DSPENTRY Teddv_UpdateCursor ();
DSPENTRY Tedda_DeviceGetAttributes ();
DSPENTRY Teddv_DeviceSetAVIOFont2 ();
DSPENTRY Tedda_GetPairKerningTable ();
DSPENTRY Tedda_DeviceSetAttributes ();
DSPENTRY Tedda_DeviceSetGlobalAttribute ();
DSPENTRY Tedda_NotifyClipChange ();
DSPENTRY Teddm_NotifyTransformChange ();
DSPENTRY Tedda_RealizeFont ();
DSPENTRY Teddm_ErasePS ();
DSPENTRY Teddl_SetStyleRatio ();
DSPENTRY Tedda_DeviceQueryFontAttributes ();
DSPENTRY Tedda_DeviceQueryFonts ();
DSPENTRY Tedda_DeviceInvalidateVisRegion ();
DSPENTRY Teddg_GetPickWindow ();
DSPENTRY Teddg_SetPickWindow ();
DSPENTRY Teddg_ResetBounds ();
DSPENTRY Teddg_GetBoundsData ();
DSPENTRY Teddg_AccumulateBounds ();
DSPENTRY Tedda_GetCodePage ();
DSPENTRY Tedda_SetCodePage ();
DSPENTRY Teddm_LockDevice ();
DSPENTRY Teddm_UnlockDevice ();
DSPENTRY Teddm_Death ();
DSPENTRY Teddm_Resurrection ();
DSPENTRY Tedda_GetDCOrigin ();
DSPENTRY Tedda_DeviceSetDCOrigin ();
DSPENTRY Teddl_GetLineOrigin ();
DSPENTRY Teddl_SetLineOrigin ();
DSPENTRY Teddl_GetStyleRatio ();
DSPENTRY Teddc_QueryColorData ();
DSPENTRY Teddc_QueryLogColorTable ();
DSPENTRY Teddc_CreateLogColorTable ();
DSPENTRY Teddc_RealizeColorTable ();
DSPENTRY Teddc_UnrealizeColorTable ();
DSPENTRY Teddc_QueryRealColors ();
DSPENTRY Teddc_QueryNearestColor ();
DSPENTRY Teddc_QueryColorIndex ();
DSPENTRY Teddc_QueryRGBColor ();
DSPENTRY Teddq_QueryDeviceBitmaps ();
DSPENTRY Teddq_QueryDeviceCaps ();
DSPENTRY Teddq_Escape ();
DSPENTRY Teddq_QueryHardcopyCaps ();
DSPENTRY Teddm_QueryDevResource ();
DSPENTRY TDeviceCreatePalette ();
DSPENTRY TDeviceDeletePalette ();
DSPENTRY TDeviceSetPaletteEntries ();
DSPENTRY TDeviceAnimatePalette ();
DSPENTRY TDeviceResizePalette ();
DSPENTRY TRealizePalette ();
DSPENTRY TQueryHWPaletteInfo ();
DSPENTRY TUpdateColors ();
DSPENTRY TQueryPaletteRealization ();
#define C(F) F                  /* Just call it */
#if 1
#define T(F) T##F               /* Print trace message and call */
#else
#define T(F) F                  /* Tracing disabled: just call it */
#endif
PPFNL   EnginesDispatchTable;
PFNL    DriversDispatchTable[] = {
    C(0),                       //GreGetArcParameters 0x4000
    C(0),                       //GreSetArcParameters 0x4001
    C(0),                       //GreArc 0x4002
    C(0),                       //GrePartialArc 0x4003
    C(0),                       //GreFullArcInterior 0x4004
    C(0),                       //GreFullArcBoundary 0x4005
    C(0),                       //GreFullArcBoth 0x4006
    C(0),                       //GreBoxInterior 0x4007
    C(0),                       //GreBoxBoundary 0x4008
    C(0),                       //GreBoxBoth 0x4009
    C(0),                       //GrePolyFillet 0x400A
    C(0),                       //GrePolyFilletSharp 0x400B
    C(0),                       //GrePolySpline 0x400C
    C(0),                       //GreDrawConicsInPath 0x400D
    C(0),                       //GreCookWholePath 0x400E
    C(0),                       //GreCookPathCurves 0x400F
    C(0),                       // 0x4010
    C(0),                       //GreRenderPath 0x4011
#ifdef DCAF
//DCAF
    C(OpenScreenChangeArea),    //GreOpenScreenChangeArea 0x4012
//DCAF
    C(GetScreenChangeArea),     //GreGetScreenChangeArea       0x4013
//DCAF
    C(CloseScreenChangeArea),   //GreCloseScreenChangeArea     0x4014
//DCAF
#else
//DCAF
   C(0),                       // 0x4012
   C(0),                       // 0x4013
   C(0),                       // 0x4014
#endif
//DCAF
    C(0),                        // 0x4015
    T(eddl_PolyLine),            //GreDisjointLines 0x4016
    T(eddl_GetCurrentPosition),  //GreGetCurrentPosition 0x4017
    T(eddl_SetCurrentPosition),  //GreSetCurrentPosition 0x4018
    T(eddl_PolyLine),            //GrePolyLine 0x4019
    T(eddl_DrawLinesInPath),     //GreDrawLinesInPath 0x401A
    T(eddl_PolyShortLine),       //GrePolyShortLine 0x401B
    T(edds_PolyScanLine),        //GrePolyScanline 0x401C
#ifdef DCAF
//DCAF
     T(GetScreenBits),            //GreGetScreenBits 0x401D
//DCAF
     T(SetScreenBits),            //GreSetScreenBits 0x401E
//DCAF
#else //DCAF
     C(0),                        //0x401D
     C(0),                        //0x401E
#endif //DCAF
     C(0),                        //0x401F
     C(0),                        //0x4020
     C(0),                        //0x4021
     T(DrawBits),                 //GreDrawBits 0x6022
     T(eddb_DeviceCreateBitmap),  //GreDeviceCreateBitmap 0x6023
     T(eddb_DeviceDeleteBitmap),  //GreDeviceDeleteBitmap 0x4024
     C(eddb_DeviceSelectBitmap),  //GreDeviceSelectBitmap 0x4025
     T(eddb_BitBlt),              //GreBitblt 0x6026
     T(eddb_GetPel),              //GreGetPel 0x6027
     T(eddb_SetPel),              //GreSetPel 0x4028
     T(eddb_ImageData),           //GreImageData 0x4029
     T(ScanLR),                   //GreScanLR 0x602A
     C(0),                        //GreFloodFill 0x602B
     T(eddm_SaveScreenBits),      //GreSaveScreenBits 0x402C
     C(0),                        //GreRestoreScreenBits 0x402D
     T(eddb_DrawBorder),          //GreDrawBorder 0x602E
     T(eddm_DeviceSetCursor),     //GreDeviceSetCursor 0x402F
     T(GetBitmapBits),            //GreGetBitmapBits 0x6030
     T(SetBitmapBits),            //GreSetBitmapBits 0x6031
     T(eddm_SetColorCursor),      //GreSetColorCursor 0x4032
     C(0),                        //0x4033
     C(0),                        //0x4034
     T(eddt_CharString),          //GreCharString 0x5035
     T(eddt_CharStringPos),       //GreCharStringPos 0x7036
     C(0),                        //GreQueryTextBox 0x5037
     C(0),                        //GreQueryCharPositions 0x5038
     C(0),                        //GreQueryWidthTable 0x5039
     T(eddb_PolyMarker),          //GrePolyMarker 0x403A
     T(eddv_CharRect),            //GreCharRect 0x403B
     T(eddv_CharStr),             //GreCharStr 0x403C
     T(eddv_ScrollRect),          //GreScrollRect 0x403D
     T(eddv_UpdateCursor),        //GreUpdateCursor 0x403E
     C(0),                        //0x403F
     C(0),                        //0x4040
     C(0),                        //0x4041
     C(0),                        //0x4042
     C(0),                        //0x4043
     C(0),                        //0x4044
     C(0),                        //0x4045
     C(0),                        //GreBeginArea 0x4046
     C(0),                        //GreEndArea 0x4047
     C(0),                        //GreBeginPath 0x4048
     C(0),                        //GreEndPath 0x4049
     C(0),                        //GreCloseFigure 0x404A
     C(0),                        //GreFillPath 0x404B
     C(0),                        //GreOutlinePath 0x404C
     C(0),                           //GreModifyPath 0x404D
     C(0),                           //GreStrokePath 0x404E
     C(0),                           //GreSelectClipPath 0x404F
     C(0),                           //GreSavePath 0x4050
     C(0),                           //GreRestorePath 0x4051
     C(0),                           //GreClip1DPath 0x4052
     C(0),                           //GreDrawRawPath 0x4053
     C(0),                           //GreDrawCookedPath 0x4054
     C(0),                           //GreAreaSetAttributes 0x6055
     C(0),                           //GrePolygon 0x4056
     C(0),                           //GrePathToRegion 0x4057
     C(0),                           //GreDrawRLE 0x4058
     C(0),                           // 0x4059
     C(0),                           // 0x405A
     C(0),                           // 0x405B
     C(0),                           // 0x405C
     C(0),                           //GreGetRegionBox 0x405D
     C(0),                           //GreGetRegionRects 0x405E
     C(0),                           //GreOffsetRegion 0x405F
     C(0),                           //GrePtInRegion 0x4060
     C(0),                           //GreRectInRegion 0x4061
     C(0),                           //GreCreateRectRegion 0x4062
     C(0),                           //GreDestroyRegion 0x4063
     C(0),                           //GreSetRectRegion 0x4064
     C(0),                           //GreCombineRegion 0x4065
     C(0),                           //GreCombineRectRegion 0x4066
     C(0),                           //GreCombineShortLineRegion 0x4067
     C(0),                           //GreEqualRegion 0x4068
     C(0),                           //GrePaintRegion 0x4069
     C(0),                           //GreSetRegionOwner 0x406A
     C(0),                           //GreFrameRegion 0x406B
     C(0),                           // 0x406C
     C(0),                           // 0x406D
     C(0),                           //GreGetClipBox 0x406E
     C(0),                           //GreGetClipRects 0x406F
     C(0),                           //GreOffsetClipRegion 0x4070
     C(0),                           //GrePtVisible 0x4071
     C(0),                           //GreRectVisible 0x4072
     C(0),                           //GreQueryClipRegion 0x4073
     C(0),                           //GreSelectClipRegion 0x4074
     C(0),                           //GreIntersectClipRectangle 0x4075
     C(0),                           //GreExcludeClipRectangle 0x4076
     C(0),                           //GreSetXformRect 0x4077
     C(0),                           // 0x4078
     C(0),                           // 0x4079
     C(0),                           // 0x407A
     C(0),                           //GreSaveRegion 0x407B
     C(0),                           //GreRestoreRegion 0x407C
     C(0),                           //GreClipPathCurves 0x407D
     C(0),                           //GreSelectPathRegion 0x407E
     C(0),                           //GreRegionSelectBitmap 0x407F
     C(0),                           //GreCopyClipRegion 0x4080
     C(0),                           //GreSetupDC 0x4081
     C(0),                           //0x4082
     C(0),                           //GreGetPageUnits 0x4083
     C(0),                           //GreSetPageUnits 0x4084
     C(0),                           //GreGetModelXform 0x4085
     C(0),                           //GreSetModelXform 0x4086
     C(0),                           //GreGetWindowViewportXform 0x4087
     C(0),                           //GreSetWindowViewportXform 0x4088
     C(0),                           //GreGetGlobalViewingXform 0x4089
     C(0),                           //GreSetGlobalViewingXform 0x408A
     C(0),                           //GreSaveXformData 0x408B
     C(0),                           //GreRestoreXformData 0x408C
     C(0),                           //GreGetPageViewport 0x408D
     C(0),                           //GreSetPageViewport 0x408E
     C(0),                           // 0x408F
     C(0),                           // 0x4090
     C(0),                           //GreGetGraphicsField 0x4091
     C(0),                           //GreSetGraphicsField 0x4092
     C(0),                           //GreGetViewingLimits 0x4093
     C(0),                           //GreSetViewingLimits 0x4094
     C(0),                           //GreQueryViewportSize 0x4095
     C(0),                           //GreConvert 0x4096
     C(0),                           //GreConvertPath 0x4097
     C(0),                           //GreSaveXform 0x4098
     C(0),                           //GreRestoreXform 0x4099
     C(0),                           //GreMultiplyXforms 0x409A
     C(0),                           //GreConvertWithMatrix 0x409B
     C(0),                           //0x409C
     T(edda_DeviceGetAttributes),    //GreDeviceGetAttributes 0x609D
     T(eddv_DeviceSetAVIOFont2),     //GreDeviceSetAVIOFont2 0x409E
     C(0),                           // 0x409F
     T(edda_GetPairKerningTable),    //GreGetPairKerningTable 0x40A0
     C(0),                           //GreDeviceSetAVIOFont 0x40A1
     T(edda_DeviceSetAttributes),    //GreDeviceSetAttributes 0x60A2
     C(edda_DeviceSetGlobalAttribute),//GreDeviceSetGlobalAttribute 0x60A3
     C(edda_NotifyClipChange),        //GreNotifyClipChange 0x40A4
     T(eddm_NotifyTransformChange),   //GreNotifyTransformChange 0x40A5
     T(edda_RealizeFont),             //GreRealizeFont 0x40A6
     T(eddm_ErasePS),                 //GreErasePS 0x40A7
     T(eddl_SetStyleRatio),           //GreSetStyleRatio 0x40A8
     T(edda_DeviceQueryFontAttributes),//GreDeviceQueryFontAttributes 0x40A9
     T(edda_DeviceQueryFonts),        //GreDeviceQueryFonts 0x40AA
     C(edda_DeviceInvalidateVisRegion),//GreDeviceInvalidateVisRegion 0x40AB
     T(eddg_GetPickWindow),           //GreGetPickWindow 0x40AC
     T(eddg_SetPickWindow),           //GreSetPickWindow 0x40AD
     C(eddg_ResetBounds),             //GreResetBounds 0x40AE
     C(eddg_GetBoundsData),           //GreGetBoundsData 0x40AF
     C(eddg_AccumulateBounds),        //GreAccumulateBounds 0x40B0
     C(0),                            //GreGetExtraError 0x40B1
     C(0),                            //GreSetExtraError 0x40B2
     C(edda_GetCodePage),             //GreGetCodePage 0x40B3
     C(edda_SetCodePage),             //GreSetCodePage 0x40B4
     C(eddm_LockDevice),              //GreLockDevice 0x40B5
     C(eddm_UnlockDevice),            //GreUnlockDevice 0x40B6
     T(eddm_Death),                   //GreDeath 0x40B7
     T(eddm_Resurrection),            //GreResurrection 0x40B8
     C(0),                            // 0x40B9
     C(edda_GetDCOrigin),             //GreGetDCOrigin 0x40BA
     C(edda_DeviceSetDCOrigin),       //GreDeviceSetDCOrigin 0x40BB
     T(eddl_GetLineOrigin),           //GreGetLineOrigin 0x40BC
     T(eddl_SetLineOrigin),           //GreSetLineOrigin 0x40BD
     T(eddl_GetStyleRatio),           //GreGetStyleRatio 0x40BE
     C(0),                            // 0x40BF
     C(0),                            // 0x40C0
     C(0),                            // 0x40C1
     C(0),                            // 0x40C2
     T(eddc_QueryColorData),          //GreQueryColorData 0x60C3
     T(eddc_QueryLogColorTable),      //GreQueryLogColorTable 0x60C4
     C(eddc_CreateLogColorTable),     //GreCreateLogColorTable 0x60C5
     T(eddc_RealizeColorTable),       //GreRealizeColorTable 0x60C6
     T(eddc_UnrealizeColorTable),     //GreUnrealizeColorTable 0x60C7
     T(eddc_QueryRealColors),         //GreQueryRealColors 0x40C8
     C(eddc_QueryNearestColor),       //GreQueryNearestColor 0x40C9
     C(eddc_QueryColorIndex),         //GreQueryColorIndex 0x60CA
     T(eddc_QueryRGBColor),           //GreQueryRGBColor 0x60CB
     C(0),                            // 0x40CC
     C(0),                            // 0x40CD
     C(0),                            // 0x40CE
     C(0),                            // 0x40CF
     T(eddq_QueryDeviceBitmaps),      //GreQueryDeviceBitmaps 0x40D0
     C(eddq_QueryDeviceCaps),         //GreQueryDeviceCaps 0x40D1
     T(eddq_Escape),                  //GreEscape 0x40D2
     T(eddq_QueryHardcopyCaps),       //GreQueryHardcopyCaps 0x40D3
     C(eddm_QueryDevResource),        //GreQueryDevResource2 0x40D4
     T(DeviceCreatePalette),          //GreDeviceCreatePalette 0x40D5
     T(DeviceDeletePalette),          //GreDeviceDeletePalette 0x40D6
     T(DeviceSetPaletteEntries),      //GreDeviceSetPaletteEntries 0x40D7
     T(DeviceAnimatePalette),         //GreDeviceAnimatePalette 0x40D8
     T(DeviceResizePalette),          //GreDeviceResizePalette 0x40D9
     T(RealizePalette),               //GreRealizePalette 0x40DA
     T(QueryHWPaletteInfo),           //GreQueryHWPaletteInfo 0x40DB
     T(UpdateColors),                 //GreUpdateColors 0x40DC
     T(QueryPaletteRealization),      //GreQueryPaletteRealization 0x40DD
     C(0),                            //GreGetVisRects 0x40DE
     C(0),                            //GreDevicePolySet 0x40DF
};

The T() macro in EDDEFLDB.C prepends a "T" to the name of each entry point in the driver. The trace macro in PMTRACE.ASM creates a function whose name is "T" followed by the name of the real entry point. That function prints the name of the entry point with Debug32Output and then calls the real driver entry point.
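
The wrapper mechanism described above can be sketched in portable C. This is only an illustration: the real T() and C() definitions in EDDEFLDB.C and the trace macro in PMTRACE.ASM are not reproduced here, so the macro bodies, the PFN_ENTRY type, the MAKE_TRACE_WRAPPER helper, the debug_output() stand-in for Debug32Output, and the simplified no-argument entry point are all assumptions.

```c
#include <assert.h>
#include <string.h>

/* Simplified entry-point type (assumption); real entry points take arguments. */
typedef unsigned long (*PFN_ENTRY)(void);

/* Hypothetical stand-in for Debug32Output: remembers the last name printed. */
static char last_traced[64];

static void debug_output(const char *name)
{
    assert(name != NULL);
    strncpy(last_traced, name, sizeof last_traced - 1);
    last_traced[sizeof last_traced - 1] = '\0';
}

/* A "real" entry point with a placeholder body. */
static unsigned long eddm_ErasePS(void)
{
    return 1UL;
}

/* Plays the role of the trace macro: defines T<name>, which logs the
   entry-point name and then calls the real entry point. */
#define MAKE_TRACE_WRAPPER(name)            \
    static unsigned long T##name(void)      \
    {                                       \
        debug_output(#name);                \
        return name();                      \
    }

MAKE_TRACE_WRAPPER(eddm_ErasePS)

/* In this sketch T() selects the wrapper and C() passes the name through
   unchanged, so C(0) yields an empty slot. */
#define T(name) (T##name)
#define C(name) ((PFN_ENTRY)(name))

static PFN_ENTRY dispatch_sketch[] = {
    C(0),               /* slot with no driver handler */
    T(eddm_ErasePS),    /* traced entry point */
    C(eddm_ErasePS)     /* untraced entry point */
};
```

Calling dispatch_sketch[1]() records "eddm_ErasePS" before running the real function, whereas dispatch_sketch[2]() runs it silently.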

If your driver supports cached-monochrome bit maps, it is often difficult to determine if the code is working correctly while copying monochrome bit maps from system memory to the display in eddh_SrcDestBlt. Almost any monochrome bit map will fit into the bit-map cache, because monochrome bit maps are small. As a result, the code in eddh_SrcDestBlt that handles monochrome data is rarely used. Disabling monochrome bit-map caching in PixBltThroughClipsViaPhunk() enables you to exercise the monochrome code in eddh_SrcDestBlt. (See BitBlt for further details on this.)

If you are familiar with the Microsoft Windows debugger, wdeb386, note that the OS/2 kernel debugger lacks a "z" command to eliminate INT 3 instructions that you added to your code but no longer need. The command e eip 90;g performs the same function: it replaces the byte at the current instruction pointer with a no-op and restarts execution. Another useful debugger command is the DP command, which dumps the page tables. This is useful for getting information about any linear pointer to the video aperture that you obtain.
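
The effect of e eip 90 can be pictured with a small simulation over a byte buffer. On x86, 0xCC is the one-byte INT 3 opcode and 0x90 is NOP. The function name and the buffer are illustrative only; the kernel debugger patches live code rather than an array, and unlike this sketch it does not check what byte it is overwriting.

```c
#include <assert.h>
#include <stddef.h>

#define OPCODE_INT3 0xCCu   /* one-byte breakpoint instruction */
#define OPCODE_NOP  0x90u   /* one-byte no-op */

/* Mimics "e eip 90": overwrite the byte at instruction offset eip with NOP.
   Returns 1 if a breakpoint was replaced, 0 if the byte was left alone. */
static int patch_breakpoint(unsigned char *code, size_t length, size_t eip)
{
    assert(code != NULL);
    if (eip >= length || code[eip] != OPCODE_INT3) {
        return 0;   /* not an INT 3: refuse to touch other instructions */
    }
    code[eip] = OPCODE_NOP;
    return 1;
}
```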

S3.DSP (Sample File for Installation and Configuration)

The following is an example S3 display (DSP) file containing DSPINSTL installation and configuration commands.

:TITLE

S3 DSP
:KEY
S3
:FILES :MODE=PRIMARY
S3VIDEO %BOOTDRIVE%:
DISPLAY.DL_ %BOOTDRIVE%:
PMVIOP.DL_ %BOOTDRIVE%:

*:FILES :MODE=PRIMARY :MODE=DOS

:FILES :MODE=PRIMARY :MODE=WINDOWS
S3WIN    %WINPATH%\SYSTEM

:CONFIG :MODE=PRIMARY
DEVINFO=SCR,VGA,%BOOTDRIVE%:\OS2\VIOTBL.DCP
SET VIDEO_DEVICES=VIO_SVGA
SET VIO_SVGA=DEVICE(BVHVGA,BVHSVGA)

:CONFIG :MODE=PRIMARY :MODE=BIDI
SET VIO_VGA=DEVICE(BVHVGA,BDBVH)

:CONFIG :MODE=PRIMARY :MODE=DOS
DEVICE=%BOOTDRIVE%:\OS2\MDOS\VSVGA.SYS

:DEL_CONFIG_LINE :MODE=PRIMARY

*:DELETING XGA LINES FOR PROTECT MODE
DEVICE=%BOOTDRIVE%:\OS2\XGARING0.SYS
DEVICE=%BOOTDRIVE%:\OS2\XGA.SYS
BASEDEV=XGA.SYS
SET VIO_XGA=DEVICE(BVHVGA,BVHXGA)

*:DELETING BGA LINES FOR PROTECT MODE
DEVINFO=SCR,BGA,%BOOTDRIVE%:\OS2\VIOTBL.DCP
SET VIO_8514A=DEVICE(BVHVGA,BVH8514A)

*:DELETING CGA LINES FOR PROTECT MODE
DEVINFO=SCR,EGA,%BOOTDRIVE%:\OS2\VIOTBL.DCP
SET VIO_CGA=DEVICE(BVHCGA)

*:DELETING EGA LINES FOR PROTECT MODE
DEVINFO=SCR,EGA,%BOOTDRIVE%:\OS2\VIOTBL.DCP
SET VIO_EGA=DEVICE(BVHEGA)

*:DELETING VGA LINES FOR PROTECT MODE
DEVICE=%BOOTDRIVE%:\OS2\MDOS\VVGA.SYS
SET VIO_VGA=DEVICE(BVHVGA)

:DEL_CONFIG_LINE :MODE=PRIMARY :MODE=DOS

*:DELETING S3 CORPS. DRIVERS STATEMENT FOR REAL MODE
DEVICE=%BOOTDRIVE%:\OS2\MDOS\VENH.SYS

*:DELETING XGA LINES FOR REAL MODE
DEVICE=%BOOTDRIVE%:\OS2\MDOS\VXGA.SYS

*:DELETING BGA LINES FOR REAL MODE
DEVICE=%BOOTDRIVE%:\OS2\MDOS\V8514A.SYS

*:DELETING CGA LINES FOR REAL MODE
DEVICE=%BOOTDRIVE%:\OS2\MDOS\VCGA.SYS

*:DELETING EGA LINES FOR REAL MODE
DEVICE=%BOOTDRIVE%:\OS2\MDOS\VEGA.SYS

*:DELETING VGA LINES FOR REAL MODE
DEVICE=%BOOTDRIVE%:\OS2\MDOS\VVGA.SYS

:OS2INI :MODE=PRIMARY
%BOOTDRIVE%:\OS2\INSTALL\REINSTAL.INI
InstallWindow VIOADAPTERSTR 7

:OS2INI :MODE=SECONDARY
%BOOTDRIVE%:\OS2\INSTALL\REINSTAL.INI
InstallWindow VIOADAPTER2STR 7

:OS2INI :MODE=PRIMARY
OS2.INI
PM_DISPLAYDRIVERS  IBMS332        IBMS332
PM_DISPLAYDRIVERS  CURRENTDRIVER  IBMS332
PM_DISPLAYDRIVERS  DEFAULTDRIVER  IBMS332
PM_Fonts           COURIERI
PM_Fonts           HELVI
PM_Fonts           TIMESI

* Note: win.ini font statements are missing. Should be included if
* font support for 1024x768 and 1280x768 is the same as XGA's.

:OS2INI :MODE=PRIMARY :MODE=WINDOWS
OS2.INI
PM_DISPLAYDRIVERS RESOLUTION_CHANGED 1
WIN_RES_640x480x16     WIN_RES_SET    WIN_RES_S3_0
WIN_RES_640x480x256    WIN_RES_SET    WIN_RES_S3_1
WIN_RES_640x480x65536  WIN_RES_SET    WIN_RES_S3_2
WIN_RES_800x600x256    WIN_RES_SET    WIN_RES_S3_3
WIN_RES_800x600x65536  WIN_RES_SET    WIN_RES_S3_4
WIN_RES_1024x768x256   WIN_RES_SET    WIN_RES_S3_5
WIN_RES_1024x768x65536 WIN_RES_SET    WIN_RES_S3_6
WIN_RES_1280x1024x256  WIN_RES_SET    WIN_RES_S3_7
WIN_RES_640x480x16777216  WIN_RES_SET    WIN_RES_S3_8
WIN_RES_S3_0   1  "system.ini boot sdisplay.drv swinvga.drv"
WIN_RES_S3_0   2  "system.ini boot display.drv vga.drv"
WIN_RES_S3_0   3  "system.ini boot fonts.fon vgasys.fon"
WIN_RES_S3_0   4  "system.ini boot fixedfon.fon vgafix.fon"
WIN_RES_S3_0   5  "system.ini boot oemfonts.fon vgaoem.fon"
WIN_RES_S3_0   6  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_0   7  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_0   8  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_0   9  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_0  10  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_0  11  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_0  12  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_0  13  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (VGA res)\"  sserife.fon"
WIN_RES_S3_0  14  "win.ini fonts \"Courier 10,12,15 (VGA res)\"                coure.fon"
WIN_RES_S3_0  15  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (VGA res)\"       serife.fon" 
WIN_RES_S3_0  16  "win.ini fonts \"Symbol 8,10,12,14,18,24 (VGA res)\"         symbole.fon"
WIN_RES_S3_0  17  "win.ini fonts \"Small Fonts (VGA res)\"                     smalle.fon"
WIN_RES_S3_1   1  "system.ini boot sdisplay.drv swins3.drv"
WIN_RES_S3_1   2  "system.ini boot display.drv s3.drv"
WIN_RES_S3_1   3  "system.ini boot fonts.fon vgasys.fon"
WIN_RES_S3_1   4  "system.ini boot fixedfon.fon vgafix.fon"
WIN_RES_S3_1   5  "system.ini boot oemfonts.fon vgaoem.fon"
WIN_RES_S3_1   6  "system.ini boot.description display.drv 640x480"
WIN_RES_S3_1   7  "system.ini boot.description sdisplay.drv 640x480"
WIN_RES_S3_1   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_1   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_1  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_1  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_1  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_1  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_1  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_1  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (VGA res)\"  sserife.fon"
WIN_RES_S3_1  16  "win.ini fonts \"Courier 10,12,15 (VGA res)\"                coure.fon"
WIN_RES_S3_1  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (VGA res)\"       serife.fon"
WIN_RES_S3_1  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (VGA res)\"         symbole.fon"
WIN_RES_S3_1  19  "win.ini fonts \"Small Fonts (VGA res)\"                     smalle.fon"
WIN_RES_S3_2   1  "system.ini boot sdisplay.drv swins316.drv"
WIN_RES_S3_2   2  "system.ini boot display.drv s316.drv"
WIN_RES_S3_2   3  "system.ini boot fonts.fon vgasys.fon"
WIN_RES_S3_2   4  "system.ini boot fixedfon.fon vgafix.fon"
WIN_RES_S3_2   5  "system.ini boot oemfonts.fon vgaoem.fon"
WIN_RES_S3_2   6  "system.ini boot.description display.drv 640x480x64K"
WIN_RES_S3_2   7  "system.ini boot.description sdisplay.drv 640x480x64K"
WIN_RES_S3_2   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_2   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_2  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_2  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_2  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_2  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_2  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_2  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (VGA res)\"  sserife.fon"
WIN_RES_S3_2  16  "win.ini fonts \"Courier 10,12,15 (VGA res)\"                coure.fon"
WIN_RES_S3_2  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (VGA res)\"       serife.fon"
WIN_RES_S3_2  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (VGA res)\"         symbole.fon"
WIN_RES_S3_2  19  "win.ini fonts \"Small Fonts (VGA res)\"                     smalle.fon"
WIN_RES_S3_3   1  "system.ini boot sdisplay.drv swins3.drv"
WIN_RES_S3_3   2  "system.ini boot display.drv s3.drv"
WIN_RES_S3_3   3  "system.ini boot fonts.fon vgasys.fon"
WIN_RES_S3_3   4  "system.ini boot fixedfon.fon vgafix.fon"
WIN_RES_S3_3   5  "system.ini boot oemfonts.fon vgaoem.fon"
WIN_RES_S3_3   6  "system.ini boot.description display.drv 800x600 " 
WIN_RES_S3_3   7  "system.ini boot.description sdisplay.drv 800x600 " 
WIN_RES_S3_3   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_3   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_3  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_3  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_3  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_3  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_3  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_3  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (VGA res)\"  sserife.fon"
WIN_RES_S3_3  16  "win.ini fonts \"Courier 10,12,15 (VGA res)\"                 coure.fon"
WIN_RES_S3_3  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (VGA res)\"       serife.fon"
WIN_RES_S3_3  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (VGA res)\"         symbole.fon"
WIN_RES_S3_3  19  "win.ini fonts \"Small Fonts (VGA res)\"                     smalle.fon"
WIN_RES_S3_4   1  "system.ini boot sdisplay.drv swins316.drv"
WIN_RES_S3_4   2  "system.ini boot display.drv s316.drv"
WIN_RES_S3_4   3  "system.ini boot fonts.fon vgasys.fon"
WIN_RES_S3_4   4  "system.ini boot fixedfon.fon vgafix.fon"
WIN_RES_S3_4   5  "system.ini boot oemfonts.fon vgaoem.fon"
WIN_RES_S3_4   6  "system.ini boot.description display.drv 800x600x64K " 
WIN_RES_S3_4   7  "system.ini boot.description sdisplay.drv 800x600x64K " 
WIN_RES_S3_4   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_4   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_4  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_4  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_4  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_4  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_4  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_4  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (VGA res)\"  sserife.fon"
WIN_RES_S3_4  16  "win.ini fonts \"Courier 10,12,15 (VGA res)\"                coure.fon"
WIN_RES_S3_4  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (VGA res)\"       serife.fon"
WIN_RES_S3_4  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (VGA res)\"         symbole.fon"
WIN_RES_S3_4  19  "win.ini fonts \"Small Fonts (VGA res)\" smalle.fon"
WIN_RES_S3_5   1  "system.ini boot sdisplay.drv swins3.drv"
WIN_RES_S3_5   2  "system.ini boot display.drv s3.drv"
WIN_RES_S3_5   3  "system.ini boot fonts.fon xgasys.fon"
WIN_RES_S3_5   4  "system.ini boot fixedfon.fon xgafix.fon"
WIN_RES_S3_5   5  "system.ini boot oemfonts.fon xgaoem.fon"
WIN_RES_S3_5   6  "system.ini boot.description display.drv 1024x768 " 
WIN_RES_S3_5   7  "system.ini boot.description sdisplay.drv 1024x768 " 
WIN_RES_S3_5   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_5   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_5  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_5  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_5  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_5  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_5  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_5  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (XGA res)\"  sserifg.fon"
WIN_RES_S3_5  16  "win.ini fonts \"Courier 10,12,15 (XGA res)\"                courg.fon"
WIN_RES_S3_5  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (XGA res)\"       serifg.fon"
WIN_RES_S3_5  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (XGA res)\"         symbolg.fon"
WIN_RES_S3_5  19  "win.ini fonts \"Small Fonts (XGA res)\"                     smallg.fon"
WIN_RES_S3_6   1  "system.ini boot sdisplay.drv swins316.drv"
WIN_RES_S3_6   2  "system.ini boot display.drv s316.drv"
WIN_RES_S3_6   3  "system.ini boot fonts.fon xgasys.fon"
WIN_RES_S3_6   4  "system.ini boot fixedfon.fon xgafix.fon"
WIN_RES_S3_6   5  "system.ini boot oemfonts.fon xgaoem.fon"
WIN_RES_S3_6   6  "system.ini boot.description display.drv 1024x768x64K"
WIN_RES_S3_6   7  "system.ini boot.description sdisplay.drv 1024x768x64K"
WIN_RES_S3_6   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_6   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_6  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_6  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_6  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_6  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_6  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_6  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (XGA res)\"  sserifg.fon"
WIN_RES_S3_6  16  "win.ini fonts \"Courier 10,12,15 (XGA res)\"                courg.fon"
WIN_RES_S3_6  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (XGA res)\"       serifg.fon"
WIN_RES_S3_6  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (XGA res)\"         symbolg.fon"
WIN_RES_S3_6  19  "win.ini fonts \"Small Fonts (XGA res)\" smallg.fon"
WIN_RES_S3_7   1  "system.ini boot sdisplay.drv ss31280.drv"
WIN_RES_S3_7   2  "system.ini boot display.drv s31280.drv"
WIN_RES_S3_7   3  "system.ini boot fonts.fon xgasys.fon"
WIN_RES_S3_7   4  "system.ini boot fixedfon.fon xgafix.fon"
WIN_RES_S3_7   5  "system.ini boot oemfonts.fon xgaoem.fon"
WIN_RES_S3_7   6  "system.ini boot.description display.drv 1280x1024"
WIN_RES_S3_7   7  "system.ini boot.description sdisplay.drv 1280x1024"
WIN_RES_S3_7   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_7   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_7  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_7  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_7  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_7  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_7  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_7  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (XGA res)\"  sserifg.fon"
WIN_RES_S3_7  16  "win.ini fonts \"Courier 10,12,15 (XGA res)\"                courg.fon"
WIN_RES_S3_7  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (XGA res)\"       serifg.fon"
WIN_RES_S3_7  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (XGA res)\"         symbolg.fon"
WIN_RES_S3_7  19  "win.ini fonts \"Small Fonts (XGA res)\"                     smallg.fon"
WIN_RES_S3_8   1  "system.ini boot sdisplay.drv swins324.drv"
WIN_RES_S3_8   2  "system.ini boot display.drv s324.drv"
WIN_RES_S3_8   3  "system.ini boot fonts.fon vgasys.fon"
WIN_RES_S3_8   4  "system.ini boot fixedfon.fon vgafix.fon"
WIN_RES_S3_8   5  "system.ini boot oemfonts.fon vgaoem.fon"
WIN_RES_S3_8   6  "system.ini boot.description display.drv 640x480x16M"
WIN_RES_S3_8   7  "system.ini boot.description sdisplay.drv 640x480x16M"
WIN_RES_S3_8   8  "win.ini fonts \"Symbol %ANYSTRING%\"
WIN_RES_S3_8   9  "win.ini fonts \"Helv %ANYSTRING%\"
WIN_RES_S3_8  10  "win.ini fonts \"Tms Rmn %ANYSTRING%\"
WIN_RES_S3_8  11  "win.ini fonts \"Courier %ANYSTRING%\"
WIN_RES_S3_8  12  "win.ini fonts \"MS Sans Serif %ANYSTRING%\"
WIN_RES_S3_8  13  "win.ini fonts \"MS Serif %ANYSTRING%\"
WIN_RES_S3_8  14  "win.ini fonts \"Small Fonts %ANYSTRING%\"
WIN_RES_S3_8  15  "win.ini fonts \"MS Sans Serif 8,10,12,14,18,24 (VGA res)\"  sserife.fon"
WIN_RES_S3_8  16  "win.ini fonts \"Courier 10,12,15 (VGA res)\"                coure.fon"
WIN_RES_S3_8  17  "win.ini fonts \"MS Serif 8,10,12,14,18,24 (VGA res)\"       serife.fon"
WIN_RES_S3_8  18  "win.ini fonts \"Symbol 8,10,12,14,18,24 (VGA res)\"         symbole.fon"
WIN_RES_S3_8  19  "win.ini fonts \"Small Fonts (VGA res)\"  smalle.fon"

* customize icon spacing for s3 below
:WININI :MODE=PRIMARY :MODE=WINDOWS
WIN.INI
Desktop IconSpacing 100

* system.ini entries below will be overwritten by the graphics engine after reboot.
* If the graphics engine fails to update system.ini for some reason,
* the entries below are the default.

:WININI :MODE=PRIMARY :MODE=WINDOWS
SYSTEM.INI
boot sdisplay.drv swins3.drv
boot display.drv s3.drv
boot fonts.fon xgasys.fon 
boot fixedfon.fon xgafix.fon
boot oemfonts.fon xgaoem.fon
boot.description display.drv 1024x768
boot.description sdisplay.drv 1024x768
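
Judging from the sample alone, a DSP file has three kinds of lines: commands that begin with a colon (optionally followed by :MODE= qualifiers), comments that begin with an asterisk, and data lines that belong to the preceding command. The sketch below classifies a line on that basis and counts the :MODE= qualifiers. It does not reflect how DSPINSTL itself parses the file; the type and function names are invented for illustration.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    DSP_BLANK,      /* empty or whitespace-only line */
    DSP_COMMENT,    /* line starting with '*' */
    DSP_COMMAND,    /* line starting with ':' such as ":CONFIG :MODE=PRIMARY" */
    DSP_DATA        /* anything else: belongs to the preceding command */
} DspLineKind;

static DspLineKind dsp_classify(const char *line)
{
    assert(line != NULL);
    while (*line == ' ' || *line == '\t') {
        line++;                     /* skip leading whitespace */
    }
    switch (*line) {
    case '\0': case '\r': case '\n':
        return DSP_BLANK;
    case '*':
        return DSP_COMMENT;
    case ':':
        return DSP_COMMAND;
    default:
        return DSP_DATA;
    }
}

/* Counts the ":MODE=" qualifiers on a command line. */
static int dsp_count_modes(const char *line)
{
    static const char tag[] = ":MODE=";
    int count = 0;

    assert(line != NULL);
    while ((line = strstr(line, tag)) != NULL) {
        count++;
        line += sizeof tag - 1;     /* continue after this qualifier */
    }
    return count;
}
```

Under these assumptions a commented-out command such as "*:FILES :MODE=PRIMARY :MODE=DOS" is treated as a comment, which matches how the sample uses it.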

Deciphering File Names in the S3 Driver

Many of the files in the S3 driver are derived from the original XGA driver for OS/2.

Most of the modules in the XGA driver are named some variant of "edd*.*". The "edd" portion of the file name stands for Expressway Device Driver. This is based on numerous references to the "Expressway hardware" found throughout the driver.

The fourth letter of the filename conveys useful information. In addition, the last four letters in the filename are typically a kind of abbreviation describing what the module actually does. The following is a list of prefixes:

eddb*.* - bitmap creation modules
  eddbcrea.c - bitmap creation code
  eddbdelt.c - bitmap deletion code
  eddbsubrs.c - bitmap creation subroutines
  eddbimag.c - image data function (doesn't fit the pattern)
eddc*.* - driver color support functions
  eddcdith.c - dithering code
  eddcctab.c - color table manipulation functions
edde*.* - initialization code
  eddefldb.c - FillLogicalDeviceBlock - a primary initialization entry point
  eddefpdb.c - FillPhysicalDeviceBlock - another key part of the initialization code
  eddesres.c - obtain desired resolution and color depth for the driver
eddf*.* and ff*.asm - bitmap rendering code, "The MESS"
  eddffast.asm - eddf_MESS - the top level of the software drawing code
  ffbltsd.asm - MESS support for bitblts involving source and destination
  ffbltpd.asm - MESS support for bitblts involving pattern and destination
  ffbltd.asm - MESS support for bitblts involving only the destination
eddh*.asm - hardware-dependent modules. These are the primary files to be altered.
  eddhbblt.asm - driver bitblt support
  eddhgchs.asm - driver text output support
  eddhline.asm - hardware line drawing support
eddl*.* - polyline and poly shortline code
  eddlpoly - eddl_Polyline - entry point for the polyline function
eddm*.* - cursor code, and death and resurrection related code
  eddmccrs.c - color cursor setup code
  eddmcurs.c - monochrome cursor setup code
  eddmdead.c - driver support for death and resurrection
eddn*.* - high-level driver entry points, and caching code
  eddnbblt.c - driver entry point for bitblt
  eddngchs.c - driver entry point for text output
  eddncach.c - font and bitmap caching support code
eddq*.* - query functions
  eddqsres.c - query driver for available resolutions
  eddqesc.c - multimedia escape functions
eddv*.* - AVIO text functions
  eddvsrec.c - eddv_ScrollRect - scroll a rectangle in an AVIO window
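
The fourth-letter convention listed above lends itself to a simple lookup. The following is a minimal sketch for orientation while browsing the source tree; the function name and the returned strings are illustrative and are not part of the driver.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns a short description for an "edd?*.*" file name based on its
   fourth letter, or NULL if the name does not follow the pattern. */
static const char *edd_category(const char *filename)
{
    assert(filename != NULL);
    if (strlen(filename) < 4 ||
        tolower((unsigned char)filename[0]) != 'e' ||
        tolower((unsigned char)filename[1]) != 'd' ||
        tolower((unsigned char)filename[2]) != 'd') {
        return NULL;
    }
    switch (tolower((unsigned char)filename[3])) {
    case 'b': return "bitmap creation";
    case 'c': return "color support";
    case 'e': return "initialization";
    case 'f': return "bitmap rendering (the MESS)";
    case 'h': return "hardware dependent";
    case 'l': return "polyline and poly shortline";
    case 'm': return "cursor, death and resurrection";
    case 'n': return "high-level entry points and caching";
    case 'q': return "query functions";
    case 'v': return "AVIO text";
    default:  return NULL;
    }
}
```

The ff*.asm files belong to the MESS group but do not carry the "edd" prefix, so this sketch returns NULL for them.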

Color Palette Default Values

The following tables list the default values in each of the color palettes.

This table shows the default values for the VGA (4bpp) palette.
Index RRGGBB
0 000000
1 000080
2 008000
3 008080
4 800000
5 800080
6 808000
7 808080
8 CCCCCC
9 0000FF
10 00FF00
11 00FFFF
12 FF0000
13 FF00FF
14 FFFF00
15 FFFFFF
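
For reference, the 16 entries above can be written as a C array of 0xRRGGBB values and split into components as follows. This is a sketch for reading the table, not the driver's own data structure; the RgbTriple type and the function names are assumptions.

```c
#include <assert.h>

/* Default VGA (4bpp) palette from the table above, as 0xRRGGBB values. */
static const unsigned long vga_default_palette[16] = {
    0x000000UL, 0x000080UL, 0x008000UL, 0x008080UL,
    0x800000UL, 0x800080UL, 0x808000UL, 0x808080UL,
    0xCCCCCCUL, 0x0000FFUL, 0x00FF00UL, 0x00FFFFUL,
    0xFF0000UL, 0xFF00FFUL, 0xFFFF00UL, 0xFFFFFFUL
};

typedef struct {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} RgbTriple;   /* illustrative type, not a driver structure */

/* Splits a 0xRRGGBB value into its three 8-bit components. */
static RgbTriple split_rrggbb(unsigned long rrggbb)
{
    RgbTriple rgb;
    rgb.red   = (unsigned char)((rrggbb >> 16) & 0xFFu);
    rgb.green = (unsigned char)((rrggbb >> 8) & 0xFFu);
    rgb.blue  = (unsigned char)(rrggbb & 0xFFu);
    return rgb;
}

/* Looks up a default palette entry; index must be 0..15. */
static RgbTriple vga_default_color(unsigned index)
{
    assert(index < 16u);
    return split_rrggbb(vga_default_palette[index]);
}
```
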
This table shows the default values for the XGA (8bpp) palette.
Index RRGGBB
0 000000
1 800000
2 009200
3 808000
4 0000AA
5 800080
6 0092AA
7 C1C1C1
8 AAFFAA
9 AAB6FF
10 0049AA
11 0049FF
12 006D00
13 006D55
14 006DAA
15 006DFF
16 002400
17 009255
18 0024AA
19 0092FF
20 00B600
21 00B655
22 00B6AA
23 00B6FF
24 00DB00
25 00DB55
26 00DBAA
27 00DBFF
28 FFDBAA
29 00FF55
30 00FFAA
31 FFFFAA
32 2B0000
33 2B0055
34 2B00AA
35 2B00FF
36 2B2400
37 2B2455
38 2B24AA
39 2B24FF
40 2B4900
41 2B4955
42 2B49AA
43 2B49FF
44 2B6D00
45 2B6D55
46 2B6DAA
47 2B6DFF
48 2B9200
49 2B9255
50 2B92AA
51 2B92FF
52 2BB600
53 2BB655
54 2BB6AA
55 2BB6FF
56 2BDB00
57 2BDB55
58 2BDBAA
59 2BDBFF
60 2BFF00
61 2BFF55
62 2BFFAA
63 2BFFFF
64 550000
65 550055
66 5500AA
67 5500FF
68 552400
69 552455
70 5524AA
71 5524FF
72 554900
73 554955
74 5549AA
75 5549FF
76 556D00
77 556D55
78 556DAA
79 556DFF
80 559200
81 559255
82 5592AA
83 5592FF
84 55B600
85 55B655
86 55B6AA
87 55B6FF
88 55DB00
89 55DB55
90 55DBAA
91 55DBFF
92 55FF00
93 55FF55
94 55FFAA
95 55FFFF
96 000055
97 800055
98 002455
99 8000FF
100 802400
101 802455
102 8024AA
103 8024FF
104 804900
105 804955
106 8049AA
107 8049FF
108 806D00
109 806D55
110 806DAA
111 806DFF
112 080808
113 0F0F0F
114 171717
115 1F1F1F
116 272727
117 2E2E2E
118 363636
119 3E3E3E
120 464646
121 4D4D4D
122 555555
123 5D5D5D
124 646464
125 6C6C6C
126 747474
127 7C7C7C
128 FFDB00
129 8B8B8B
130 939393
131 9B9B9B
132 FFB6FF
133 AAAAAA
134 B2B2B2
135 B9B9B9
136 0024FF
137 CCCCCC
138 D1D1D1
139 D8D8D8
140 FFB6AA
141 E8E8E8
142 F0F0F0
143 F7F7F7
144 FFDBFF
145 809255
146 8092AA
147 8092FF
148 80B600
149 80B655
150 80B6AA
151 80B6FF
152 80DB00
153 80DB55
154 80DBAA
155 80DBFF
156 80FF00
157 80FF55
158 80FFAA
159 80FFFF
160 AA0000
161 AA0055
162 AA00AA
163 AA00FF
164 AA2400
165 AA2455
166 AA24AA
167 AA24FF
168 AA4900
169 AA4955
170 AA49AA
171 AA49FF
172 AA6D00
173 AA6D55
174 AA6DAA
175 AA6DFF
176 AA9200
177 AA9255
178 AA92AA
179 AA92FF
180 AAB600
181 AAB655
182 AAB6AA
183 004955
184 AADB00
185 AADB55
186 AADBAA
187 AADBFF
188 AAFF00
189 AAFF55
190 004900
191 AAFFFF
192 D50000
193 D50055
194 D500AA
195 D500FF
196 D52400
197 D52455
198 D524AA
199 D524FF
200 D54900
201 D54955
202 D549AA
203 D549FF
204 D56D00
205 D56D55
206 D56DAA
207 D56DFF
208 D59200
209 D59255
210 D592AA
211 D592FF
212 D5B600
213 D5B655
214 D5B6AA
215 D5B6FF
216 D5DB00
217 D5DB55
218 D5DBAA
219 D5DBFF
220 D5FF00
221 D5FF55
222 D5FFAA
223 D5FFFF
224 FFDB55
225 FF0055
226 FF00AA
227 FFFF55
228 FF2400
229 FF2455
230 FF24AA
231 FF24FF
232 FF4900
233 FF4955
234 FF49AA
235 FF49FF
236 FF6D00
237 FF6D55
238 FF6DAA
239 FF6DFF
240 FF9200
241 FF9255
242 FF92AA
243 FF92FF
244 FFB600
245 FFB655
246 E0E0E0
247 A2A2A2
248 838383
249 FF0000
250 00FF00
251 FFFF00
252 0000FF
253 FF00FF
254 00FFFF
255 FFFFFF
This table shows the default values for the XGA (16bpp) palette.
This is a 5-6-5 format, rrrrrggggggbbbbb, where:
r = red (5 bits)
g = green (6 bits)
b = blue (5 bits)
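
The 5-6-5 layout can be illustrated with a pair of helper functions. This is a sketch under the assumption that 8-bit components are reduced by truncation (keeping the most significant bits); the source does not say how the driver performs the conversion, and the function names are hypothetical.

```c
#include <assert.h>

/* Packs 8-bit components into rrrrrggggggbbbbb by keeping the most
   significant 5, 6, and 5 bits of red, green, and blue. */
static unsigned short pack_rgb565(unsigned red, unsigned green, unsigned blue)
{
    assert(red <= 0xFFu && green <= 0xFFu && blue <= 0xFFu);
    return (unsigned short)(((red >> 3) << 11) |
                            ((green >> 2) << 5) |
                            (blue >> 3));
}

/* Extracts the 5-, 6-, and 5-bit fields again (values are not rescaled). */
static void unpack_rgb565(unsigned short pixel,
                          unsigned *red5, unsigned *green6, unsigned *blue5)
{
    assert(red5 != 0 && green6 != 0 && blue5 != 0);
    *red5   = (pixel >> 11) & 0x1Fu;
    *green6 = (pixel >> 5) & 0x3Fu;
    *blue5  = pixel & 0x1Fu;
}
```

Green receives the extra bit, so pure green occupies the six middle bits (0x07E0) while pure red and pure blue occupy five bits each (0xF800 and 0x001F).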

This chapter is based in part on the VESA SVPMI (Video Electronics Standards Association Super VGA Protect Mode Interface) proposal.