New VGA Display Driver Entry Points

From EDM2

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation

IBM UNCLASSIFIED

Owner: Antonio Zivoli

Introduction

This document describes the additional entry points that give the OS/2 Presentation Manager VGA Display Driver two new capabilities: efficiently tracking the screen region that is updated by Presentation Manager drawing functions, and handling the related data.

The VGA Display Driver modifications can be split into two distinct areas:

  1. Efficient accumulation, maintenance and querying of the clipped screen bounds of all screen drawing operations. This is provided by the following new functions:
    • OpenScreenChangeArea
    • GetScreenChangeArea
    • CloseScreenChangeArea.
  2. Compression of data from a specified area of the screen into a memory buffer, and (usually after transmission over a network) decompression of that data into an internal memory bitmap. Data conversion must be performed between the different data formats (planar to packed), in order to allow data interchange between VGA and XGA, and in general between all the Display Drivers exporting these entry points.
    This is provided by the following new functions:
    • GetScreenBits
    • SetScreenBits.

The majority of the new code is contained within new modules that are simply linked in with the base Driver.

The areas of the core Driver code that must be modified are:

  • the Dispatch Table, which contains the new Entry Points
  • all drawing functions, which contain extra tests to determine whether they should accumulate clipped screen bounds and, if necessary, code to calculate the clipped bounding rectangle
  • Seamless Windows cursor exclusion code, which is modified to call the new bounds accumulation routine.

Dispatch Table

In DISPATCH.ASM/IBMVGA32.DLL the 5 DCAF entry points get included into the Driver Dispatch Table:

  • OpenScreenChangeArea
  • GetScreenChangeArea
  • CloseScreenChangeArea
  • GetScreenBitsStub
  • SetScreenBitsStub

The first three entry points are provided by IBMVGA32.DLL. GetScreenBitsStub and SetScreenBitsStub (SCRAREA.ASM/IBMVGA32.DLL) are stubs for the routines GetScreenBits and SetScreenBits, which actually reside in IBMDEV32.DLL; their addresses are loaded at init time. INIT.ASM/IBMVGA32.DLL attempts to load the following routines using DosQueryProcAddr:

  • GetScreenBits
  • SetScreenBits
  • UnPackBuffer

If all three are found, the fDCAFEnabled flag is set; otherwise it is cleared. The flag thus indicates whether or not the driver supports these new features. Moreover, in DISPATCH.ASM/IBMVGA32.DLL the following functions:

  • SetRectRegion
  • GetRegionBox
  • GetRegionRects
  • CombineRectRegion

get included into the apfnDefDispatch table (this table defines the set of default dispatch functions that the display driver must call back to after processing a hooked function).
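
The init-time probe that sets fDCAFEnabled can be pictured roughly as follows. This is a portable sketch, not the code in INIT.ASM: `resolve_fn` is a hypothetical stand-in for the DosQueryProcAddr lookup, and `DcafEntryPoints`, `probe_dcaf` and the two sample resolvers are names invented for the illustration. Only the three routine names and the all-or-nothing setting of the flag come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*PFN_GENERIC)(void);

/* Hypothetical resolver: returns NULL when the named export is missing.
 * In the driver this role is played by DosQueryProcAddr. */
typedef PFN_GENERIC (*resolve_fn)(const char *name);

typedef struct {
    PFN_GENERIC pfnGetScreenBits;
    PFN_GENERIC pfnSetScreenBits;
    PFN_GENERIC pfnUnPackBuffer;
    int         fDCAFEnabled;      /* 1 only if all three were resolved */
} DcafEntryPoints;

static void probe_dcaf(DcafEntryPoints *ep, resolve_fn resolve)
{
    ep->pfnGetScreenBits = resolve("GetScreenBits");
    ep->pfnSetScreenBits = resolve("SetScreenBits");
    ep->pfnUnPackBuffer  = resolve("UnPackBuffer");
    ep->fDCAFEnabled = ep->pfnGetScreenBits != NULL &&
                       ep->pfnSetScreenBits != NULL &&
                       ep->pfnUnPackBuffer  != NULL;
}

/* Sample resolvers used only to exercise the sketch. */
static void dummy_routine(void) { }

static PFN_GENERIC resolve_all(const char *name)
{
    (void)name;
    return dummy_routine;
}

static PFN_GENERIC resolve_without_unpack(const char *name)
{
    return strcmp(name, "UnPackBuffer") == 0 ? NULL : dummy_routine;
}
```

With `resolve_all` the flag ends up set; with `resolve_without_unpack` a single missing export is enough to leave it clear, which is what lets the stubs reject calls with PMERR_FUNCTION_NOT_SUPPORTED.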

Screen bounds

"Traditional" Presentation Manager bounds are unclipped, and use a single rectangle to define their limits.

In order to better define the bounding area, the changed VGA Display Driver has the ability to maintain one or more clipped, multirectangle regions (or Screen Change Areas) that are updated to indicate areas on the screen that have been drawn to.

This ability is provided by the following changes to the base Driver:

  • Three new entry points to create, query and delete Screen Change Areas:
    • OpenScreenChangeArea (Section 6.1)
    • GetScreenChangeArea (Section 6.2)
    • CloseScreenChangeArea (Section 6.3)
  • A bounds accumulation function, which adds a single rectangle to all of the currently active SCAs (Section 2.3).
  • An extra flag test in the path of every Display Driver drawing function (for COM_SCR_BOUND in the high-order word of FunN, the last parameter of the GreXX calls). If this flag is set and the drawing operation is going to the screen, then the drawing function passes a clipped bounding rectangle of the drawing primitive to the bounds accumulation function described above. The code required to do this is very Driver-specific (Section 2.3).
  • Interception of Windows cursor exclusion calls, passing the supplied exclusion rectangle to the new bounds accumulation function (Section 5.0).

Compression/decompression

The bounding function described above efficiently tracks the regions on the screen that are updated by Presentation Manager drawing functions.

The VGA Display Driver also provides the ability to compress, decompress and (if necessary) convert the format of screen data.

This ability is provided by two new entry points:

  • GetScreenBits (Section 6.4)
  • SetScreenBits (Section 6.5).

Screen Change Areas

SCA definition

A key part of the modifications concerns the maintenance and tracking of Screen Change Areas. These Areas track regions of the screen that are altered by Display Driver drawing routines as efficiently as possible. This is done by using multiple rectangles to define the area, rather than just the usual single bounding rectangle provided by "standard" Presentation Manager bounding functions.

Screen Change Areas are maintained within the Driver in a SCA structure, defined in the new DCAF.INC file:

 SCA     STRUC
 sca_next     DD    ?                                  ; Linked list pointer
 sca_cRects   DD    ?                                  ; No. rects in area
 sca_rect     DB    SIZE RECTL * NUM_SCA_PLUS1 DUP (?) ; Rectangle dimensions
 sca_size     DD    NUM_SCA_PLUS1 DUP (?)              ; Cached rectangle sizes
 SCA     ENDS

Instances of this structure are dynamically created/destroyed upon calls to OpenScreenChangeArea/CloseScreenChangeArea.

A global variable, pStartSCA, points to the most recently created SCA instance. If pStartSCA is null then there are no active Screen Change Areas.

All SCA instances are linked together in a list using the sca_next field of the SCA structure. A null value in this field indicates the end of the list (first created SCA). For example:

Memory loc.
              -------------------    pStartSCA = 250;
              | sca_next = 200  |
              |                 |
              |      4th SCA    |
   0x250      -------------------

              -------------------
              | sca_next = 150  |
              |                 |
              |      3rd SCA    |
   0x200      -------------------

              -------------------
              | sca_next = 100  |
              |                 |
              |      2nd SCA    |
   0x150      -------------------

              -------------------
              | sca_next = 0    |
              |                 |
              |      1st SCA    |
   0x100      -------------------

Each SCA instance can store multiple rectangles, up to NUM_SCA_RECTS (14), which define the area on the screen that has changed since the SCA was created or last queried. These rectangles are stored in the array sca_rect.

The number of rectangles stored within the array is kept in the sca_cRects field (which will never exceed NUM_SCA_RECTS). If sca_cRects is zero then the SCA is a null area - the initial state.

The remaining field in the structure, sca_size, is an array containing the sizes of the rectangles in sca_rect. This is not strictly necessary, because the sizes can be calculated on the fly (using the dimensions in sca_rect). However, when accumulating a rectangle into a Screen Change Area, the size of each of the rectangles is frequently needed. Caching the rectangle sizes in this array saves us having to recalculate the sizes every time, resulting in better performance.

The SCA structure defines space for NUM_SCA_PLUS1 (15) rectangles, but only NUM_SCA_RECTS (14) are ever used to define the SCA. The extra rectangle is used to simplify the routine that accumulates rectangles into the SCA (see Section 2.3).
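
For readers more comfortable with C than MASM, the structure can be mirrored as below. This is an illustrative rendering rather than the driver's own header: the type names `RECTL32` and `SCA32` and the helper `rect_size` are invented here, the rectangle is assumed to hold four 32-bit inclusive coordinates, and the cached "size" is assumed to mean the area in pels. Note also that `sca_next` is a 4-byte DD in the driver, whereas a C pointer may be wider on a modern host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SCA_RECTS 14
#define NUM_SCA_PLUS1 (NUM_SCA_RECTS + 1)   /* one spare slot for merging */

/* Stand-in for the OS/2 RECTL: four 32-bit inclusive coordinates. */
typedef struct {
    int32_t xLeft, yBottom, xRight, yTop;
} RECTL32;

typedef struct SCA32 {
    struct SCA32 *sca_next;                  /* linked list pointer      */
    uint32_t      sca_cRects;                /* no. of rects in the area */
    RECTL32       sca_rect[NUM_SCA_PLUS1];   /* rectangle dimensions     */
    uint32_t      sca_size[NUM_SCA_PLUS1];   /* cached rectangle sizes   */
} SCA32;

/* Assumed meaning of "size": area in pels of an inclusive rectangle. */
static uint32_t rect_size(const RECTL32 *r)
{
    return (uint32_t)(r->xRight - r->xLeft + 1) *
           (uint32_t)(r->yTop - r->yBottom + 1);
}
```

Under these assumptions a full 640x480 screen rectangle (0,0)-(639,479) has a cached size of 307200 pels.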

Creating a new SCA

This task is accomplished by OpenScreenChangeArea (SCRAREA.ASM/IBMVGA32.DLL). To create a new SCA:

  • Allocate memory for the new SCA instance.
  • Set the sca_cRects field to zero.
  • Link the instance into the linked list of SCAs by pointing its sca_next field at the current head of the list.
  • Set the pStartSCA global variable to point to the new SCA.

The newly created SCA is identified by a 32-bit handle, which is actually the address of the SCA within the Display Driver.
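
The creation steps amount to pushing a node onto the head of a singly linked list. A minimal sketch follows, assuming a trimmed-down `Sca` type with only the two fields that matter here and using `malloc` in place of the driver's own allocator; `open_sca` is a name chosen for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Trimmed SCA: only the fields needed to show the list handling. */
typedef struct Sca {
    struct Sca *sca_next;
    unsigned    sca_cRects;
} Sca;

static Sca *pStartSCA = NULL;     /* head of the list: newest SCA first */

/* Returns the new SCA (which doubles as its handle), or NULL when no
 * memory is available. */
static Sca *open_sca(void)
{
    Sca *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->sca_cRects = 0;            /* null area: the initial state        */
    p->sca_next   = pStartSCA;    /* link in front of the previous head  */
    pStartSCA     = p;            /* the new SCA becomes the list head   */
    return p;
}
```

Because the new node's `sca_next` is taken from the old `pStartSCA`, the first SCA ever created ends up with a null link, which is the end-of-list marker described earlier.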

Accumulating a rectangle into a SCA

All Display Driver functions that draw to the screen are modified to accumulate clipped bounding rectangles into all active SCAs when necessary. The drawing functions determine whether they should do this by examining the FunN (Function Number/COM_flags) parameter. If the COM_SCR_BOUND flag is set, and the function is drawing to the screen, then bounding rectangles are accumulated into the active SCAs. Otherwise COM_SCR_BOUND is not set, and the only difference in operation or performance of the VGA base Driver is one additional check of the COM_SCR_BOUND flag per drawing function, which is negligible.

The setting of COM_SCR_BOUND is controlled by the Open/CloseScreenChangeArea functions (see Section 5 for details).
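
The per-call test is a single mask-and-compare. The sketch below shows its shape only: the bit value given to `COM_SCR_BOUND_ASSUMED` is a placeholder chosen for the example (the text says only that the flag lives in the high-order word of FunN), and `should_accumulate` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED bit position, for illustration only. The real value comes
 * from the driver's include files. */
#define COM_SCR_BOUND_ASSUMED 0x02000000u

/* FunN carries the function number in its low word and the COM_ flags
 * in its high word. Accumulate only for screen drawing with the flag on. */
static int should_accumulate(uint32_t FunN, int drawing_to_screen)
{
    return (FunN & COM_SCR_BOUND_ASSUMED) != 0 && drawing_to_screen;
}
```

When no SCA is open the flag is never set, so this one test is the only extra work on the drawing path.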

Let's see how the drawing calls actually do the bounds accumulation:

  • Some of the drawing calls perform the bounds accumulation by calling InnerAccumulateBounds (BOUNDS.ASM/IBMVGA32.DLL). This routine first clips the passed coordinates if needed (for example, PolyLine, DisjointLines and DrawLinesInPath provide unclipped rectangles), then calls the new bounds accumulation routine, AccumulateScreenBound (in SCBOUNDS.ASM/IBMVGA32.DLL).
  • Other drawing calls do their actual drawing under the control of enum_clip_rects (ENUMCLIP.ASM/IBMVGA32.DLL). For these calls, AccumulateScreenBound is called from enum_clip_rects.

The AccumulateScreenBound routine (in SCBOUNDS.ASM/IBMVGA32.DLL) performs the bounds accumulation: its task is to take the passed rectangle and accumulate it into all the current SCAs.

The passed rectangle is in inclusive SCREEN coords.

When a rectangle is added to a Screen Change Area, it is added in a way that minimises the increase in the area of the SCA. The following algorithm does this:

for (pscaCurrent = each SCA in the linked list)
:
: // First check whether the new rect is already contained within this SCA
: for (rclCurrent = each rectangle in current SCA)
: : if rclNew lies completely within rclCurrent
: : : no more work - skip straight to next SCA
: : endif
: endfor

: // We have to add the rectangle to the SCA.
: // First see if there is free space for the rectangle within the SCA.
: if pscaCurrent->sca_cRects < NUM_SCA_RECTS
: : copy rect into SCA
: : calculate size and store in SCA
: : increment pscaCurrent->sca_cRects
: else
: : // All of the SCA rects are used.
: : // Copy the new rect into the SCA at position NUM_SCA_PLUS1 and the
: : // problem then becomes:
: : // We have NUM_SCA_PLUS1 rectangles, which we have to reduce
: : // to NUM_SCA_RECTS by merging two of the rectangles into a single
: : // rectangle.
: : // The pair of rects that we merge are the two that will cause the smallest
: : // increase in area.
: : initialise ulSmallestAreaIncrease to be maximum possible value
: : for (iRect1 = each rectangle in the SCA)
: : : for (iRect2 = iRect1+1 to NUM_SCA_PLUS1)
: : : : // This inner loop is the performance bottleneck.
: : : : // Make it as fast as possible, if you can!!
: : : : if area increase of (iRect1,iRect2) merged < ulSmallestAreaIncrease
: : : : : set ulSmallestAreaIncrease to be area increase of (iRect1,iRect2)
: : : : : merged
: : : : : set best pair of rects to be (iRect1,iRect2)
: : : : endif
: : : endfor
: : endfor
: :
: : merge best pair of rects found into the slot originally occupied by Rect1
: : if rclNew was not one of those merged
: : : copy rclNew into vacant slot made by merging pair
: : endif

: endif
endfor

When the changed area tracking is active, this function is called by every function drawing to the screen. The routine must therefore be as efficient as possible (particularly in the inner loop) to minimize the hit on performance.
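
The pseudocode above translates fairly directly into C. The version below is a sketch for study, not the assembler in SCBOUNDS.ASM: the `Rect` and `Sca` types and all helper names are invented for the illustration, rectangles are assumed inclusive, and "area increase" is assumed to mean the area of the merged bounding box minus the two cached areas (which can be negative when the pair overlaps).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NUM_SCA_RECTS 14
#define NUM_SCA_PLUS1 (NUM_SCA_RECTS + 1)

typedef struct { long xLeft, yBottom, xRight, yTop; } Rect;  /* inclusive */

typedef struct Sca {
    struct Sca *sca_next;
    unsigned    sca_cRects;
    Rect        sca_rect[NUM_SCA_PLUS1];
    long        sca_size[NUM_SCA_PLUS1];    /* cached areas */
} Sca;

static long rect_area(const Rect *r)
{
    return (r->xRight - r->xLeft + 1) * (r->yTop - r->yBottom + 1);
}

static Rect rect_union(const Rect *a, const Rect *b)
{
    Rect u;
    u.xLeft   = a->xLeft   < b->xLeft   ? a->xLeft   : b->xLeft;
    u.yBottom = a->yBottom < b->yBottom ? a->yBottom : b->yBottom;
    u.xRight  = a->xRight  > b->xRight  ? a->xRight  : b->xRight;
    u.yTop    = a->yTop    > b->yTop    ? a->yTop    : b->yTop;
    return u;
}

static int rect_contains(const Rect *outer, const Rect *inner)
{
    return inner->xLeft >= outer->xLeft && inner->xRight <= outer->xRight &&
           inner->yBottom >= outer->yBottom && inner->yTop <= outer->yTop;
}

static void accumulate_into(Sca *sca, const Rect *rclNew)
{
    unsigned i, j, best1 = 0, best2 = 1;
    long smallest = LONG_MAX;
    Rect merged;

    /* Already covered by an existing rectangle: nothing to do. */
    for (i = 0; i < sca->sca_cRects; i++)
        if (rect_contains(&sca->sca_rect[i], rclNew))
            return;

    /* Free slot available: just append. */
    if (sca->sca_cRects < NUM_SCA_RECTS) {
        sca->sca_rect[sca->sca_cRects] = *rclNew;
        sca->sca_size[sca->sca_cRects] = rect_area(rclNew);
        sca->sca_cRects++;
        return;
    }

    /* Full: park the new rect in the spare slot, then find the pair
     * whose merge grows the covered area least. */
    sca->sca_rect[NUM_SCA_RECTS] = *rclNew;
    sca->sca_size[NUM_SCA_RECTS] = rect_area(rclNew);
    for (i = 0; i < NUM_SCA_RECTS; i++) {
        for (j = i + 1; j < NUM_SCA_PLUS1; j++) {
            Rect u = rect_union(&sca->sca_rect[i], &sca->sca_rect[j]);
            long inc = rect_area(&u) - sca->sca_size[i] - sca->sca_size[j];
            if (inc < smallest) {
                smallest = inc;
                best1 = i;
                best2 = j;
            }
        }
    }

    /* Merge the best pair into the first slot of the pair. */
    merged = rect_union(&sca->sca_rect[best1], &sca->sca_rect[best2]);
    sca->sca_rect[best1] = merged;
    sca->sca_size[best1] = rect_area(&merged);

    /* If the new rect was not part of the merge, it takes the freed slot. */
    if (best2 != NUM_SCA_RECTS) {
        sca->sca_rect[best2] = sca->sca_rect[NUM_SCA_RECTS];
        sca->sca_size[best2] = sca->sca_size[NUM_SCA_RECTS];
    }
}

/* Apply the rectangle to every SCA in the linked list. */
static void accumulate_screen_bound(Sca *head, const Rect *rclNew)
{
    Sca *p;
    for (p = head; p != NULL; p = p->sca_next)
        accumulate_into(p, rclNew);
}
```

The spare fifteenth slot is what keeps the pair search uniform: the new rectangle is treated like any other candidate, so no special case is needed for "merge the new rectangle" versus "merge two old ones".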

When the system switches from a Full Screen Session to PM, Resurrection (in WINATTRS.ASM/IBMVGA32.DLL) checks whether the VRAM was successfully restored:

  • If it was, the SCAs can remain as they are, because the PM screen is not going to change. However, in this case a single pel is added to the active SCAs to ensure that some data is accumulated. This is done by calling AccumulateScreenBound and passing a one-pel-wide rectangle.
  • If the VRAM has not been restored, the whole screen is about to be redrawn, so the screen bounds are immediately set to a single rectangle the size of the screen by calling SetFullScreenBounds (SCBOUNDS.ASM/IBMVGA32.DLL). This improves redraw performance.

Deleting a SCA

This task is performed by CloseScreenChangeArea (SCRAREA.ASM/IBMVGA32.DLL). The steps taken are:

  • Unlink the SCA instance from the linked list of SCAs.
  • Free the memory for the SCA instance.

Continuing the example above, if we close the 2nd SCA:

Memory loc.
              -------------------    pStartSCA = 250;
              | sca_next = 200  |
              |                 |
              |      4th SCA    |
   0x250      -------------------

              -------------------
              | sca_next = 150  |
              |                 |
              |      3rd SCA    |
   0x200      -------------------

              -------------------
              | sca_next = 100  |
              |                 |
              |      2nd SCA    |
   0x150      -------------------

              -------------------
              | sca_next = 0    |
              |                 |
              |      1st SCA    |
   0x100      -------------------

we will get:

Memory loc.
              -------------------    pStartSCA = 250;
              | sca_next = 200  |
              |                 |
              |      4th SCA    |
   0x250      -------------------

              -------------------
              | sca_next = 100  |
              |                 |
              |      3rd SCA    |
   0x200      -------------------

              -------------------
              | sca_next = 0    |
              |                 |
              |      1st SCA    |
   0x100      -------------------

If the last remaining SCA is being freed, then pStartSCA is set to NULL. If the most recently created SCA is being freed, then pStartSCA is set to the address of the SCA created immediately before it.
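
Both special cases fall out naturally if the unlink walks a pointer to the link field rather than a pointer to the node. The following is a sketch under the same simplifications as before (a trimmed `Sca` type, `free` in place of the driver's allocator); `close_sca` and `push_sca` are names invented here.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Sca {
    struct Sca *sca_next;
    unsigned    sca_cRects;
} Sca;

static Sca *pStartSCA = NULL;

/* Helper for the example: create an SCA at the head of the list. */
static Sca *push_sca(void)
{
    Sca *p = malloc(sizeof *p);
    if (p != NULL) {
        p->sca_cRects = 0;
        p->sca_next = pStartSCA;
        pStartSCA = p;
    }
    return p;
}

/* Returns 1 if hsca was found, unlinked and freed; 0 otherwise. */
static int close_sca(Sca *hsca)
{
    Sca **link;

    if (hsca == NULL)
        return 0;
    /* 'link' addresses whichever pointer currently leads to the node:
     * pStartSCA itself for the head, or a predecessor's sca_next. */
    for (link = &pStartSCA; *link != NULL; link = &(*link)->sca_next) {
        if (*link == hsca) {
            *link = hsca->sca_next;   /* bypass the node */
            free(hsca);
            return 1;
        }
    }
    return 0;                         /* handle not recognised */
}
```

Closing the head rewrites `pStartSCA` to the next-older SCA, and closing the only remaining SCA leaves it NULL, matching the two rules stated above.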

Compression/Decompression

Compressed data format

Compression is performed by compress_rect (COMPRESS.ASM/IBMDEV32.DLL). The compressed data that is passed between Display Drivers uses a private format: no external application or program has the right to examine, alter, or make any assumptions about the content of this data. This allows the compression method to be improved in later versions of the Driver. Definitions of the data structures are:

1) PACKET HEADER

dd   total_data_packet_length (including header)
dw   data_format

2) RECTANGLE HEADER

dw   xLeft
dw   yBottom
dw   xRight
dw   yTop

3) RECTANGLE DATA

The rectangle data is split into individual rows. Each row is split into run-length encoded blocks ("cells"), each of which comprises a length field followed by one or more data fields.

The size of both the length and data fields is one byte (as specified by the data_format field in the packet header).

The following encoding rules are used:

If the length field contains a "positive" value (most significant bit not set), the single data field that follows is repeated (length) times. Because the most significant bit of the 8-bit length field is used as a flag, this value is limited to a maximum of 127.

If the length field contains a "negative" value (most significant bit set), then (length with the most significant bit cleared) fields of non-repeating data follow. This value is likewise limited to a maximum of 127.

If the length field is zero and the following field is non-zero, the non-zero field is a count of the number of times that the single previous row is repeated. Since the data field size is 8 bits, this value will be limited to a maximum of 127. This will only appear as the first cell in a row, and only after there has been at least one row of data.

If the length field is zero and the following field is zero, the next (i.e. third) field is a count of the number of times that the previous pair of rows is repeated. Since the data field size is 8 bits, this value will be limited to a maximum of 127. This will only appear as the first cell in a row, and only after there have been at least two rows of data.

The following example shows the hexadecimal values of a 4bpp compressed bitmap:

03 04  FA 04 05 07 06 08 02 ...........  00 03  00 00 04  ...
lf df  lf df df df df df df              lf df  lf df df
cell1        cell2                       celln  celln+1

This bitmap would expand as follows (one-digit value represents a color index for a single pixel):

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row1

                                 do three more identical rows (celln):

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row2
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row3
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row4

                                 repeat the previous pair of rows four times (celln+1):

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row5
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row6

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row7
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row8

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row9
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row10

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row11
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row12
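
The encoding rules can be checked with a small decoder. The sketch below is not UnPackBuffer: `decode_rect` is a name invented here, the row width in bytes is assumed to be known in advance (in practice it would follow from the rectangle header), and cells are assumed never to straddle a row boundary. Malformed input makes it return -1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decodes run-length cells from src into dst, which must hold max_rows
 * rows of row_bytes bytes. Returns the number of rows produced, or -1
 * if the stream is malformed or does not fit. */
static int decode_rect(const unsigned char *src, size_t src_len,
                       unsigned char *dst, size_t row_bytes, size_t max_rows)
{
    size_t in = 0, row = 0;

    while (in < src_len && row < max_rows) {
        if (src[in] == 0) {
            /* Row-repeat cell: 00 nn (single row) or 00 00 nn (row pair). */
            size_t span, count, k;
            if (in + 1 >= src_len) return -1;
            if (src[in + 1] != 0) {
                span = 1; count = src[in + 1]; in += 2;
            } else {
                if (in + 2 >= src_len) return -1;
                span = 2; count = src[in + 2]; in += 3;
            }
            if (row < span) return -1;      /* not enough rows to repeat */
            for (k = 0; k < count * span; k++) {
                if (row >= max_rows) return -1;
                memcpy(dst + row * row_bytes,
                       dst + (row - span) * row_bytes, row_bytes);
                row++;
            }
        } else {
            /* Ordinary row: cells until row_bytes bytes are produced. */
            size_t col = 0;
            while (col < row_bytes) {
                unsigned char lf;
                size_t n;
                if (in >= src_len) return -1;
                lf = src[in++];
                n = lf & 0x7F;              /* length without the flag bit */
                if (n == 0 || col + n > row_bytes) return -1;
                if (lf & 0x80) {            /* n literal data fields */
                    if (in + n > src_len) return -1;
                    memcpy(dst + row * row_bytes + col, src + in, n);
                    in += n;
                } else {                    /* one data field repeated n times */
                    if (in >= src_len) return -1;
                    memset(dst + row * row_bytes + col, src[in++], n);
                }
                col += n;
            }
            row++;
        }
    }
    return (int)row;
}
```

For a 4-byte-wide rectangle, the stream `03 04 81 05 00 03 84 01 02 03 04 00 00 02` yields nine rows: one encoded row, three copies of it, a second encoded row, and then two repetitions of the pair formed by the last two rows.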

Standard Golomb Run Length Encoding compression is inefficient at compressing 2x2 dither patterns, which are commonly used by Presentation Manager Display Drivers. The modified compression algorithm handles these patterns efficiently because:

  • the data field size is such that when compressing a row, pairs of adjacent pels are put in each data field.
  • when searching for duplicate scanlines, the algorithm also searches for duplicate scanline pairs which will match and compress patterns that repeat on alternate scanlines.

The actual pixel data is stored in Motorola format i.e. the most significant bits of a byte contain the leftmost pels. So, if we have a pair of pixels (PEL1,PEL2):

  • PEL1 goes in bits 7..4
  • PEL2 goes in bits 3..0

All 4bpp data is defined as indices into the standard VGA palette. (See Appendix A for details of these formats).

All changed Drivers must convert their own internal format into one of these "standard" formats before transmission (see Section 4.0).

Data conversion

The changed Display Drivers use differing internal data formats:

VGA:    4bpp planar
XGA:    4bpp packed, 8bpp packed, 16bpp packed

When data is transmitted between Display Drivers, it is sent at the lower bpp of the two communicating Drivers. So if the VGA Display Driver has to replay data sent by an XGA Display Driver, the XGA data must be 4bpp, planar or linear.

In the same way, to send data from a VGA to an XGA Display Driver, the VGA data is queried at 4bpp, planar or linear.

Therefore the conversion routines required by the VGA Display Driver are:

VGA (4bpp planar)

                  internal format         required format

(compression)     4bpp planar         ->  4bpp packed

                  external data format    internal format

(decompression)   4bpp packed         ->  4bpp planar

The conversions from packed to planar and vice versa are assisted by using a lookup table to "split" packed bytes into bits that can conveniently be reassembled into planar format (and vice versa).
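
One way such a lookup table can work is shown below. It is an illustration of the idea, not the driver's table: the `spread` table, the function names, and the convention that plane p carries bit p of each color index are all assumptions made for the sketch. Each plane byte holds one bit for each of 8 pels, leftmost pel in bit 7; the table spreads those 8 bits across the 8 nibbles of a 32-bit word so that four lookups, shifted and ORed, produce 8 packed pels at once.

```c
#include <assert.h>
#include <stdint.h>

/* spread[b] places bit k of b into the low bit of nibble k. */
static uint32_t spread[256];

static void init_spread(void)
{
    unsigned b, bit;
    for (b = 0; b < 256; b++) {
        uint32_t v = 0;
        for (bit = 0; bit < 8; bit++)
            if (b & (1u << bit))
                v |= (uint32_t)1 << (4 * bit);
        spread[b] = v;
    }
}

/* 8 pels: one byte from each of the four planes -> 4 packed bytes,
 * leftmost pel in the high nibble of out[0]. */
static void planar_to_packed8(const uint8_t plane[4], uint8_t out[4])
{
    uint32_t v = spread[plane[0]] | (spread[plane[1]] << 1) |
                 (spread[plane[2]] << 2) | (spread[plane[3]] << 3);
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Inverse direction, written with plain loops for clarity. */
static void packed_to_planar8(const uint8_t in[4], uint8_t plane[4])
{
    uint32_t v = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    unsigned p, k;
    for (p = 0; p < 4; p++) {
        plane[p] = 0;
        for (k = 0; k < 8; k++)
            plane[p] |= (uint8_t)(((v >> (4 * k + p)) & 1u) << k);
    }
}
```

`init_spread` must be called once before converting. As a check, plane bytes C0, 80, 80, 80 describe a leftmost pel of color F followed by a pel of color 1, and pack to the bytes F1 00 00 00.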

Seamless Windows support

In OS/2 2.1, Seamless Windows is supported by allowing the Windows Display Driver to draw directly on the Presentation Manager screen. This means that Seamless Window updates do not go through the Presentation Manager drawing functions, and therefore will not update the active SCA in the usual way. Seamless Windows therefore requires special treatment.

Prior to drawing on the PM screen, the Seamless Windows Driver calls the PM Driver through an exported Entry Point (SeamlessExcludeCursor in SCBOUNDS.ASM/IBMVGA32.DLL) to exclude the PM cursor from the area that it is about to draw in.

The modified VGA Display Driver intercepts this call, and passes the rectangle coordinates to the bounds accumulation routine.

During PM Display Driver initialization, Seamless Windows must be granted addressability to all data and code that it will access during the call to the SeamlessExcludeCursor routine. The supplied rectangle should end up in all active SCAs, but these could reside anywhere in the Display Driver heap. The driver therefore keeps a single, static SCA, called SeamlessSCA.

All Seamless bounding rectangles will be accumulated into this SCA, and then merged with the contents of the normal SCAs when one is queried (using GetScreenChangeArea).

So, at init time, in DeviceSeamlessInit (EGASTATE.ASM/IBMDEV32.DLL) the Seamless Driver is given access to the SeamlessExcludeCursor routine and to the SeamlessSCA structure. Their addresses are stored in the smDCAF and smSeamlessSCA global variables, owned by the Windows Driver.

Before writing to the screen, the Seamless Driver will call the SeamlessExcludeCursor routine in SCBOUNDS.ASM, via a thunk (STHUNK.ASM/IBMDEV32.DLL).

The code will check whether the new bounds accumulation is needed by checking the value of the pStartSCA pointer, and will call AccumulateScreenBound, thus causing the rectangle supplied to SeamlessExcludeCursor to be added only to scaSeamless. When a SCA is queried (using GetScreenChangeArea) the Seamless SCA is merged with all of the active SCAs, and then reset to be null.
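
The deferred merge can be sketched as below. The types and names (`SeamlessArea`, `merge_seamless`, `accumulate_fn`, `count_rects`) are invented for the illustration, and the accumulation routine is passed in as a callback so the fragment stands on its own; in the driver that role is played by AccumulateScreenBound.

```c
#include <assert.h>

#define NUM_SCA_RECTS 14

typedef struct { long xLeft, yBottom, xRight, yTop; } Rect;

/* The single static area that Seamless Windows updates are parked in. */
typedef struct {
    unsigned cRects;
    Rect     rect[NUM_SCA_RECTS];
} SeamlessArea;

/* Stand-in for the routine that adds one rectangle to all active SCAs. */
typedef void (*accumulate_fn)(const Rect *rcl, void *ctx);

/* On query: replay every parked rectangle into the active SCAs, then
 * reset the Seamless area to null. */
static void merge_seamless(SeamlessArea *sm, accumulate_fn accumulate, void *ctx)
{
    unsigned i;
    for (i = 0; i < sm->cRects; i++)
        accumulate(&sm->rect[i], ctx);
    sm->cRects = 0;
}

/* Example callback: just counts how many rectangles were replayed. */
static void count_rects(const Rect *rcl, void *ctx)
{
    (void)rcl;
    ++*(unsigned *)ctx;
}
```

Parking the rectangles in one fixed structure means the Seamless Driver only ever needs addressability to that structure, never to the heap-allocated SCAs.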

New Entry Points

OpenScreenChangeArea (in SCRAREA.ASM/IBMVGA32.DLL)

OpenScreenChangeArea PROC SYSCALL USES EBX ESI EDI,

hdc    :HDC,
hddc   :PVOID,
FunN   :ULONG

USER PARAMETERS:

HDC hdc:
      any DC handle

RETURN:

HSCA hsca:
          handle of the new SCA
GPI_ERROR:
          There was no memory available, so return an error.
          The memory allocation failure will have been
          logged by the memory allocation routine

ERRORS:

PMERR_INV_HDC
PMERR_FUNCTION_NOT_SUPPORTED
PMERR_MEMORY_ALLOCATION_ERR

MAIN TASKS:

This routine will allocate a data area internal to the display
driver in which the driver will accumulate screen changes. It
returns a 32 bit handle which is required to identify the area in
GetScreenChangeArea and CloseScreenChangeArea calls

This entry point first enters the Driver by calling enter_driver (ENTER.ASM/IBMVGA32.DLL); this routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks. If enter_driver fails, OpenScreenChangeArea returns GPI_ERROR. Before going ahead, the fDCAFEnabled flag is checked to see whether the IBMDEV32.DLL currently in use supports these new features (see Section 1.1): if it does not, the PMERR_FUNCTION_NOT_SUPPORTED error is logged and the call returns GPI_ERROR.

The routine then attempts to create a new SCA instance (see Section 2.2), allocating a memory area internal to the display driver. This is done by calling AllocMem, a common routine in MEMMAN.ASM/IBMVGA32.DLL.

The newly created SCA is then initialised to null and added to the SCA linked list. If the creation succeeds and this is the first SCA to be opened, the driver must start accumulating screen bounds, by telling the Graphics Engine to turn on the COM_SCR_BOUND flag.

GreSetProcessControl is used to do so. Setting this bit causes subsequent screen drawing functions to accumulate their clipped bounds into the active SCAs.

A pointer to the SCA is returned as the 32-bit handle to the SCA. This handle is required to identify the area in GetScreenChangeArea and CloseScreenChangeArea calls. If there is not enough memory to create the SCA then GPI_ERROR is returned. Before returning, exit_driver (ENTER.ASM) is called.

GetScreenChangeArea (in SCRAREA.ASM/IBMVGA32.DLL)

GetScreenChangeArea PROC SYSCALL USES EBX EDI ESI,

hdc    :HDC,
hsca   :PVOID,
phrgn  :PVOID,
hddc   :PVOID,
FunN   :ULONG

USER PARAMETERS:

HDC hdc:
      any valid DC
HSCA hsca:
      handle of the SCA to be queried; to be valid, it
      must be obtained from a previous call to OpenScreenChangeArea
PHRGN phrgn:
      pointer to a region handle

ERRORS:

PMERR_INV_HDC
PMERR_FUNCTION_NOT_SUPPORTED
PMERR_NO_HANDLE

RETURN:

TRUE:
     if passed a handle we recognize and all was OK;
FALSE:
     in all other cases.

MAIN TASKS:

This routine takes a Screen Change Area handle, and for the SCA
identified adds its rectangles to the region pointed to by the
phrgn parameter.
The SCA is reset to NULL as a result of this call.

This entry point first enters the Driver by calling enter_driver (ENTER.ASM/IBMVGA32.DLL); this routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks. If enter_driver fails, GetScreenChangeArea returns FALSE. Before going ahead, the fDCAFEnabled flag is checked to see whether the IBMDEV32.DLL currently in use supports these new features (see Section 1.1): if it does not, the PMERR_FUNCTION_NOT_SUPPORTED error is logged and the call returns FALSE.

The next step is to check whether there are any rectangles in the Seamless SCA (SeamlessSCA). If there are, they are merged with all of the currently active SCAs. This is done by the MergeSeamlessSCAWithGlobalSCAs routine (SCBOUNDS.ASM/IBMVGA32.DLL), which calls AccumulateScreenBound for every rectangle in scaSeamless. At the end, the cRects field of scaSeamless is set to zero.

Then a check is done on the passed SCA handle: if it is zero, PMERR_NO_HANDLE is logged and FALSE is returned.

If the handle is non-zero, the routine walks the linked list until it either finds the SCA to be queried or reaches the end of the list. If the SCA is found, GreCombineRectRegion is called for each rectangle in the SCA, adding (CRGN_OR) it to the region passed by the user.

If every call succeeds, the cRects field of the SCA is set to 0, resetting the SCA. Otherwise, if GreCombineRectRegion returns an error, the stored data in the SCA is set to a single rectangle the size of the screen, so that any data that may have been missed will be picked up on the next call. In this latter case, FALSE is still returned for logging purposes.

If the end of the list was reached without finding the needed SCA, a PMERR_NO_HANDLE error is logged, and a FALSE return code is returned.

If everything went OK, the routine calls exit_driver and returns TRUE.

CloseScreenChangeArea (in SCRAREA.ASM/IBMVGA32.DLL)

CloseScreenChangeArea PROC SYSCALL USES EBX EDI ESI,

hdc    :HDC,
hsca   :PVOID,
hddc   :PVOID,
FunN   :ULONG

USER PARAMETERS:

HDC hdc:
     any valid DC
HSCA hsca:
     handle of the SCA to be closed; to be valid, it
     must be obtained from a previous call to
     OpenScreenChangeArea

ERRORS:

PMERR_INV_HDC
PMERR_FUNCTION_NOT_SUPPORTED
PMERR_NO_HANDLE

RETURN:

Error conditions are not obvious here, but we will return TRUE
if passed a handle we recognize, and FALSE in all other cases.

MAIN TASKS:

This routine frees the data area internal to the display driver,
identified by the SCA handle, which was accumulating screen
changes. It returns a Boolean value indicating success or failure.

This entry point first enters the Driver by calling enter_driver (ENTER.ASM/IBMVGA32.DLL); this routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks. If enter_driver fails, CloseScreenChangeArea returns FALSE. Before going ahead, the fDCAFEnabled flag is checked to see whether the IBMDEV32.DLL currently in use supports these new features (see Section 1.1): if it does not, the PMERR_FUNCTION_NOT_SUPPORTED error is logged and the call returns FALSE.

Then a check is done on the passed SCA handle: if it is zero, PMERR_NO_HANDLE is logged and FALSE is returned.

The entry point then checks that the hsca parameter matches one of the existing SCAs. If so, it unlinks the SCA from the linked list and frees its memory (see Section 2.4) using FreeMem (in MEMMAN.ASM/IBMVGA32.DLL).

If the SCA being closed is the last remaining one (i.e. there are no more in the linked list), the Graphics Engine function GreSetProcessControl is called to turn off the COM_SCR_BOUND bit on subsequent calls to the Display Driver. The resetting of this bit means that no further screen bounds accumulation will occur until another call is made to OpenScreenChangeArea.

If the hsca parameter does not match any of the existing SCAs, FALSE is returned.

Before returning, exit_driver is called.

GetScreenBits (in SCRAREA.ASM/IBMVGA32.DLL and SCRBITS.ASM/IBMDEV32.DLL)

GetScreenBitsStub  PROC SYSCALL,
 hdc       :HDC,
 hrgn      :ULONG,
 pDest     :PVOID,
 pulLength :PVOID,
 flCmd     :ULONG,
 hddc      :PVOID,
 FunN      :ULONG

This stub, in SCRAREA.ASM/IBMVGA32.DLL, is the entry point whose address is put in the dispatch table. It first checks the fDCAFEnabled flag to see whether the IBMDEV32.DLL currently in use supports these new features: if it does not, the PMERR_FUNCTION_NOT_SUPPORTED error is logged and the call returns FALSE. Otherwise the actual routine, which is provided by IBMDEV32.DLL and whose address was loaded at init time, is called.

GetScreenBits PROC SYSCALL USES EBX EDI ESI,
 hdc       :HDC,
 hrgn      :ULONG,
 pDest     :PVOID,
 pulLength :PVOID,
 flCmd     :ULONG,
 hddc      :PVOID,
 FunN      :ULONG

USER PARAMETERS:

HDC hdc:
      any valid direct (Screen) DC handle
HRGN hrgn:
      the area of the screen to be fetched. It can be either a valid
      region handle or a pointer to an (inclusive) RECTL, as specified
      by a flag in the flCmd parameter.
PBYTE pDest:
      a pointer to the memory buffer where the compressed data will
      be written.
PULONG pulLength:
      a pointer to a ULONG, which must contain the length of the
      memory buffer pointed to by pDest. Valid values range from
      2071 bytes to 64K bytes. Upon exit, the ULONG contains the
      number of bytes stored in the memory buffer.
ULONG flCmd:
      option flags. These specify the format (bits per pel, linear or
      planar) of the data to be put in the memory buffer, and whether
      the hrgn parameter contains a region handle or a pointer to a
      RECTL. Options available are:
        GSB_OPT_4BPP    0000H
        GSB_OPT_LINEAR  0000H
        GSB_OPT_PLANAR  0008H
        GSB_OPT_HRGN    0010H
      The options are ORed together to make the flCmd parameter.

RETURN:

1: the entire area was successfully compressed in buffer. Upon exit, the supplied region/rectangle will be NULL;
2: a subset of the area was saved in buffer, because the buffer was not big enough for the whole area. Upon exit, the supplied region/rectangle will be updated to contain the area that was NOT compressed. This implies that GetScreenBits will have to be called again to complete the area compression.
0: an error occurred

ERRORS:

 PMERR_FUNCTION_NOT_SUPPORTED
 PMERR_INV_IN_PATH
 PMERR_INV_IN_AREA
 PMERR_INV_LENGTH_OR_COUNT
 PMERR_INV_FORMAT_CONTROL
 PMERR_INV_IMAGE_DIMENSION
 PMERR_INV_DC_TYPE
 PMERR_PEL_NOT_AVAILABLE

MAIN TASKS:

This routine queries a region of screen pixel data and saves it into the memory provided by the caller. The data is compressed, and can be converted into a format suitable for another supported display device. The routine stops when either:
  • the supplied memory area is full
  • the requested region has been returned.
The region can be specified as either:
  • a pointer to a single rectangle (RECTL - long values)
  • a region handle
setting the GSB_OPT_HRGN flag in the flCmd parameter accordingly (if it is set, take a region; if it is not set, take a rect).
If a RECTL is specified then it is assumed to be inclusive.
The function modifies the supplied rectangle/region to indicate the area that was NOT returned in the call. If the whole requested region was returned the rectangle/region will be a null area.
The supplied DC must be direct - it is the source of the pixel data.
This is not a drawing primitive, therefore no correlation, bounds accumulation or drawing will take place.
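
The calling pattern implied by return codes 1 and 2 can be sketched in C as follows. This is only an illustration: FAKEAREA and fake_get_screen_bits are hypothetical stand-ins that model the contract described above (length in, bytes stored out, area reduced to what was not returned), and the cost of 100 bytes per row is arbitrary.

```c
#include <stdint.h>

/* Return codes as documented above. */
enum { GSB_ERROR = 0, GSB_COMPLETE = 1, GSB_PARTIAL = 2 };

/* Hypothetical stand-in for the area argument: rows still to fetch. */
typedef struct { uint32_t rowsLeft; } FAKEAREA;

/* Hypothetical stand-in for one GetScreenBits call. Each row is
   pretended to cost 100 bytes. On return *pulLength holds the bytes
   stored and the area holds what was NOT returned. */
static int fake_get_screen_bits(FAKEAREA *area, uint32_t *pulLength)
{
    const uint32_t rowCost = 100;
    uint32_t fit = *pulLength / rowCost;
    if (fit == 0)
        return GSB_ERROR;            /* buffer too small for one row */
    if (fit > area->rowsLeft)
        fit = area->rowsLeft;
    area->rowsLeft -= fit;
    *pulLength = fit * rowCost;
    return area->rowsLeft == 0 ? GSB_COMPLETE : GSB_PARTIAL;
}

/* Caller loop: repeat until the whole area has been returned.
   Returns the number of calls made, or -1 on error. */
static int fetch_whole_area(FAKEAREA *area, uint32_t bufSize)
{
    int calls = 0;
    for (;;) {
        uint32_t len = bufSize;      /* reset: the call overwrites it */
        int rc = fake_get_screen_bits(area, &len);
        if (rc == GSB_ERROR)
            return -1;
        calls++;                     /* len bytes would be sent here */
        if (rc == GSB_COMPLETE)
            return calls;
        /* GSB_PARTIAL: area now holds the remainder; call again. */
    }
}
```

The point of the sketch is that the length must be reset before every call, since the routine overwrites it with the number of bytes stored.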

This entry point first checks if the passed hddc is valid, then it enters the Driver by calling enter_driver (ENTER.ASM/IBMVGA32.DLL); this routine is called whenever the driver is entered, gets the driver semaphore and performs entry checks. If enter_driver fails, GetScreenBits returns FALSE.

The next task is to fill in - by calling init_devinfo_struc (in SCRBITS.ASM/IBMDEV32.DLL) - a device description structure, containing data such as the screen dimensions, the DC type and so on.

Then GetScreenBits performs some initial error tests, checking the FunN parameter and the DC type: if the COM_PATH or the COM_AREA flags are set in FunN, the routine logs PMERR_INV_IN_PATH or PMERR_INV_IN_AREA, and returns 0; moreover, if the DC is not a direct one, PMERR_INV_DC_TYPE is logged and the routine returns 0.

The next step is a validity check on the parameters. The buffer size must range between MIN_BUFFER_SIZE (346 bytes, the length of a buffer containing only one compressed row at the maximum resolution, when the input row contains only non-repeating data) and 65536 bytes (a limit due to the PHUNK memory); otherwise PMERR_INV_LENGTH_OR_COUNT is logged and the routine returns 0. The flCmd parameter must be made up of one or more of the flags we know; otherwise PMERR_INV_FORMAT_CONTROL is logged and the routine returns 0.

Moreover, we must check that PM is in the foreground (fGrimReaper TRUE); otherwise PMERR_PEL_NOT_AVAILABLE is logged and the routine returns 0. If all of these checks pass, the CompressScreenBits routine (in COMPRESS.ASM/IBMDEV32.DLL) is called.
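
The order of the checks described above can be summarised in a short C sketch. It is an illustration only: the CHECKRESULT codes and the GSBARGS structure are hypothetical (the real PMERR_* values and the way the Driver obtains each piece of state are not reproduced), and the set of accepted flCmd bits is assumed to be just the flags listed in this section.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_BUFFER_SIZE 346u     /* from the text above */
#define MAX_BUFFER_SIZE 65536u   /* from the text above */
#define GSB_OPT_PLANAR  0x0008u
#define GSB_OPT_HRGN    0x0010u

/* Hypothetical result codes, one per error named in the text. */
typedef enum {
    CHK_OK,
    CHK_INV_IN_PATH,
    CHK_INV_IN_AREA,
    CHK_INV_DC_TYPE,
    CHK_INV_LENGTH_OR_COUNT,
    CHK_INV_FORMAT_CONTROL,
    CHK_PEL_NOT_AVAILABLE
} CHECKRESULT;

/* Hypothetical snapshot of the state the checks look at. */
typedef struct {
    bool     comPath;         /* COM_PATH set in FunN */
    bool     comArea;         /* COM_AREA set in FunN */
    bool     dcIsDirect;      /* DC type is direct (screen) */
    uint32_t bufLen;          /* value pointed to by pulLength */
    uint32_t flCmd;
    bool     pmInForeground;
} GSBARGS;

/* Apply the checks in the order given in the text. */
static CHECKRESULT gsb_check_args(const GSBARGS *a)
{
    if (a->comPath)     return CHK_INV_IN_PATH;
    if (a->comArea)     return CHK_INV_IN_AREA;
    if (!a->dcIsDirect) return CHK_INV_DC_TYPE;
    if (a->bufLen < MIN_BUFFER_SIZE || a->bufLen > MAX_BUFFER_SIZE)
        return CHK_INV_LENGTH_OR_COUNT;
    /* Assumption: only the flags listed in this section are valid. */
    if (a->flCmd & ~(GSB_OPT_PLANAR | GSB_OPT_HRGN))
        return CHK_INV_FORMAT_CONTROL;
    if (!a->pmInForeground)
        return CHK_PEL_NOT_AVAILABLE;
    return CHK_OK;
}
```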

CompressScreenBits checks if the format is valid; this is done by creating a word containing the internal (screen) format and the external (requested) format.

The gsb_formats_table table is then scanned for a match; each entry in this table consists of a pair of bytes. The first (low) byte represents the internal data format, the second byte represents the external (requested) data format.

The pairs that can be requested are:

Internal (screen) format         Requested format

GSB_OPT_4BPP or GSB_OPT_PLANAR,  GSB_OPT_4BPP or GSB_OPT_LINEAR
GSB_OPT_4BPP or GSB_OPT_PLANAR,  GSB_OPT_4BPP or GSB_OPT_PLANAR

If no match is found then the requested format is invalid, the PMERR_INV_FORMAT_CONTROL is logged and GetScreenBits returns FALSE.

If a match is found then the requested format is valid, and the index into the table tells us which conversion routine should be used. The gsb_row_conv_table contains the addresses of the routines that perform the conversion/compression for each combination of internal and external formats. The order of this table MUST match that of gsb_formats_table above.
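
The two parallel tables can be pictured with the following C sketch. It is an illustration under stated assumptions, not the Driver's code: the table and function names are hypothetical, and the second table holds routine names as strings where the Driver holds routine addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define GSB_OPT_4BPP   0x00u
#define GSB_OPT_LINEAR 0x00u
#define GSB_OPT_PLANAR 0x08u

/* One word per entry: low byte = internal format, high byte = external. */
#define FMT_PAIR(internal, external) \
    ((uint16_t)((internal) | ((external) << 8)))

static const uint16_t formats_table[] = {
    FMT_PAIR(GSB_OPT_4BPP | GSB_OPT_PLANAR, GSB_OPT_4BPP | GSB_OPT_LINEAR),
    FMT_PAIR(GSB_OPT_4BPP | GSB_OPT_PLANAR, GSB_OPT_4BPP | GSB_OPT_PLANAR),
};

/* Parallel table: entry i serves formats_table[i]. Names stand in for
   the routine addresses so that the sketch stays self-contained. */
static const char *const row_conv_names[] = {
    "compress_row_4pl_4",
    "compress_row_4pl_4pl",
};

/* Return the index of the matching pair, or -1 if the combination is
   not supported (the PMERR_INV_FORMAT_CONTROL case). */
static int find_format(uint8_t internal, uint8_t external)
{
    uint16_t key = FMT_PAIR(internal, external);
    size_t n = sizeof formats_table / sizeof formats_table[0];
    for (size_t i = 0; i < n; i++)
        if (formats_table[i] == key)
            return (int)i;
    return -1;
}

/* Return the routine name for a format pair, or NULL if unsupported. */
static const char *conv_name_for(uint8_t internal, uint8_t external)
{
    int i = find_format(internal, external);
    return i < 0 ? NULL : row_conv_names[i];
}
```

Keeping the format pairs and the routines in two tables indexed alike is what makes the ordering requirement above essential.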

If the query area is a rectangle, we make it exclusive and put it into arclBuffer (our local buffer).

If the query area is a region, we ask the Graphics Engine (by GetRegionBox) for the bounding box, putting it into rclBound. Then we ask for the first 10 rectangles from the query area, putting them in arclBuffer.

A check is done to see if the bounding box (for the region) or the rectangle exceeds the screen dimensions: if so, PMERR_INV_IMAGE_DIMENSION is logged and the call returns 0.
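
A minimal C sketch of the inclusive-to-exclusive conversion and of the screen bounds test follows. The RECTL structure is redeclared locally with the usual OS/2 field order; the function names are hypothetical, and the screen is assumed to have its origin at (0, 0).

```c
#include <stdbool.h>

/* Local copy of the OS/2 RECTL layout, to keep the sketch self-contained. */
typedef struct { long xLeft, yBottom, xRight, yTop; } RECTL;

/* Turn an inclusive rectangle into an exclusive one: the right and top
   edges move out by one pel, so width is simply xRight - xLeft. */
static void make_exclusive(RECTL *r)
{
    r->xRight += 1;
    r->yTop   += 1;
}

/* True if an exclusive rectangle does not fit on a cx-by-cy screen. */
static bool exceeds_screen(const RECTL *r, long cx, long cy)
{
    return r->xLeft < 0 || r->yBottom < 0 ||
           r->xRight > cx || r->yTop > cy;
}
```

For example, the inclusive rectangle (0, 0)-(639, 479) becomes the exclusive rectangle (0, 0)-(640, 480), which exactly covers a 640 by 480 screen.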

Then we move the destination pointer past the packet header and we exclude the cursor from the bounding rectangle using exclude_pointer (in DCAFMAC.INC).

The main loop then starts:

For each rectangle in the arclBuffer local buffer

adjust the current rectangle coordinates if necessary. We alter the coords according to format and bits/pel so that we do not have to worry about the masking associated with compressing/decompressing partial bytes. 4bpp formats are rounded to 8 pel boundaries because the destination could be another VGA (planar).
call compress_rect in COMPRESS.ASM/IBMDEV32.DLL, getting as return whether or not the output buffer is full
if the output buffer is full
: break
endif
if no more rectangles in the local buffer and a region was supplied and
        there are more rects in engine
: subtract the already compressed rectangles from the supplied region using
   GreCombineRectRegion (CRGN_DIFF)
: reload local buffer with more rects from engine
endif

endfor
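
The coordinate adjustment in the first step of the loop can be sketched as below. In planar VGA memory each byte of a plane holds 8 pels, so aligning to multiples of 8 pels means that only whole bytes are handled. The sketch assumes that the rectangle is widened outward and that coordinates are non-negative; the function name is hypothetical.

```c
/* Widen the exclusive x range [*xLeft, *xRight) outward so that both
   ends fall on multiples of 8 pels. Assumes non-negative coordinates. */
static void align_x_to_8_pels(long *xLeft, long *xRight)
{
    *xLeft  &= ~7L;                    /* round down */
    *xRight  = (*xRight + 7) & ~7L;    /* round up   */
}
```

A range that is already aligned, such as 8 to 16, is left unchanged, while 3 to 13 becomes 0 to 16.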

The loop above can be left for two reasons:

  • no more rectangles to process: reset the provided region or rectangle using GreSetRectRegion;
  • no more room in the output buffer: subtract the already processed rectangles from the region or the processed area from the provided rectangle.

At the end, we fill in the total number of bytes written in the packet header, re-enable the cursor using unexclude_pointer (in DCAFMAC.INC), exit the driver by calling exit_driver, and return with a code indicating whether full (return code 1) or partial (return code 2) data was returned.

compress_rect (in COMPRESS.ASM/IBMDEV32.DLL)

if free bytes in output buffer < size of (RECTANGLE HEADER + WORST_CASE_ROW_LENGTH)

set a variable saying that there is no room in the output buffer
return

else

write out rect header
for iRow = each row of rect
: // Check for duplicate scanlines
: if iRow > first row
: : if row[iRow] matches row[iRow-1]
: : : count subsequent matching rows
: : : write duplicate scanline code + count
: : endif
: endif
: // Check for duplicate scanline pairs
: if iRow > second row and iRow < last row
: : if row[iRow] matches row[iRow-2] and row[iRow+1] matches row[iRow-1]
: : : count subsequent matching row pairs
: : : write duplicate scanline pair code + count
: : endif
: endif
: // Compress the row
: call appropriate row compression function
: if free bytes in output buffer < WORST_CASE_ROW_LENGTH
: : set a variable saying that there is no room in the output buffer
: : update the passed rectangle to contain the area that was not compressed
: : return
: endif
endfor

return

endif
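
The two duplicate checks in the pseudocode above amount to counting how many following rows (or row pairs) repeat what was just emitted. A C sketch of the counting alone is given here; it assumes that the rows of the rectangle are available as one contiguous block of nRows rows of rowBytes bytes each, and it does not attempt to reproduce the codes written to the output buffer. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count how many rows, starting at iRow, are identical to row iRow-1. */
static size_t count_duplicate_rows(const uint8_t *rows, size_t nRows,
                                   size_t rowBytes, size_t iRow)
{
    size_t n = 0;
    if (iRow == 0)
        return 0;                       /* no previous row to match */
    const uint8_t *prev = rows + (iRow - 1) * rowBytes;
    while (iRow + n < nRows &&
           memcmp(rows + (iRow + n) * rowBytes, prev, rowBytes) == 0)
        n++;
    return n;
}

/* Count how many row pairs, starting at iRow, repeat the pair formed
   by rows iRow-2 and iRow-1. */
static size_t count_duplicate_pairs(const uint8_t *rows, size_t nRows,
                                    size_t rowBytes, size_t iRow)
{
    size_t n = 0;
    if (iRow < 2)
        return 0;                       /* no previous pair to match */
    const uint8_t *pair = rows + (iRow - 2) * rowBytes;
    /* Rows are contiguous, so a pair is 2 * rowBytes consecutive bytes. */
    while (iRow + 2 * n + 1 < nRows &&
           memcmp(rows + (iRow + 2 * n) * rowBytes, pair, 2 * rowBytes) == 0)
        n++;
    return n;
}
```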

There is a separate compression function for each of the two valid src/dst format combinations (in COMPRESS.ASM):

- compress_row_4pl_4    (src 4bpp planar, dst 4bpp packed)
                             calling: PackBuffer;
                                      compression routine
- compress_row_4pl_4pl  (src 4bpp planar, dst 4bpp planar)
                             calling: plane_select_next (to access the VRAM)
                                      compression routine

plane_select_next is a macro used to access the VRAM directly.

PackBuffer (in PACKING.ASM/IBMDEV32.DLL) performs the planar to packed conversion, picking up the planar data directly from VRAM.

The planar to packed conversion basically involves taking the four bits (one for each plane) of a pel and ORing them together so that plane 3 ends up in the leftmost position and plane 0 ends up in the rightmost position. This is done 4 pels at a time using a table (PackTable).
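
The bit arrangement described above can be illustrated with a C sketch that converts 8 pels (one byte from each of the four planes) into 4 packed bytes. Note that this sketch works bit by bit for clarity and does not use a lookup table as PackBuffer does. It assumes that the most significant bit of a plane byte is the leftmost pel and that the leftmost pel of each packed byte goes into the high nibble; the function name is hypothetical.

```c
#include <stdint.h>

/* Convert 8 pels from planar to packed 4bpp.
   plane[p] holds one byte of plane p; packed receives 2 pels per byte. */
static void pack_8_pels(const uint8_t plane[4], uint8_t packed[4])
{
    for (int i = 0; i < 4; i++)
        packed[i] = 0;

    for (int pel = 0; pel < 8; pel++) {
        int bit = 7 - pel;              /* leftmost pel = bit 7 */
        uint8_t value = 0;
        /* Plane 3 is shifted in first, so it ends up leftmost. */
        for (int p = 3; p >= 0; p--)
            value = (uint8_t)((value << 1) | ((plane[p] >> bit) & 1u));
        if (pel % 2 == 0)
            packed[pel / 2] |= (uint8_t)(value << 4);   /* high nibble */
        else
            packed[pel / 2] |= value;                   /* low nibble  */
    }
}
```

With plane 0 and plane 3 both holding 80H and the other planes holding zero, the leftmost pel has colour index 9 and the first packed byte is 90H.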

SetScreenBits (in SCRAREA.ASM/IBMVGA32.DLL and SCRBITS.ASM/IBMDEV32.DLL)

SetScreenBitsStub PROC SYSCALL,

hdc       :HDC,
pBuffer   :PVOID,
cBytes    :ULONG,
hrgn      :ULONG,
hddc      :PVOID,
FunN      :ULONG

This stub (in SCRAREA.ASM/IBMVGA32.DLL) is the entry point whose address is put in the dispatch table. It first checks the fDCAFEnabled flag to see if the IBMDEV32.DLL currently in use supports these new features. If it does not, the PMERR_FUNCTION_NOT_SUPPORTED error is logged and the call returns FALSE. If it does, the actual routine, which is provided by IBMDEV32.DLL and whose address was loaded at init time, is called.
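
The stub pattern can be sketched in C as below. This is a simplified illustration: the argument list is shortened, lastErrorLogged is a hypothetical stand-in for the Driver's error logging, and FakeSetScreenBits stands in for the routine exported by IBMDEV32.DLL. Only the fDCAFEnabled flag name comes from the text.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long ULONG;            /* local stand-in for the OS/2 type */

/* Hypothetical, shortened signature of the real routine. */
typedef ULONG (*PFNSETBITS)(void *pBuffer, ULONG cBytes);

static bool       fDCAFEnabled;         /* set at init time if supported   */
static PFNSETBITS pfnSetScreenBits;     /* address loaded at init time     */
static int        lastErrorLogged;      /* 1 = "function not supported"    */

/* Entry point placed in the dispatch table. */
static ULONG SetScreenBitsStub(void *pBuffer, ULONG cBytes)
{
    if (!fDCAFEnabled || pfnSetScreenBits == NULL) {
        lastErrorLogged = 1;            /* PMERR_FUNCTION_NOT_SUPPORTED    */
        return 0;                       /* FALSE                           */
    }
    return pfnSetScreenBits(pBuffer, cBytes);
}

/* Hypothetical stand-in for the routine provided by IBMDEV32.DLL. */
static ULONG FakeSetScreenBits(void *pBuffer, ULONG cBytes)
{
    (void)pBuffer;
    (void)cBytes;
    return 1;
}
```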

SetScreenBits PROC SYSCALL USES EBX EDI ESI,

hdc       :HDC,
pBuffer   :PVOID,
cBytes    :ULONG,
hrgn      :ULONG,
hddc      :PVOID,
FunN      :ULONG

USER PARAMETERS:

HDC hdc:
 any valid memory DC handle that has selected into it
 a bitmap the same size as the screen from which the source data
 was read.
PBYTE pBuffer:
 a pointer to the memory buffer where the source (compressed)
 data is located.
ULONG cBytes:
 the length of the memory buffer pointed to by pBuffer. This must
 be the same value as returned by the corresponding GetScreenBits
 call.
HRGN hrgn:
 a valid region handle. The area that is updated in the memory
 bitmap by this call is added to the region identified by this
 handle.

RETURN:

1:
  the supplied data is successfully decompressed into the memory
  bitmap;
0:
  an error occurred.

ERRORS:

PMERR_FUNCTION_NOT_SUPPORTED
PMERR_INV_IN_PATH
PMERR_INV_IN_AREA
PMERR_INV_LENGTH_OR_COUNT
PMERR_INV_DC_TYPE
PMERR_BITMAP_NOT_SELECTED
PMERR_INV_RECT
PMERR_INV_IMAGE_DIMENSION

MAIN TASKS:

SetScreenBits takes compressed data (generated by a previous call to GetScreenBits) from a buffer and decompresses it into the currently selected memory bitmap. The call is only valid for a memory DC that has a bitmap selected that is the same size as the screen on the machine where the GetScreenBits was performed.

There is no clipping. If a rectangle exceeds the bitmap dimensions then the function will terminate immediately with an error logged. The bitmap may be left in a partially drawn state as prior rectangles may have been copied into it.

This is a drawing primitive, therefore correlation, boundary accumulation and drawing could take place.

However, we will totally ignore the function bits, and will always just do the drawing and nothing else.

The routine may be passed a region handle, in which case the area defined by the set bits will be added to the region.

The VGA driver may be passed 4bpp planar or packed data.

This entry point first checks if the passed hddc is valid, then it enters the Driver by calling enter_driver (ENTER.ASM/IBMVGA32.DLL); this routine is called whenever the driver is entered, gets the driver semaphore and performs entry checks. If enter_driver fails, SetScreenBits returns FALSE.

Then it performs some initial error tests, checking the FunN parameter and the DC type. If the COM_PATH or the COM_AREA flags are set in FunN, the routine logs PMERR_INV_IN_PATH or PMERR_INV_IN_AREA, and returns 0; moreover, if the DC is not a memory one, PMERR_INV_DC_TYPE is logged and the routine returns 0.

One more check is done on the DC to see if a bitmap is selected in it: the PMERR_BITMAP_NOT_SELECTED error is logged and the routine returns 0 if there is no bitmap selected.

If the passed data length is zero then we can exit immediately, returning 1. The next check is that the data length is at least the size of the header and that it matches the data length given in the packet header; otherwise PMERR_INV_LENGTH_OR_COUNT is logged and the routine returns 0.
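
The three length tests can be sketched in C as follows. The packet header layout is not reproduced here: the header size and the offset of the length field are passed in as parameters, and the length field is assumed, purely for illustration, to be a 32-bit little-endian value. All names are hypothetical.

```c
#include <stdint.h>

typedef enum { LEN_EMPTY_OK, LEN_OK, LEN_INVALID } LENCHECK;

/* Read a 32-bit little-endian value (assumed width of the length field). */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* headerSize and lengthOffset are hypothetical parameters; the caller
   must ensure lengthOffset + 4 <= headerSize. */
static LENCHECK check_packet_length(const uint8_t *buf, uint32_t cBytes,
                                    uint32_t headerSize,
                                    uint32_t lengthOffset)
{
    if (cBytes == 0)
        return LEN_EMPTY_OK;            /* nothing to do: return 1      */
    if (cBytes < headerSize)
        return LEN_INVALID;             /* too short to hold a header   */
    if (read_u32_le(buf + lengthOffset) != cBytes)
        return LEN_INVALID;             /* header and caller disagree   */
    return LEN_OK;
}
```

The header is read only after the buffer is known to be long enough to contain it, which is why the size test comes before the comparison.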

If all these tests pass, DecompressScreenBits (EXPAND.ASM/IBMVGA32.DLL) is called. The first step in this routine is to check that the pairing of local format and received data format is valid: this is done by comparing the external (data received) format against the entries in the RectProcessingTable. This table is defined as an array of valid source and destination format pairings and contains, for the only two valid pairs, the pointers to the decompression/conversion routines.

The main loop then starts:

for next byte in the input buffer < last byte in the input buffer
: check that the next rectangle in the buffer is valid, otherwise log
: PMERR_INV_RECT and return 0.
: call the required decompression/conversion routine:
: if source data is 4bpp packed
: // expand the rectangle, convert to planar and return the pointer to the
: // next rectangle in the input buffer
: : call Packed4bppRectangle
: otherwise // source data is 4bpp planar
: // expand the rectangle and return the pointer to the next rectangle in
: // the input buffer
: : call Planar4bppRectangle
: endif
: add the just decompressed rectangle to the passed region using
   GreCombineRectRegion (CRGN_OR)
endfor

return 1

The Packed4bppRectangle and Planar4bppRectangle routines (in EXPAND.ASM/IBMVGA32.DLL) perform the decompression of the data:

  • Packed4bppRectangle, after decompressing a scanline into an intermediate buffer according to the rules in Section 1.3, calls UnPackBuffer to convert from packed to planar format.

UnPackBuffer is provided by IBMDEV32.DLL and its address must be loaded at init time (see Section 1.1); it puts the planar data into a memory planar arrangement, using the UnPackTable conversion table.

  • Planar4bppRectangle only decompresses the data according to the rules in Section 1.3.
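
The packed to planar conversion is the inverse of the packing step, and can be illustrated with the C sketch below. As an illustration it works bit by bit and does not use a lookup table as UnPackBuffer does. It assumes that the leftmost pel of each packed byte is in the high nibble, that bit p of a pel value belongs to plane p, and that the leftmost pel maps to the most significant bit of each plane byte; the function name is hypothetical.

```c
#include <stdint.h>

/* Convert 8 pels from packed 4bpp (2 pels per byte) to planar.
   plane[p] receives one byte of plane p. */
static void unpack_8_pels(const uint8_t packed[4], uint8_t plane[4])
{
    for (int p = 0; p < 4; p++)
        plane[p] = 0;

    for (int pel = 0; pel < 8; pel++) {
        uint8_t b = packed[pel / 2];
        uint8_t value = (pel % 2 == 0) ? (uint8_t)(b >> 4)
                                       : (uint8_t)(b & 0x0F);
        /* Bit p of the pel value belongs to plane p. */
        for (int p = 0; p < 4; p++)
            if (value & (1u << p))
                plane[p] |= (uint8_t)(0x80u >> pel);
    }
}
```

A first packed byte of 90H (colour index 9 in the leftmost pel) sets bit 7 in plane 0 and in plane 3, and leaves planes 1 and 2 clear.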

At the end, we exit the driver by exit_driver, returning 1.

Appendix A - Default color palette values

Default VGA (4bpp) palette
Index RRGGBB
0: 000000
1: 000080
2: 008000
3: 008080
4: 800000
5: 800080
6: 808000
7: 808080
8: CCCCCC
9: 0000FF
10: 00FF00
11: 00FFFF
12: FF0000
13: FF00FF
14: FFFF00
15: FFFFFF