New XGA Display Driver Entry Points

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation

Introduction

This document describes the additional entry points that give the OS/2 Presentation Manager XGA Display Driver the ability to efficiently track the screen region updated by Presentation Manager drawing functions, and to handle the related data.

The XGA Display Driver modifications can be split into two distinct areas:

1) Efficient accumulation, maintenance and querying of the clipped screen bounds of all screen drawing operations. This is provided by the following new functions:

OpenScreenChangeArea
GetScreenChangeArea
CloseScreenChangeArea.

2) Compression of data from a specified area of the screen into a memory buffer, and (usually after transmission over a network) decompression of that data into an internal memory bitmap. Data conversion must be performed between the different modes (bpp) and the different data formats (planar or packed), in order to allow data interchange between XGA and VGA, between XGAs working in different internal formats, and in general between all the Display Drivers providing these entry points.

This is provided by the following new functions:

GetScreenBits
SetScreenBits.

The majority of the new code is contained within new modules that are simply linked in with the base Driver.

The areas of the core Driver code that must be modified are:

the Dispatch Table, which contains the new Entry Points
all drawing functions, which contain extra tests to determine whether they should accumulate clipped screen bounds and, if necessary, code to calculate the clipped bounding rectangle
Seamless Windows cursor exclusion code, which is modified to call the new bounds accumulation routine.

Dispatch Table

In EDDEFLDB.C, where the dispatch table is filled in with the Display Driver entry points, the five new entry points are added to the Driver Dispatch Table with the following Function Numbers:

GreOpenScreenChangeArea   0x4012
GreGetScreenChangeArea    0x4013
GreCloseScreenChangeArea  0x4014
GreGetScreenBits          0x401D
GreSetScreenBits          0x401E

Screen bounds

"Traditional" Presentation Manager bounds are unclipped, and use a single rectangle to define their limits.

In order to better define the bounding area, the changed XGA Display Driver has the ability to maintain one or more clipped, multirectangle regions (or Screen Change Areas) that are updated to indicate areas on the screen that have been drawn to.

This ability is provided by the following changes to the base Driver:

Three new entry points:
- OpenScreenChangeArea (Section 6.1)
- GetScreenChangeArea (Section 6.2)
- CloseScreenChangeArea (Section 6.3)

to create, query and delete Screen Change Areas.

Two bounds accumulation functions, which add a single rectangle to all of the currently active SCAs (Section 2.3).
An extra flag test in the path of every Display Driver drawing function. The test is for COM_SCR_BOUND in the high-order word of FunN, the last parameter of the GreXX calls. If this flag is set and the drawing operation is going to the screen, then the drawing function passes a clipped bounding rectangle of the drawing primitive to the bounds accumulation functions described above. The code required to do this is very Driver specific (Section 2.3).
Interception of Windows cursor exclusion calls and passing the supplied exclusion rectangle to the new bounds accumulation function (Section 5.0)

Compression/decompression

The bounding functions described above efficiently track the regions on the screen that are updated by Presentation Manager drawing functions.

The XGA Display Driver also provides the ability to compress, decompress and (if necessary) convert the format of screen data.

This ability is provided by two new entry points:

GetScreenBits (Section 6.4)
SetScreenBits (Section 6.5).

Screen Change Areas

SCA definition

A key part of the modifications concerns the maintenance and tracking of SCAs. These Areas track regions of the screen that are altered by Display Driver drawing routines as efficiently as possible. This is done by using multiple rectangles to define the area, rather than just the usual single bounding rectangle provided by "standard" Presentation Manager bounding functions.

SCAs are maintained within the Driver in a SCA structure, defined in the new DCAF.H file:

typedef struct _SCA { /* sca */
struct _SCA *     pscaNext;                       /* Linked list pointer    */
ULONG             cRects;                         /* No. rects in area      */
RECTL             arcl[MAX_SCA_RECTS+1];          /* Rectangle dimensions   */
ULONG             aulRectSize[MAX_SCA_RECTS+1];   /* Cached rectangle sizes */
} SCA;

Instances of this structure are dynamically created/destroyed upon calls to OpenScreenChangeArea/CloseScreenChangeArea.

A global variable, pStartSCA, points to the most recently created SCA instance. If pStartSCA is null then there are no active SCAs.

All SCA instances are linked together in a list using the pscaNext field of the SCA structure. A null value in this field indicates the end of the list (first created SCA). For example:

Memory loc.
              -------------------       pStartSCA = 250;
              | pscaNext = 200  |
              |                 |
              |      4th SCA    |
   0x250      -------------------

              -------------------
              | pscaNext = 150  |
              |                 |
              |      3rd SCA    |
   0x200      -------------------

              -------------------
              | pscaNext = 100  |
              |                 |
              |      2nd SCA    |
   0x150      -------------------

              -------------------
              | pscaNext = 0    |
              |                 |
              |      1st SCA    |
   0x100      -------------------

Each SCA instance can store multiple rectangles, up to MAX_SCA_RECTS (14), which define the area on the screen that has changed since the SCA was created or last queried. These rectangles are stored in the array arcl[].

The number of rectangles stored within the array is kept in the cRects field (which will never exceed MAX_SCA_RECTS). If cRects is zero then the SCA is a null area - the initial state.

The remaining field in the structure, aulRectSize[], is an array containing the sizes of the rectangles in arcl[]. This is not strictly necessary, because the sizes can be calculated on the fly (using the dimensions in arcl[]). However, when accumulating a rectangle into a SCA, the size of each of the rectangles is frequently needed. Caching the rectangle sizes in this array saves us having to recalculate the sizes every time, resulting in better performance.

The SCA structure defines space for (MAX_SCA_RECTS+1) rectangles, but only MAX_SCA_RECTS are ever used to define the SCA. The extra rectangle is used to simplify the routine that accumulates rectangles into the SCA (see Section 2.3).

Creating a new SCA

This task is accomplished by OpenScreenChangeArea (SCRAREA.C).

To create a new SCA:

Allocate memory for the new SCA instance.
Set the cRects field to zero.
Link the instance into the linked list of SCAs by setting its pscaNext field to the current value of pStartSCA.
Set the pStartSCA global variable to point to the new SCA.

The newly created SCA is identified by a 32-bit handle, which is actually the address of the SCA within the Display Driver.
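The creation steps can be sketched in C as follows. This is only a minimal illustration: it uses the standard malloc in place of the Driver's own allocation routine, simplified stand-ins for the ULONG and RECTL types, and a hypothetical function name (SketchOpenSCA). It does not enter the Driver or touch the COM_SCR_BOUND flag.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_SCA_RECTS 14            /* value quoted in the text */

typedef unsigned long ULONG;        /* simplified stand-ins for the OS/2 types */
typedef struct { long xLeft, yBottom, xRight, yTop; } RECTL;

typedef struct _SCA {
    struct _SCA *pscaNext;                          /* Linked list pointer    */
    ULONG        cRects;                            /* No. rects in area      */
    RECTL        arcl[MAX_SCA_RECTS + 1];           /* Rectangle dimensions   */
    ULONG        aulRectSize[MAX_SCA_RECTS + 1];    /* Cached rectangle sizes */
} SCA;

static SCA *pStartSCA = NULL;       /* most recently created SCA, or NULL */

/* Hypothetical helper: returns the new SCA (its address serves as the
 * handle), or NULL if no memory is available. */
SCA *SketchOpenSCA(void)
{
    SCA *psca = malloc(sizeof *psca);
    if (psca == NULL)
        return NULL;

    psca->cRects   = 0;             /* null area: the initial state         */
    psca->pscaNext = pStartSCA;     /* link in front of the existing list   */
    pStartSCA      = psca;          /* the new SCA becomes the list head    */
    return psca;
}
```

Because new instances are always linked at the head, the first SCA created ends up at the tail of the list with a null pscaNext, as in the diagram above.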

Accumulating a rectangle into a SCA

All Display Driver functions that draw to the screen are modified to accumulate clipped bounding rectangles into all active SCAs when necessary. The drawing functions determine whether they should do this by examining the FunN (Function Number/COM_ flags) parameter. If the COM_SCR_BOUND flag is set, and the function is drawing to the screen, then bounding rectangles are accumulated into the active SCAs; otherwise no accumulation takes place. The setting of COM_SCR_BOUND is controlled by the Open/CloseScreenChangeArea functions (see Section 6 for details).

In every drawing call the following test is performed:

    if (DCAFBoundsRequired(FunN))
    {
      accumulate the bound

    }

where the DCAFBoundsRequired macro is defined in DCAFEXTF.H as follows:

#define DCAFBoundsRequired(FunN)                                              \
    ( ((FunN) & COM_SCR_BOUND) &&       /* COM_SCR_BOUND is set          */   \
      ((FunN) & COM_DRAW) &&            /* COM_DRAW is set               */   \
      (pdc->DCIDCType == OD_DIRECT) )   /* the destination is the screen */

When changed area tracking is not needed, COM_SCR_BOUND will never be set, and the only difference in operation/performance of the XGA base Driver will be one additional check of the COM_SCR_BOUND flag per drawing function, which is negligible.

Two routines (in SCBOUNDS.C) perform the bounding accumulation:

VOID AccumulateScreenBounds(PRECTL prclArgBound);
VOID AccumulateScreenBoundsThroughClips( pDevRect pBoundCoords, ULONG ulCoordType );

AccumulateScreenBounds:

This routine is called by the drawing calls that are able to pass a pre-clipped bounding rectangle (actually only SetPel and PolyShortLine). Its task is to take the passed rectangle and accumulate it into all the current SCAs. The passed rectangle is in exclusive SCREEN coords.

AccumulateScreenBoundsThroughClips:

This function takes the supplied (unclipped) bounding rectangle, intersects it with each of the clip rectangles in the DC and accumulates each of the clipped bounds into the active SCAs.

The supplied bounding rectangle can be in:

16-bit AI coordinates (COORD_AI): (origin is top-left of screen)
16-bit Device Coords (COORD_DEVICE_WORD): (origin is current DC origin)
32-bit Screen Coords: (origin is bottom-left of screen)

The ulCoordType parameter specifies which of these coords are being supplied.

This routine can be called from the same point in drawing functions as the ordinary (unclipped) bounds are accumulated. This minimizes the complexity and changes required in the main drawing code.

Before the bounding rectangle is accumulated according to the algorithm below, it must be converted (if required) to exclusive SCREEN coordinates and clipped. The clipping can be done against the DC cache clip rectangles, or may require calling back the Graphics Engine to get the clip set.

When a rectangle is added into a SCA, it is added in such a way as to minimize the increase in area of the SCA.

The following algorithm does this:

for (pscaCurrent = each SCA in the linked list)
:
: // First check whether the new rect is already contained within this SCA
: for (rclCurrent = each rectangle in current SCA)
: : if rclNew lies completely within rclCurrent
: : : no more work - skip straight to next SCA
: : endif
: endfor

: // We have to add the rectangle to the SCA.
: // First see if there is free space for the rectangle within the SCA.
: if pscaCurrent->cRects < MAX_SCA_RECTS
: : copy rect into SCA
: : calculate size and store in SCA
: : increment pscaCurrent->cRects
: else
: : // All of the SCA rects are used.
: : // Copy the new rect into the SCA at position (MAX_SCA_RECTS+1) and the
: : // problem then becomes:
: : // We have MAX_SCA_RECTS+1 rectangles, which we have to reduce
: : // to MAX_SCA_RECTS by merging two of the rectangles into a single
: : // rectangle.
: : // The pair of rects that we merge are the two that will cause the smallest
: : // increase in area.
: : initialise ulSmallestAreaIncrease to be maximum possible value
: : for (iRect1 = each rectangle in the SCA)
: : : for (iRect2 = iRect1+1 to MAX_SCA_RECTS+1)
: : : // This inner loop is the performance bottleneck.
: : : // Make it as fast as possible, if you can!!
: : : : if area increase of (iRect1,iRect2) merged < ulSmallestAreaIncrease
: : : : : set ulSmallestAreaIncrease to be area increase of (iRect1,iRect2) merged
: : : : : set best pair of rects to be (iRect1,iRect2)
: : : : endif
: : : endfor
: : endfor
: :
: : merge best pair of rects found into the slot originally occupied by iRect1
: : if rclNew was not one of those merged
: : : copy rclNew into vacant slot made by merging pair
: : endif

: endif
endfor

When the changed area tracking is active, this function is called by every function that draws to the screen. The routine must therefore be as efficient as possible (particularly in the inner loop) to minimize the hit on performance.
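The pseudocode above can be turned into a compact C sketch. The version below is an illustration only, not the Driver's SCBOUNDS.C code: the types are simplified stand-ins, SketchAccumulate, RectSize and RectUnion are hypothetical names, and "area increase" is taken to mean the area of the merged rectangle minus the areas of the two originals (an assumption, since the text does not define it).

```c
#include <limits.h>
#include <stddef.h>

#define MAX_SCA_RECTS 14

typedef unsigned long ULONG;
typedef struct { long xLeft, yBottom, xRight, yTop; } RECTL;

typedef struct _SCA {
    struct _SCA *pscaNext;
    ULONG        cRects;
    RECTL        arcl[MAX_SCA_RECTS + 1];
    ULONG        aulRectSize[MAX_SCA_RECTS + 1];
} SCA;

/* Area of an exclusive rectangle */
static ULONG RectSize(const RECTL *p)
{
    return (ULONG)(p->xRight - p->xLeft) * (ULONG)(p->yTop - p->yBottom);
}

/* Smallest rectangle enclosing both a and b */
static RECTL RectUnion(const RECTL *a, const RECTL *b)
{
    RECTL u;
    u.xLeft   = a->xLeft   < b->xLeft   ? a->xLeft   : b->xLeft;
    u.yBottom = a->yBottom < b->yBottom ? a->yBottom : b->yBottom;
    u.xRight  = a->xRight  > b->xRight  ? a->xRight  : b->xRight;
    u.yTop    = a->yTop    > b->yTop    ? a->yTop    : b->yTop;
    return u;
}

/* Hypothetical helper: add rclNew to every SCA in the list at pscaFirst */
void SketchAccumulate(SCA *pscaFirst, RECTL rclNew)
{
    SCA *psca;

    for (psca = pscaFirst; psca != NULL; psca = psca->pscaNext) {
        ULONG i, j, best1 = 0, best2 = MAX_SCA_RECTS;
        long  smallest = LONG_MAX;
        int   contained = 0;

        /* Skip this SCA if an existing rectangle already covers rclNew */
        for (i = 0; i < psca->cRects && !contained; i++) {
            const RECTL *r = &psca->arcl[i];
            contained = rclNew.xLeft   >= r->xLeft   && rclNew.xRight <= r->xRight &&
                        rclNew.yBottom >= r->yBottom && rclNew.yTop   <= r->yTop;
        }
        if (contained)
            continue;

        /* Store in the next slot; when the SCA is full this is the spare slot */
        psca->arcl[psca->cRects]        = rclNew;
        psca->aulRectSize[psca->cRects] = RectSize(&rclNew);
        if (psca->cRects < MAX_SCA_RECTS) {
            psca->cRects++;
            continue;
        }

        /* MAX_SCA_RECTS+1 rectangles: find the cheapest pair to merge */
        for (i = 0; i < MAX_SCA_RECTS; i++) {
            for (j = i + 1; j <= MAX_SCA_RECTS; j++) {
                RECTL u   = RectUnion(&psca->arcl[i], &psca->arcl[j]);
                long  inc = (long)RectSize(&u) - (long)psca->aulRectSize[i]
                                               - (long)psca->aulRectSize[j];
                if (inc < smallest) {
                    smallest = inc;
                    best1 = i;
                    best2 = j;
                }
            }
        }

        /* Merge the pair into the lower slot and refresh its cached size */
        psca->arcl[best1]        = RectUnion(&psca->arcl[best1], &psca->arcl[best2]);
        psca->aulRectSize[best1] = RectSize(&psca->arcl[best1]);

        /* If rclNew (spare slot) was not merged, move it into the vacated slot */
        if (best2 != MAX_SCA_RECTS) {
            psca->arcl[best2]        = psca->arcl[MAX_SCA_RECTS];
            psca->aulRectSize[best2] = psca->aulRectSize[MAX_SCA_RECTS];
        }
    }
}
```

The cached sizes in aulRectSize[] keep the inner loop down to one union and one area calculation per pair, which is the reason the text gives for caching them.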

Deleting a SCA

This task is performed by CloseScreenChangeArea (SCRAREA.C). The steps taken are:

Unlink the SCA instance from the linked list of SCAs.
Free the memory for the SCA instance.

Continuing the earlier example, if we close the 2nd SCA:

Memory loc.

             -------------------       pStartSCA = 250;
             | pscaNext = 200  |
             |                 |
             |      4th SCA    |
  0x250      -------------------

             -------------------
             | pscaNext = 150  |
             |                 |
             |      3rd SCA    |
  0x200      -------------------

             -------------------
             | pscaNext = 100  |
             |                 |
             |      2nd SCA    |
  0x150      -------------------

             -------------------
             | pscaNext = 0    |
             |                 |
             |      1st SCA    |
  0x100      -------------------

we will get:

Memory loc.
              -------------------       pStartSCA = 250;
              | pscaNext = 200  |
              |                 |
              |      4th SCA    |
   0x250      -------------------

              -------------------
              | pscaNext = 100  |
              |                 |
              |      3rd SCA    |
   0x200      -------------------

              -------------------
              | pscaNext = 0    |
              |                 |
              |      1st SCA    |
   0x100      -------------------

If the last remaining SCA is being freed, then pStartSCA is set to NULL.

If the most recently created SCA is being freed, then pStartSCA is set to the address of the SCA created immediately before it.
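The unlink step and both special cases can be handled by a single loop over the link fields, as in the following sketch. It assumes the standard free in place of the Driver's memory routine, shows only the pscaNext field of the SCA, and uses a hypothetical function name (SketchCloseSCA).

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct _SCA {
    struct _SCA *pscaNext;          /* other SCA fields omitted in this sketch */
} SCA;

static SCA *pStartSCA = NULL;       /* most recently created SCA, or NULL */

/* Hypothetical helper: returns 1 if the SCA was found, unlinked and freed,
 * or 0 if the handle does not match any SCA in the list. */
int SketchCloseSCA(SCA *pscaTarget)
{
    SCA **ppLink = &pStartSCA;      /* address of the link that may point at the target */

    while (*ppLink != NULL && *ppLink != pscaTarget)
        ppLink = &(*ppLink)->pscaNext;

    if (*ppLink == NULL)
        return 0;                   /* unknown handle */

    *ppLink = pscaTarget->pscaNext; /* bypass the target; also updates pStartSCA
                                       when the target is the list head */
    free(pscaTarget);
    return 1;
}
```

Walking a pointer to the link rather than a pointer to the node means that closing the head of the list and closing the last remaining SCA need no separate code paths.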

Compression/Decompression

Compressed data format

Compression is performed by CompressRect in COMPRESS.ASM.

The compressed data that is passed between Display Drivers uses a private format (i.e. no external application/program has the right to examine, alter, or make any assumptions about the content of this data). This allows the compression method to be improved in later versions of the Driver.

Definitions of the data structures are:

1) PACKET HEADER

dd   total_data_packet_length (including header)
dw   data_format

2) RECTANGLE HEADER

dw   xLeft
dw   yBottom
dw   xRight
dw   yTop
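For readers more used to C than to assembler data directives, the two headers can be pictured as below (dd is a 32-bit field, dw a 16-bit field). This is a sketch for orientation only. The byte order of the header fields is not stated in the text, so the little-endian reads here are an assumption, and the function names are hypothetical. The format is private to the Drivers, so applications should not parse it.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t total_data_packet_length;  /* dd: includes the header itself */
    uint16_t data_format;               /* dw */
} PacketHeader;

typedef struct {
    uint16_t xLeft, yBottom, xRight, yTop;  /* four dw fields */
} RectHeader;

enum { PACKET_HEADER_BYTES = 6, RECT_HEADER_BYTES = 8 };

/* ASSUMPTION: header fields are stored least significant byte first */
static uint16_t Read16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Both helpers return a pointer just past the header they have read. */
const uint8_t *SketchReadPacketHeader(const uint8_t *p, PacketHeader *ph)
{
    ph->total_data_packet_length = (uint32_t)Read16(p) | ((uint32_t)Read16(p + 2) << 16);
    ph->data_format              = Read16(p + 4);
    return p + PACKET_HEADER_BYTES;
}

const uint8_t *SketchReadRectHeader(const uint8_t *p, RectHeader *rh)
{
    rh->xLeft   = Read16(p);
    rh->yBottom = Read16(p + 2);
    rh->xRight  = Read16(p + 4);
    rh->yTop    = Read16(p + 6);
    return p + RECT_HEADER_BYTES;
}
```

Reading the fields byte by byte avoids any dependence on compiler structure padding, which would otherwise make the 6-byte packet header occupy 8 bytes in memory.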

3) RECTANGLE DATA

The rectangle data is split into individual rows.

Each row is split into run-length encoded blocks ("cells"), each of which comprises a length field followed by one or more data fields.

The size of both the length and data fields varies according to the format of the data being transmitted (as specified by the data_format field in the packet header):

4bpp: field size is one byte (8 bits)
8bpp, 16bpp: field size is two bytes (16 bits)

The following encoding rules are used:

If the length field contains a "positive" value (most significant bit not set) then the following single data field is repeated (length) times.

If the data field size is 8 bits, this value will be limited to a maximum of 127.

If the length field contains a "negative" value (most significant bit set) then the number of non-repeating data fields that follow is given by the length value with its most significant bit cleared.

If the data field size is 8 bits, this value will be limited to a maximum of 127.

If the length field is zero and the following field is non-zero, the non-zero field is a count of the number of times that the single previous row is repeated.

If the data field size is 8 bits, this value will be limited to a maximum of 127. This will only appear as the first cell in a row, and only after there has been at least one row of data.

If the length field is zero and the following field is zero, the next (i.e. third) field is a count of the number of times that the previous pair of rows are repeated.

If the data field size is 8 bits, this value will be limited to a maximum of 127. This will only appear as the first cell in a row, and only after there have been at least two rows of data.

Example

The following example shows the hexadecimal values of an 8bpp compressed bitmap:

0003 0004 00FA 0405 0706 0802 0104 0903 0001 .... 0000 0003 0000 0000 0004 ...
 lf   df   lf   df   df   df   df   df   df        lf   df   lf   df   df
  cell1               cell2                         celln      celln+1

This bitmap would expand as follows (two-digit values represent a color index for a single pixel):

 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row1

                                  do three more identical rows (celln):

 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row2
 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row3
 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row4

                   repeat the previous pair of rows four times (celln+1):

 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row5
 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row6

 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row7
 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row8

 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row9
 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row10

 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row11
 00 04 00 04 00 04 04 05 07 06 08 02 01 04 09 03 00 01 ............ row12

The following example shows the hexadecimal values of a 4bpp compressed bitmap:

 03 04  FA 04 05 07 06 08 02 ...........  00 03  00 00 04  ...
 lf df  lf df df df df df df              lf df  lf df df
 cell1        cell2                       celln  celln+1

This bitmap would expand as follows (each one-digit value represents a color index for a single pixel):

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row1

                                      do three more identical rows (celln):

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row2
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row3
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row4

                    repeat the previous pair of rows four times (celln+1):

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row5
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row6

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row7
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row8

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row9
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row10

 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row11
 0 4 0 4 0 4 0 4 0 5 0 7 0 6 0 8 0 2 ............ row12
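The encoding rules and the 4bpp example can be exercised with a small decoder for the one-byte field size. This is a sketch written from the rules as stated in this document, not the Driver's decompression code. SketchDecodeRows8 is a hypothetical name, and the caller is assumed to know the row length in bytes and the number of rows (in practice these would come from the rectangle header).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decodes rows of rowBytes bytes each from one-byte-field cells.
 * Returns the number of rows written to dst, or 0 on malformed input. */
size_t SketchDecodeRows8(const uint8_t *src, size_t srcLen,
                         uint8_t *dst, size_t rowBytes, size_t rows)
{
    size_t in = 0, row = 0;

    while (row < rows && in < srcLen) {
        uint8_t *out = dst + row * rowBytes;
        size_t filled = 0;

        if (src[in] == 0) {                 /* zero length: a row-repeat cell */
            size_t span, count;
            if (in + 1 >= srcLen) return 0;
            if (src[in + 1] != 0) {         /* repeat the single previous row */
                span = 1; count = src[in + 1]; in += 2;
            } else {                        /* repeat the previous pair of rows */
                if (in + 2 >= srcLen) return 0;
                span = 2; count = src[in + 2]; in += 3;
            }
            if (row < span || count * span > rows - row) return 0;
            while (count-- > 0) {
                memcpy(out, out - span * rowBytes, span * rowBytes);
                out += span * rowBytes;
                row += span;
            }
            continue;
        }

        while (filled < rowBytes) {         /* ordinary row: cells until full */
            uint8_t len;
            size_t n;
            if (in >= srcLen) return 0;
            len = src[in++];
            n = len & 0x7F;                 /* count with the m.s. bit cleared */
            if (n == 0 || n > rowBytes - filled) return 0;
            if (len & 0x80) {               /* "negative": n literal fields follow */
                if (n > srcLen - in) return 0;
                memcpy(out + filled, src + in, n);
                in += n;
            } else {                        /* "positive": one field repeated n times */
                if (in >= srcLen) return 0;
                memset(out + filled, src[in++], n);
            }
            filled += n;
        }
        row++;
    }
    return row;
}
```

For instance, with a row length of 9 bytes, the input 03 04 86 04 05 07 06 08 02 00 03 00 00 04 decodes to twelve identical rows of 04 04 04 04 05 07 06 08 02. This is a shortened variant of the 4bpp example above, with the literal run cut down to six fields (86 instead of FA) so that the row is complete.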

Standard Golomb Run Length Encoding compression is inefficient at compressing 2x2 dither patterns, which are commonly used by Presentation Manager Display Drivers. The modified compression algorithm handles these patterns efficiently because:

For 4bpp and 8bpp, the data field size is such that when compressing a row, pairs of adjacent pels are put in each data field. Note that there is no dithering at 16bpp.
When searching for duplicate scanlines, the algorithm also searches for duplicate scanline pairs, which will match and compress patterns that repeat on alternate scanlines.

The actual pixel data is stored in Motorola format i.e. the most significant bits of a byte contain the leftmost pels. So, if we have a pair of pixels (PEL1,PEL2):

for the 4bpp format:

- PEL1 goes in bits 7..4
- PEL2 goes in bits 3..0

for the 8bpp format:

- PEL1 goes in bits 15..8
- PEL2 goes in bits  7..0
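The bit positions listed above amount to the following two packing helpers. The function names are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* 4bpp: PEL1 in bits 7..4, PEL2 in bits 3..0 of a one-byte field */
uint8_t SketchPack4bpp(uint8_t pel1, uint8_t pel2)
{
    return (uint8_t)(((pel1 & 0x0F) << 4) | (pel2 & 0x0F));
}

/* 8bpp: PEL1 in bits 15..8, PEL2 in bits 7..0 of a two-byte field */
uint16_t SketchPack8bpp(uint8_t pel1, uint8_t pel2)
{
    return (uint16_t)((pel1 << 8) | pel2);
}
```

With this layout the 4bpp data field 04 holds the pels (0, 4) and the 8bpp data field 0004 holds the pels (00, 04), which matches the expansions shown in the examples above.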

All 4bpp data is defined as indices into the standard VGA palette. All 8bpp data is defined as indices into the standard XGA palette. All 16bpp data is defined as XGA format 16-bit color values. (See Appendix A for details of these formats).

All changed Drivers must convert their own internal format into one of these "standard" formats before transmission (see Section 4.0).

Data conversion

The changed Display Drivers use differing internal data formats:

VGA: 4bpp planar
XGA: 4bpp packed, 8bpp packed, 16bpp packed

When data is transmitted between Display Drivers, it is sent at the lowest bpp of the two Drivers communicating (or lower, e.g. a pair of 16bpp XGA Drivers could communicate at 4bpp to reduce the amount of data transmitted).

Therefore the conversion routines required by the Display Driver are:

XGA (4bpp, 8bpp, 16bpp packed)

internal format   required format

16bpp packed  ->  8bpp packed (compression)
16bpp packed  ->  4bpp packed (compression)
16bpp packed  ->  4bpp planar (compression)
 8bpp packed  ->  8bpp packed (compression)
 8bpp packed  ->  4bpp packed (compression)
 8bpp packed  ->  4bpp planar (compression)
 4bpp packed  ->  4bpp planar (compression)

         external data format  internal format

(decompression)  8bpp packed -> 16bpp packed
(decompression)  8bpp packed ->  8bpp packed
(decompression)  4bpp packed -> 16bpp packed
(decompression)  4bpp packed ->  8bpp packed
(decompression)  4bpp planar -> 16bpp packed
(decompression)  4bpp planar ->  8bpp packed
(decompression)  4bpp planar ->  4bpp packed

The conversions from packed to planar and vice versa are assisted by using a lookup table to "split" packed bytes into bits that can conveniently be reassembled into planar format (and vice versa).

All conversions to planar format are done by first converting the bits per pel down to 4 (still in packed format) and then performing an optimized 4bpp packed to planar conversion. With conversions from 4bpp planar, a similar (reverse) process is performed - converting from 4bpp planar to 4bpp packed and then to the required packed destination format.
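As an illustration of the final 4bpp packed to planar step, the sketch below converts eight pels at a time using a plain bit loop rather than the lookup table the text describes. The plane layout is an assumption (plane p holds bit p of each color index, with the leftmost pel in bit 7 of each plane byte), and the function name is hypothetical.

```c
#include <stdint.h>

/* packed[4] holds 8 pels, leftmost pel in the high nibble of packed[0].
 * planes[p] receives bit p of each pel, leftmost pel in bit 7.
 * ASSUMPTION: this plane layout is illustrative, not taken from the text. */
void SketchPackedToPlanar8(const uint8_t packed[4], uint8_t planes[4])
{
    int pel, p;

    for (p = 0; p < 4; p++)
        planes[p] = 0;

    for (pel = 0; pel < 8; pel++) {
        uint8_t byte  = packed[pel / 2];
        uint8_t index = (pel % 2 == 0) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);

        for (p = 0; p < 4; p++)
            if (index & (1u << p))
                planes[p] |= (uint8_t)(0x80u >> pel);
    }
}
```

A table-driven version would replace the inner loop with one lookup per packed byte, giving the contribution of that byte to each plane. That is the kind of "split" table the text refers to.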

Conversions from 4bpp and 8bpp use a lookup table to efficiently translate the colors. Conversions from 16bpp cannot use a direct lookup table, because its size is prohibitive. Therefore colors have to be searched for on a "nearest color" basis in the destination color table. This is much slower than a simple table lookup. To improve performance, a cache of the most recently calculated colors is kept, which saves having to repeatedly recalculate commonly used colors.
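The "nearest color" search with a cache of recent results might look like the sketch below. Several things here are assumptions made for illustration: the 16-bit value is taken to be 5-6-5 red/green/blue, distance is measured as the sum of squared component differences, the cache is a small direct-mapped table valid for a single destination palette, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } RGB;

#define CACHE_SLOTS 64

/* Direct-mapped cache of recent results (valid for one palette only) */
static struct { uint16_t color; uint8_t index; uint8_t valid; } cache[CACHE_SLOTS];

/* Hypothetical helper: nearest palette index (count <= 256) for a 16-bit
 * color. ASSUMPTION: 5 bits red, 6 bits green, 5 bits blue. */
uint8_t SketchNearestIndex(uint16_t color, const RGB *palette, size_t count)
{
    size_t slot = color % CACHE_SLOTS, i, best = 0;
    long   bestDist = -1;
    int    r5, g6, b5, r, g, b;

    if (cache[slot].valid && cache[slot].color == color)
        return cache[slot].index;           /* cache hit: no search needed */

    /* Expand each component to 8 bits by replicating its top bits */
    r5 = (color >> 11) & 0x1F;  r = (r5 << 3) | (r5 >> 2);
    g6 = (color >> 5)  & 0x3F;  g = (g6 << 2) | (g6 >> 4);
    b5 =  color        & 0x1F;  b = (b5 << 3) | (b5 >> 2);

    for (i = 0; i < count; i++) {           /* linear nearest-color search */
        long dr = r - palette[i].r, dg = g - palette[i].g, db = b - palette[i].b;
        long dist = dr * dr + dg * dg + db * db;
        if (bestDist < 0 || dist < bestDist) {
            bestDist = dist;
            best = i;
        }
    }

    cache[slot].color = color;
    cache[slot].index = (uint8_t)best;
    cache[slot].valid = 1;
    return (uint8_t)best;
}
```

A full 16bpp lookup table would need 65536 entries per destination format, which is the size problem the text mentions. The cache trades that memory for an occasional full search.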

Seamless Windows support

In OS/2 2.1, Seamless Windows is supported by allowing the Windows Display Driver to draw directly on the Presentation Manager screen. This means that Seamless Window updates do not go through the Presentation Manager drawing functions, and therefore will not update the active SCA in the usual way. Seamless Windows therefore requires special treatment.

Prior to drawing on the PM screen, the Seamless Windows Driver calls the PM Driver through an exported Entry Point (SeamlessExcludeCursor in SEAMTHNK.ASM) to exclude the PM cursor from the area that it is about to draw in.

The modified XGA Display Driver intercepts this call, and passes the rectangle coordinates to the bounds accumulation routine.

During PM Display Driver initialization, Seamless Windows must be granted addressability to all data and code that it will access during the call to the SeamlessExcludeCursor routine. The supplied rectangle must eventually be added to all active SCAs, but these could reside anywhere in the Display Driver heap. The Driver therefore keeps a single, static SCA, called scaSeamless.

All Seamless bounding rectangles will be accumulated into this SCA, and then merged with the contents of the normal SCAs when one is queried (using GetScreenChangeArea).

So, at init time, InitializeSeamless (in SEAMLESS.C) gives the Seamless Driver access to the AccumulateScreenBounds routine, to the scaSeamless structure and to the DDT Display Driver control block, which is needed to retrieve the screen dimensions when a Seamless call is made. The addresses of these items are stored in the SeamlessAddresses control block, owned by the Windows Driver.

Before writing to the screen, the Seamless Driver will call SeamlessExcludeCursor, in SEAMTHNK.ASM. The exclusion rectangle will be passed in the following registers in AI coordinates (i.e. 16 bit inclusive, 0,0 is top left of screen):

left    top     right   bottom
cx      dx      si      di

The code checks whether bounds accumulation is needed by testing the value of the pStartSCA pointer. If it is, the SeamlessExcludeCursor32 routine (in SEAMTHNK.ASM) calls _AccumulateSeamlessBounds (in SCBOUNDS.C) to accumulate the bounds.

AccumulateSeamlessBounds converts the passed rectangle to EXCLUSIVE SCREEN coordinates, clips it to the screen dimensions (Windows sometimes exceeds them) and calls AccumulateScreenBounds, so that the rectangle supplied to SeamlessExcludeCursor is added only to scaSeamless.
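The conversion and clipping step can be sketched as follows, assuming that AI coordinates are inclusive with the origin at the top left, and that exclusive SCREEN coordinates have the origin at the bottom left with the right and top edges excluded. The function name, the parameter names and the simplified RECTL are illustrative only.

```c
typedef struct { long xLeft, yBottom, xRight, yTop; } RECTL;

/* Hypothetical helper: converts an inclusive AI rectangle to an exclusive
 * SCREEN rectangle clipped to a cxScreen by cyScreen display.
 * Returns 1 if any part is on screen, 0 if nothing remains after clipping. */
int SketchAIToScreen(short left, short top, short right, short bottom,
                     long cxScreen, long cyScreen, RECTL *prcl)
{
    /* Flip the y axis and make the right/top edges exclusive */
    prcl->xLeft   = left;
    prcl->xRight  = (long)right + 1;
    prcl->yBottom = cyScreen - 1 - bottom;
    prcl->yTop    = cyScreen - top;

    /* Clip to the screen, since the caller may exceed it */
    if (prcl->xLeft   < 0)        prcl->xLeft   = 0;
    if (prcl->yBottom < 0)        prcl->yBottom = 0;
    if (prcl->xRight  > cxScreen) prcl->xRight  = cxScreen;
    if (prcl->yTop    > cyScreen) prcl->yTop    = cyScreen;

    return prcl->xLeft < prcl->xRight && prcl->yBottom < prcl->yTop;
}
```

On a 1024 by 768 screen, for example, the AI rectangle (10,20)-(19,29) becomes the exclusive SCREEN rectangle xLeft 10, yBottom 738, xRight 20, yTop 748.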

When a SCA is queried (using GetScreenChangeArea) the Seamless SCA is merged with all of the active SCAs, and then reset to be null.

New Entry Points

OpenScreenChangeArea (in SCRAREA.C)

DDIENTRY OpenScreenChangeArea( HDC       hdc,
                               PDC       pdcArg,
                               ULONG     FunN )

USER PARAMETERS:

             HDC hdc:
                  any DC handle

RETURN:
             HSCA hsca:
                        handle of the new SCA
             GPI_ERROR:
                        There was no memory available, so return an error.
                        The memory allocation failure will have been
                        logged by the memory allocation routine

ERRORS:
             PMERR_INV_HDC
             PMERR_MEMORY_ALLOCATION_ERR

MAIN TASKS:
             This routine will allocate a data area internal to the display
             driver in which the driver will accumulate screen changes. It
             returns a 32 bit handle which is required to identify the area in
             GetScreenChangeArea and CloseScreenChangeArea calls

This entry point first enters the Driver by calling EnterDriver (EDDGPROC.C) with the EDF_STANDARD | EDF_DONT_CLEAN flags set. This routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks.

It then attempts to create a new SCA instance (see Section 2.2), allocating a memory area internal to the display driver. This is done by calling AllocateMemory, a common routine in MEMMAN.C.

If the creation succeeds, and this is the first SCA to be opened, then we must start accumulating screen bounds by telling the Graphics Engine to turn on the COM_SCR_BOUND flag.

GreSetProcessControl is used to do so. Setting this bit causes subsequent screen drawing functions to accumulate the clipped bounds into the active SCAs.

Then the newly created SCA is initialised to null, and added to the SCA linked list. A pointer to the SCA is returned as the 32-bit handle to the SCA. This handle is required to identify the area in GetScreenChangeArea and CloseScreenChangeArea calls. If there is not enough memory to create the SCA then the PMERR_MEMORY_ALLOCATION_ERR error is logged and GPI_ERROR is returned.

Before returning, ExitDriver is called.

GetScreenChangeArea (in SCRAREA.C)

DDIENTRY GetScreenChangeArea( HDC        hdc,
                              HSCA       hsca,
                              PHRGN      phrgn,
                              PDC        pdcArg,
                              ULONG      FunN )

USER PARAMETERS:

             HDC hdc:
                 any valid DC

             HSCA hsca:
                 handle of the SCA to be queried; to be valid, it
                 must be obtained from a previous call to OpenScreenChangeArea

             PHRGN phrgn:
                 pointer to a region handle

ERRORS:
             PMERR_INV_HDC
             PMERR_NO_HANDLE

RETURN:
             TRUE:
                 if passed a handle we recognize and all was OK;
             FALSE:
                 in all other cases.

MAIN TASKS:
             This routine takes a Screen Change Area handle, and for the SCA
             identified adds its rectangles to the region pointed to by the
             phrgn parameter.
             The SCA is reset to NULL as a result of this call.

This entry point first enters the Driver by calling EnterDriver (EDDGPROC.C) with the EDF_STANDARD | EDF_DONT_CLEAN flags set. This routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks.

Then, if there are any rectangles in the Seamless SCA (scaSeamless), they are merged into all of the currently active SCAs. This is done by calling AccumulateScreenBounds for every rectangle in scaSeamless, after which the cRects field of scaSeamless is set to zero.

The next step is to walk the linked list until either the SCA to be queried is found or the end of the list is reached. If the SCA is found, GreCombineRectRegion is called for each rectangle in the SCA, adding it (CRGN_OR) to the region passed by the user. If every call succeeds, the cRects field of the SCA is set to 0, which resets the SCA; otherwise FALSE is returned.

If the end of the list was reached without finding the needed SCA, a PMERR_NO_HANDLE error is logged by LogError, and a FALSE return code is returned.

If everything went OK, the routine calls ExitDriver and returns TRUE.

CloseScreenChangeArea (in SCRAREA.C)

DDIENTRY CloseScreenChangeArea( HDC      hdc,
                                HSCA     hsca,
                                PDC      pdcArg,
                                ULONG    FunN )

USER PARAMETERS:

             HDC hdc:
                  any valid DC

             HSCA hsca:
                  handle of the SCA to be closed; to be valid, it
                  must be obtained from a previous call to
                  OpenScreenChangeArea

ERRORS:
             PMERR_INV_HDC
             PMERR_NO_HANDLE

RETURN:
             Error conditions are not obvious here, but we will return TRUE
             if passed a handle we recognize, and FALSE in all other cases.

MAIN TASKS:
             This routine frees the data area internal to the display driver,
             identified by the SCA handle, which was accumulating screen
             changes. It returns a Boolean value indicating success or failure.

This entry point first enters the Driver by calling EnterDriver (EDDGPROC.C) with the EDF_STANDARD | EDF_DONT_CLEAN flags set. This routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks.

Then the entry point checks that the hsca parameter matches one of the existing SCAs and, if so, unlinks the SCA from the linked list and frees its memory using FreeMemory (in MEMMAN.C) (see Section 2.4). If the SCA being closed is the last remaining one (i.e. there are no more in the linked list), the Graphics Engine function GreSetProcessControl is called to turn off the COM_SCR_BOUND bit on subsequent calls to the Display Driver. Once this bit is reset, no further screen bounds accumulation occurs until another call is made to OpenScreenChangeArea.

If the hsca parameter does not match any of the existing SCAs, a rc equal to FALSE is returned and an error (PMERR_NO_HANDLE) is logged by LogError.

Before returning, ExitDriver is called.

GetScreenBits (in GETSCR.C)

DDIENTRY GetScreenBits( HDC     hdc,
                        HRGN    hrgnApp,
                        PBYTE   pDest,
                        PULONG  pulLength,
                        ULONG   flCmd,
                        PDC     pdcArg,
                        ULONG   FunN )

USER PARAMETERS:

             HDC hdc:
               any valid direct (Screen) DC handle

             HRGN hrgnApp:
               the area of the screen to be fetched. It can be either
               a valid region handle or a pointer to a (inclusive) RECTL,
               specified by a flag in the flCmd parameter.

             PBYTE pDest:
               a pointer to the memory buffer where the compressed data
               will be written.

             PULONG pulLength:
               a pointer to a ULONG, which must contain the length of the
               memory buffer pointed to by pDest. Valid values range from
               2071 bytes to 64K bytes.
               Upon exit, the ULONG contains the number of bytes stored in
               the memory buffer.

             ULONG flCmd:
               option flags. These specify the format (bits per pel, linear
               or planar) of the data to be put in the memory buffer, and
               whether the hrgn parameter contains a region handle or a
               pointer to a RECTL.
               Options available are:

                 GSB_OPT_4BPP           0000H
                 GSB_OPT_8BPP           0001H
                 GSB_OPT_16BPP          0002H
                 GSB_OPT_LINEAR         0000H
                 GSB_OPT_PLANAR         0008H
                 GSB_OPT_HRGN           0010H



              The options are ORed together to make the flCmd parameter.
              A couple of generally applicable rules are:

              - data can only be queried at bits per pel less than or equal
                to the bits per pel of the screen.

              - 4 bits per pel is the only valid planar format.
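
The two rules above, together with the option values listed for flCmd, could be checked along these lines. This is a sketch, not the driver's code: the flcmd_is_valid and requested_bpp names are hypothetical, and it assumes that the bits-per-pel selector occupies the low two bits of flCmd.

```c
#include <assert.h>
#include <stdbool.h>

/* Option values as listed above. */
#define GSB_OPT_4BPP    0x0000UL
#define GSB_OPT_8BPP    0x0001UL
#define GSB_OPT_16BPP   0x0002UL
#define GSB_OPT_LINEAR  0x0000UL
#define GSB_OPT_PLANAR  0x0008UL
#define GSB_OPT_HRGN    0x0010UL

/* Assumption: the bpp selector occupies the low two bits of flCmd. */
#define BPP_MASK    0x0003UL
#define KNOWN_BITS  (BPP_MASK | GSB_OPT_PLANAR | GSB_OPT_HRGN)

/* Returns the requested bits per pel, or 0 if the selector is unknown. */
static unsigned requested_bpp(unsigned long flCmd)
{
    switch (flCmd & BPP_MASK) {
    case GSB_OPT_4BPP:  return 4;
    case GSB_OPT_8BPP:  return 8;
    case GSB_OPT_16BPP: return 16;
    default:            return 0;
    }
}

/* Returns true if flCmd is acceptable for a screen of screen_bpp bits per pel. */
bool flcmd_is_valid(unsigned long flCmd, unsigned screen_bpp)
{
    unsigned bpp = requested_bpp(flCmd);

    assert(screen_bpp == 4 || screen_bpp == 8 || screen_bpp == 16);

    if ((flCmd & ~KNOWN_BITS) != 0 || bpp == 0)
        return false;               /* unknown flag or bpp selector         */
    if (bpp > screen_bpp)
        return false;               /* cannot query above the screen's bpp  */
    if ((flCmd & GSB_OPT_PLANAR) != 0 && bpp != 4)
        return false;               /* planar is only valid at 4bpp         */
    return true;
}
```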

RETURN:
              1:
                the entire area was successfully compressed in buffer. Upon
                exit, the supplied region/rectangle will be NULL;
              2:
                a subset of the area was saved in buffer, because the buffer
                was not big enough for the whole area. Upon exit, the supplied
                region/rectangle will be updated to contain the area that was
                NOT compressed. This implies that GetScreenBits will have to be
                called again to complete the area compression.
              0:
                an error occurred

ERRORS:
              PMERR_INV_IN_PATH
              PMERR_INV_IN_AREA
              PMERR_INV_LENGTH_OR_COUNT
              PMERR_INV_FORMAT_CONTROL
              PMERR_INV_IMAGE_DIMENSION
              PMERR_INV_DC_TYPE
              PMERR_PEL_NOT_AVAILABLE

MAIN TASKS:
             This routine queries a region of screen pixel data and saves it
             into the memory provided by the caller. It is compressed, and can
             be converted into a format suitable for another supported display
             device and will stop either when:

                - the supplied memory area is full
                - the requested region has been returned.

             The region can be specified as either:

                - a pointer to a single rectangle (RECTL - long values)
                - a region handle

             setting the GSB_OPT_HRGN flag in the flCmd parameter accordingly
             (if it is set, take a region; if it is not set, take a rect).
             If a RECTL is specified then it is assumed to be inclusive.

             The function modifies the supplied rectangle/region to indicate
             the area that was NOT returned in the call. If the whole requested
             region was returned the rectangle/region will be a null area.

             The supplied DC must be direct - it is the source of the pixel
             data.
             This is not a drawing primitive, therefore no correlation,
             bounds accumulation or drawing will take place.

The routine first enters the driver by calling EnterDriver with the EDF_STANDARD | EDF_DONT_CLEAN flags set (this routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks).

Then it performs some initial error tests, checking the FunN parameter and the DC type. If the COM_PATH or the COM_AREA flag is set in FunN, the routine logs PMERR_INV_IN_PATH or PMERR_INV_IN_AREA respectively by calling LogError, and returns 0. If the DC is not a direct one, PMERR_INV_DC_TYPE is logged and the routine returns 0.

The next step checks the parameters for validity. The buffer size must lie between MIN_BUFFER_SIZE (2071 bytes, the length of a buffer containing only one compressed row, at the maximum resolution and when the input row contains only non-repeating data) and 65536 bytes (a limit due to the PHUNK memory); otherwise PMERR_INV_LENGTH_OR_COUNT is logged and the routine returns 0. The flCmd parameter must be set with one or more of the flags we know; otherwise PMERR_INV_FORMAT_CONTROL is logged and the routine returns 0.

Moreover, we must check that PM is in the foreground (fXGAdead TRUE); otherwise PMERR_PEL_NOT_AVAILABLE is logged and the routine returns 0. If all of these checks pass, the CompressScreenBits routine (in GETSCR.C) is called. CompressScreenBits first checks that the format is valid; this is done by creating a word containing the internal (screen) format and the external (requested) format.

The aGSBValidDataFormats table is then scanned for a match. This table is defined as an array of valid source and destination format pairings, containing the following information for each valid pair:

typedef struct _VALID_DATA_FORMATS
{
    ULONG   ulSrcDstFormat;             // Combined src + dst codes
    PFN     pfnRowConversionRoutine;    // address of the conversion routine
                                        // for this pair
    PBYTE  *ppConvertTable;             // address to the conversion table, if
                                        // required
    PFN     pfnCreateConvertTable;      // address to a routine aimed to create
                                        // a conversion table
} VALID_DATA_FORMATS;

If matched, the format is valid; otherwise it is invalid and the PMERR_INV_FORMAT_CONTROL gets logged and the call returns 0. Then, from the aGSBValidDataFormats structure, the pointer to the compression function for the required format combination is retrieved, and copied in the Spad global structure.
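
The table scan can be pictured with the following sketch. The way the source and destination codes are combined (MAKE_FORMAT), the FMT_ codes, the reduced FORMAT_ENTRY structure, and the find_format name are all assumptions made for illustration; the document does not give the real encoding.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ROWFN)(void);    /* placeholder for the driver's PFN type */

/* Reduced form of VALID_DATA_FORMATS with only the fields used here. */
typedef struct {
    unsigned long ulSrcDstFormat;           /* combined src + dst codes */
    ROWFN         pfnRowConversionRoutine;  /* row routine for the pair */
} FORMAT_ENTRY;

/* Hypothetical encoding: source code in the high byte, destination in the low byte. */
#define MAKE_FORMAT(src, dst) (((unsigned long)(src) << 8) | (unsigned long)(dst))

enum { FMT_4PK = 1, FMT_4PL, FMT_8PK, FMT_16PK };   /* illustrative codes */

static void row_8_8(void) {}    /* stand-ins for the real row routines */
static void row_8_4(void) {}

static const FORMAT_ENTRY valid_formats[] = {
    { MAKE_FORMAT(FMT_8PK, FMT_8PK), row_8_8 },
    { MAKE_FORMAT(FMT_8PK, FMT_4PK), row_8_4 },
};

/* Returns the matching entry, or NULL if the pairing is not valid. */
const FORMAT_ENTRY *find_format(unsigned src, unsigned dst)
{
    unsigned long key = MAKE_FORMAT(src, dst);
    size_t i;

    assert(src <= 0xFF && dst <= 0xFF);

    for (i = 0; i < sizeof valid_formats / sizeof valid_formats[0]; i++)
        if (valid_formats[i].ulSrcDstFormat == key)
            return &valid_formats[i];
    return NULL;    /* caller would log PMERR_INV_FORMAT_CONTROL and return 0 */
}
```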

The following step is to initialize the conversion cache by calling the InitializeCache routine in DCAFCNV.ASM (this is currently only required if source is 16bpp and destination is 8bpp or 4bpp).

Then we check whether we need a conversion table for this combination of formats: one is required for all the pairs except those with 16bpp as the source. These latter pairs have a null pointer in the last two fields of their aGSBValidDataFormats entries.

If the internal format is 16bpp, then we don't need a conversion table, whatever the target format may be.

If the requested format is 8bpp:

if the hardware palette is the default, then no conversion is required. This is checked by comparing two global variables, hForeGroundPal and DirectDeviceDefaultPalette. If there is still an old conversion table, it will be freed by FreeConvertTable (in DCAFCNVT.C).
else a palette has been realized and we need to convert the internal indices into "standard" external format, i.e. indices into the default 256 color XGA (fudged) palette.

If a conversion table has been previously created then we need to ensure that the mapping refers to the current palette (the internal palette may have changed since we calculated the conversion table). We can do this because there is a handy global variable, Seamlessdata.ulLastPalUpdate, that is incremented when the hardware palette is changed. We take a copy of this when the conversion table is created, and on subsequent calls check the value has not changed before using the table. If the value has changed then we throw away the old table and create a new one.

If the requested format is 4bpp:

we will always need a conversion table, but must ensure that it correctly maps the current hardware palette. We use the same scheme as for 8bpp in this case.

If the conversion table creation is needed (according to the scheme above), the proper creation function (stored in the pfnCreateConvertTable field of the specific pair) is called and the pointer to the just created table is stored in the ppConvertTable field.
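
The palette-stamp scheme described above amounts to a small cache keyed on an update counter. The sketch below shows the idea; the CONVERT_CACHE structure, get_convert_table, and the identity build_table are hypothetical, and the local ulLastPalUpdate variable only stands in for Seamlessdata.ulLastPalUpdate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for Seamlessdata.ulLastPalUpdate (bumped on each palette change). */
static unsigned long ulLastPalUpdate = 0;

typedef struct {
    unsigned char *table;       /* NULL when no table exists               */
    unsigned long  pal_stamp;   /* counter value when the table was built  */
} CONVERT_CACHE;

static int builds = 0;          /* counts rebuilds, for demonstration only */

/* Placeholder builder: an identity mapping instead of a nearest-colour search. */
static unsigned char *build_table(void)
{
    unsigned char *t = malloc(256);
    int i;

    if (t != NULL) {
        for (i = 0; i < 256; i++)
            t[i] = (unsigned char)i;
        builds++;
    }
    return t;
}

/* Returns a table that is valid for the current palette, rebuilding if stale. */
unsigned char *get_convert_table(CONVERT_CACHE *c)
{
    assert(c != NULL);

    if (c->table != NULL && c->pal_stamp != ulLastPalUpdate) {
        free(c->table);         /* palette changed since the table was built */
        c->table = NULL;
    }
    if (c->table == NULL) {
        c->table = build_table();
        c->pal_stamp = ulLastPalUpdate;
    }
    return c->table;
}
```

Comparing a counter is much cheaper than comparing palette contents, which is why the stamp is copied when the table is created.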

The conversion table creation routines for the valid format pairs are all defined in DCAFCNVT.C:

CreateConvertTable_8int_4ext: creates a 256-entry table of bytes to convert 8bpp to 4bpp. To do this, after the memory allocation, NearestRestrictedColourIndex (in CONVINT.C) is called to map the RGB values in the HWPalette table to the nearest entry in the VGA standard default palette (see Appendix A).

CreateConvertTable_8int_8ext: creates a 256-entry table of bytes to convert 8bpp internal indices to 8bpp external indices (XGA default palette).

To do this, after the memory allocation, the NearestRestrictedColourIndex (in CONVINT.C) is called, in order to map the RGB values in the HWPalette table to the nearest entry in the FullSizeDeviceDefaultPalette palette (the 256-entry XGA default palette defined in EDDDATA.C).

CreateConvertTable_4int_4ext: creates a 256-entry table of bytes to convert 4bpp internal (fudged) format to 4bpp external (standard VGA) format.

We convert pairs of pels at a time, which is why the conversion table has 256 entries rather than the expected 16. To do this, after the memory allocation, NearestRestrictedColourIndex (in CONVINT.C) is called to map the RGB values in the Reduced16DeviceDefaultPalette table (the 16-entry XGA default palette defined in EDDDATA.C) to the nearest entry in the VGA standard default palette (see Appendix A). The remaining values in the external table are obtained by combining pairs of conversion values, in the following way:

for (i = 0; i < 256; i++)
{
   pConvertTable[i] = (abLocalConvertTable[i >> 4] << 4) |
                      (abLocalConvertTable[i & 0x0F]) ;
}
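
For reference, the loop above can be wrapped into a small self-contained routine as sketched here. The build_pair_table name and its parameters are hypothetical; the loop body is the one shown above.

```c
#include <assert.h>

/* Expand a 16-entry single-pel table into a 256-entry table indexed by a
   byte that holds two packed 4bpp pels (high nibble and low nibble). */
void build_pair_table(const unsigned char single[16], unsigned char pair[256])
{
    int i;

    for (i = 0; i < 16; i++)
        assert(single[i] < 16);     /* each entry must be a 4-bit index */

    for (i = 0; i < 256; i++)
        pair[i] = (unsigned char)((single[i >> 4] << 4) | single[i & 0x0F]);
}
```

With this table a whole byte (two pels) is converted with a single lookup, instead of two lookups plus shifting and masking per byte.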

The next step is to calculate the bounding rectangle of the supplied area to be queried, using GreGetRegionBox.

Then, if the query area is a region, we ask the Graphics Engine for the first 10 rectangles from the query area, putting their address in the Spad.prclCurrent pointer (our local buffer). If the query area is a rectangle, we make it exclusive and put its address in Spad.prclCurrent.

A check is done to see if the bounding box (for the region) or the rectangle is a valid ordered rectangle; otherwise, PMERR_INV_IMAGE_DIMENSION is logged and the call returns 0. The call also returns 0 if the rectangle is empty. If the bounding box or the rectangle exceeds the screen dimensions, PMERR_INV_IMAGE_DIMENSION is logged and the call returns 0. Now we move the destination pointer past the packet header.

Then we exclude the cursor from the bounding rectangle using eddm_excludeCursor in EDDMCCRS.C, after rounding the bounding rectangle up to 8 pel boundaries, because the rectangles may be rounded up to 2 or 8 pel boundaries below (depending upon bpp).

Then we set up the drawing mode to always use the real XGA hardware, calling the SetDrawModeHard macro (in EDDMACRO.H).

After setting some ShadowXGARegs fields, we call WaitForRealHWFunction in HWACCESS.ASM to ensure that the hardware is ready, then we transfer some of the registers that we have just set up calling TransferShadowRegisters(TSR_MAP_A | TSR_COLOUR_MIX) in HWACCESS.ASM: at the moment, we can only transfer MAP_A and COLOR_MIX registers. MAP_B, COORDINATES and PIXELOP will be transferred later (when we know all of the values), since they depend on the rectangle in processing.

Then the main loop starts:

For each rectangle in the Spad.prclCurrent local buffer: adjust the current rectangle coordinates if necessary. We alter the coords according to format and bits/pel so that we do not have to worry about the masking associated with compressing/decompressing partial bytes.

4bpp formats are rounded to 8 pel boundaries because the destination could be planar VGA.
8bpp formats are rounded to even pel boundaries because we transmit data in 16-bit fields i.e. two pels per data field.

: call CompressRect in COMPRESS.ASM, getting as return whether or not
         the output buffer is full
: if the output buffer is full
: : break
: endif
: if no more rectangles in the local buffer and a region was supplied and
         there are more rects in engine
: : subtract the already compressed rectangles from the supplied region using
    GreCombineRectRegion (CRGN_DIFF)
: : reload local buffer with more rects from engine
: endif
endfor
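
The coordinate adjustment at the top of the loop can be sketched as follows. Several points are assumptions: the RECT_X type and round_rect name are hypothetical, the rectangle is taken to be exclusive on the right (as set up before the loop), only the horizontal coordinates are adjusted, the left edge is rounded down and the right edge up, and 16bpp data is left unrounded.

```c
#include <assert.h>

/* Rectangle with the same field order as RECTL; right/top are exclusive here. */
typedef struct { long xLeft, yBottom, xRight, yTop; } RECT_X;

/* Pel alignment required for the destination format. */
static long align_for_bpp(unsigned dst_bpp)
{
    switch (dst_bpp) {
    case 4:  return 8;  /* destination could be planar VGA        */
    case 8:  return 2;  /* data is sent in 16-bit fields (2 pels) */
    default: return 1;  /* assumption: 16bpp needs no rounding    */
    }
}

/* Widen the rectangle so that both x edges fall on alignment boundaries. */
void round_rect(RECT_X *r, unsigned dst_bpp)
{
    long a = align_for_bpp(dst_bpp);

    assert(r->xLeft >= 0 && r->xLeft < r->xRight);

    r->xLeft  = (r->xLeft / a) * a;             /* round down */
    r->xRight = ((r->xRight + a - 1) / a) * a;  /* round up   */
}
```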

We can leave the loop above for two reasons:

no more rectangles to process: reset the provided region or rectangle using GreSetRectRegion;
no more room in the output buffer: subtract the already processed rectangles from the region or the processed area from the provided rectangle.

At the end, we fill in the total number of bytes written into the packet header, re-enable the cursor with reenable_cursor in EDDMCCRS.C, exit the driver with ExitDriver, and return a code indicating whether full (return code 1) or partial (return code 2) data was returned.

CompressRect (in COMPRESS.ASM)

if free bytes in output buffer < size of (RECTANGLE HEADER + WORST_CASE_ROW_LENGTH)

: set a variable saying that there is no room in the output buffer
: return
else
: write out rect header
: for iRow = each row of rect
: : // Check for duplicate scanlines
: : if iRow > first row
: : : if row[iRow] matches row[iRow-1]
: : : : count subsequent matching rows
: : : : write duplicate scanline code + count
: : : endif
: : endif
: : // Check for duplicate scanline pairs
: : if iRow > second row and iRow < last row
: : : if row[iRow] matches row[iRow-2] and row[iRow+1] matches row[iRow-1]
: : : : count subsequent matching row pairs
: : : : write duplicate scanline pair code + count
: : : endif
: : endif
: : // Compress the row
: : call appropriate row compression function
: : if free bytes in output buffer < WORST_CASE_ROW_LENGTH
: : : set a variable saying that there is no room in the output buffer
: : : update the passed rectangle to contain the area that was not compressed
: : : return
: : endif
: endfor
return
endif
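
The "count subsequent matching rows" step of the duplicate scanline check could look like the sketch below. It assumes the rows are stored contiguously in memory with row_bytes bytes each; the count_duplicate_rows name and its parameters are hypothetical, and the real routine is written in assembler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count how many consecutive rows, starting at iRow, are identical to row
   iRow-1. A result of 0 means row iRow differs and must be compressed. */
size_t count_duplicate_rows(const unsigned char *rows, size_t row_bytes,
                            size_t row_count, size_t iRow)
{
    const unsigned char *prev;
    size_t n = 0;

    assert(iRow >= 1 && iRow <= row_count);

    prev = rows + (iRow - 1) * row_bytes;
    while (iRow + n < row_count &&
           memcmp(rows + (iRow + n) * row_bytes, prev, row_bytes) == 0)
        n++;
    return n;
}
```

The scanline pair check follows the same pattern, comparing two rows at a time against rows iRow-2 and iRow-1.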

There is a separate compression function for each of the valid src/dst format combinations (in COMPRESS.ASM):

- compress_row_16_16 (src 16bpp packed, dst 16bpp packed)

                                calling only compression routine

- compress_row_16_8 (src 16bpp packed, dst 8bpp packed)

                                calling: convert_row_16pk_8pk;
                                         compression routine

- compress_row_16_4 (src 16bpp packed, dst 4bpp packed)

                                calling: convert_row_16pk_4pk;
                                         compression routine

- compress_row_16_4pl (src 16bpp packed, dst 4bpp planar)

                                calling: convert_row_16pk_4pk;
                                         convert_row_4pk_4pl;
                                         compression routine

- compress_row_8_8 (src 8bpp packed, dst 8bpp packed)

                                calling: convert_row_8pkint_8pkext;
                                         compression routine

- compress_row_8_4 (src 8bpp packed, dst 4bpp packed)

                                calling: convert_row_8pk_4pk;
                                         compression routine

- compress_row_8_4pl (src 8bpp packed, dst 4bpp planar)

                                calling: convert_row_8pk_4pk;
                                         convert_row_4pk_4pl;
                                         compression routine

- compress_row_4_4 (src 4bpp packed, dst 4bpp packed)

                                calling: convert_row_4pkint_4pkext;
                                         compression routine

- compress_row_4_4pl (src 4bpp packed, dst 4bpp planar)

                                calling: convert_row_4pkint_4pkext;
                                         convert_row_4pk_4pl;
                                         compression routine

Conversion between different data formats (e.g. 8bpp and 4bpp) is done independently from the data compression. That is, if the source and destination formats differ, then the source data is first converted to the destination format (in an intermediate buffer defined on the PHUNK), and then compressed into the destination buffer. If the source and destination formats match then the data is compressed directly from the source to the destination. The data conversion routines use tables wherever possible to improve performance (see Section 4 for further details). The routines defined in COMPRESS.ASM are:

- convert_row_16pk_8pk (src 16bpp packed, dst 8bpp packed)

                                using NearestRestrictedDeviceDefaultPalette
                                to find, for every pixel, the nearest
                                entry in the FullSizeDeviceDefaultPalette
                                (the 256 entry default palette defined in
                                EDDDATA.C).

- convert_row_16pk_4pk (src 16bpp packed, dst 4bpp packed)

                                using NearestRestrictedDeviceDefaultPalette
                                      to find, for every pixel, the nearest
                                      entry in the StandardVGADefaultPalette

- convert_row_8pkint_8pkext (src 8bpp packed, dst 8bpp packed)

                                using  ConvertTable_8int_8ext

- convert_row_8pk_4pk (src 8bpp packed, dst 4bpp packed)

                                using  ConvertTable_8int_4ext

- convert_row_4pk_4pl (src 4bpp packed, dst 4bpp planar)

                                changes the packed to planar format

- convert_row_4pkint_4pkext (src 4bpp packed, dst 4bpp packed)

                                     using  ConvertTable_4int_4ext
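
As an illustration of what a packed to planar change such as convert_row_4pk_4pl involves, the sketch below converts a row of packed 4bpp pels into four bit planes. The bit layout is an assumption, not taken from the driver: the high nibble of each packed byte is the left pel, bit 7 of each plane byte is the leftmost pel, and plane p holds bit p of each pel index. The packed4_to_planar name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Convert 'pels' packed 4bpp pels into four bit planes of pels/8 bytes each. */
void packed4_to_planar(const unsigned char *packed, size_t pels,
                       unsigned char *planes[4])
{
    size_t x;
    int p;

    assert(pels % 8 == 0);      /* rows are rounded to 8-pel boundaries */

    for (p = 0; p < 4; p++)
        for (x = 0; x < pels / 8; x++)
            planes[p][x] = 0;

    for (x = 0; x < pels; x++) {
        unsigned char byte = packed[x / 2];
        unsigned char pel  = (x % 2 == 0) ? (unsigned char)(byte >> 4)
                                          : (unsigned char)(byte & 0x0F);
        for (p = 0; p < 4; p++)
            if (pel & (1u << p))
                planes[p][x / 8] |= (unsigned char)(0x80u >> (x % 8));
    }
}
```

This also shows why 4bpp rectangles are rounded to 8-pel boundaries: one byte of each plane covers exactly eight pels.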

Additional work is done to make the required rows available for compression. Checks are performed to ensure that a row is available before accessing it: in the XGA Driver, VRAM is not directly accessible so the required rows have to be loaded into a locked buffer in System Memory before they can be worked on. There already exists a 64K buffer in the Driver called the PHUNK (PHysical chUNK) which is ideal for this purpose. Rectangles can easily exceed 64K in size, so this buffer will often be reloaded multiple times during the processing of a single rectangle.

SetScreenBits (in SETSCR.C)

DDIENTRY SetScreenBits( HDC      hdc,
                        PBYTE    pBuffer,
                        ULONG    cBytes,
                        HRGN     hrgn,
                        PDC      pdcArg,
                        ULONG    FunN )

USER PARAMETERS:

             HDC hdc:
               any valid memory DC handle that has selected into it
               a bitmap the same size as the screen from which the source data
               was read.

             PBYTE pBuffer:
               a pointer to the memory buffer where the source (compressed)
               data is located.

             ULONG cBytes:
               the length of the memory buffer pointed to by pBuffer. This
               must be the same value as returned by the corresponding
               GetScreenBits call.

             HRGN hrgn:
               a valid region handle. The area that is updated in the memory
               bitmap by this call is added to the region identified by this
               handle.


RETURN:
              1:
               the supplied data is successfully decompressed into the memory
               bitmap;

              0:
               an error occurred.


ERRORS:
              PMERR_INV_IN_PATH
              PMERR_INV_IN_AREA
              PMERR_INV_LENGTH_OR_COUNT
              PMERR_INV_DC_TYPE
              PMERR_BITMAP_NOT_SELECTED
              PMERR_INCOMPAT_COLOR_FORMAT
              PMERR_INV_RECT
              PMERR_INV_IMAGE_DIMENSION


MAIN TASKS:
             SetScreenBits takes compressed data (generated by a previous call
             to GetScreenBits) from a buffer and decompresses it into the
             currently selected memory bitmap. The call is only valid for a
             memory DC that has a bitmap selected that is the same size as the
             screen on the machine where the GetScreenBits was performed.

             There is no clipping. If a rectangle exceeds the bitmap
             dimensions then the function will terminate immediately with an
             error logged. The bitmap may be left in a partially drawn state
             as prior rectangles may have been copied into it.

             This is a drawing primitive, therefore correlation, boundary
             accumulation and drawing could take place.

             However, we will totally ignore the function bits, and will always
             just do the drawing and nothing else.

             The routine may be passed a region handle, in which case the area
             defined by the set bits will be added to the region.

             The XGA driver may be passed 4bpp planar, 4bpp packed, 8bpp packed
             and 16bpp packed data.

The routine first enters the driver by calling EnterDriver with the EDF_STANDARD | EDF_DONT_CLEAN flags set (this routine is called whenever the driver is entered, and gets the driver semaphore and performs entry checks).

Then it performs some initial error tests, checking the FunN parameter and the DC type. If the COM_PATH or the COM_AREA flag is set in FunN, the routine logs PMERR_INV_IN_PATH or PMERR_INV_IN_AREA respectively by calling LogError, and returns 0. If the DC is not a memory DC, PMERR_INV_DC_TYPE is logged and the routine returns 0. One more check is done on the DC to see whether a bitmap is selected into it: if no bitmap is selected, PMERR_BITMAP_NOT_SELECTED is logged and the routine returns 0.

If the passed data length is zero then we can exit immediately, returning 1. The next check is that the data length is at least the size of the header, and that the passed data length agrees with the data length given in the packet header; otherwise PMERR_INV_LENGTH_OR_COUNT is logged and the routine returns 0.

The next step is to check that the combination of local format and received data format is valid: this is done by creating a word containing the internal (screen) format and the external (data received) format.

The aGSBValidDataFormats table is then scanned for a match. This table is defined as an array of valid source and destination format pairings, containing the following information for each valid pair:

typedef struct _VALID_DATA_FORMATS
{
    ULONG   ulSrcDstFormat;           // Combined src + dst codes
    PFN     pfnRowConversionRoutine;  // address of the conversion routine for
                                      // this pair
    PBYTE  *ppConvertTable;           // address to the conversion table, if
                                      // required
    PFN     pfnCreateConvertTable;    // address to a routine aimed to create a
                                      // conversion table
} VALID_DATA_FORMATS;

If matched, the format is valid; otherwise it is invalid, PMERR_INCOMPAT_COLOR_FORMAT is logged and the call returns 0. Then we check whether conversion is required for that pair; if so, the pointer to the conversion table creation routine for the required format combination is retrieved from the aGSBValidDataFormats entry and called to create the conversion table.

The conversion table creation routines for the valid format pairs are all defined in DCAFCNVT.C:

CreateConvertTable_4ext_4int: creates a 256-entry table of bytes to convert 4bpp external (default VGA) format to 4bpp internal (fudged).

We convert pairs of pels at a time, which is why the conversion table is 256 entries, rather than the expected 16.

To do this, after the memory allocation, NearestRestrictedColourIndex (in CONVINT.C) is called to map the entries in the VGA default table (see Appendix A) to the nearest entry in the Reduced16DeviceDefaultPalette table (the 16-entry XGA default palette defined in EDDDATA.C).

The remaining values in the internal table are obtained by combining pairs of conversion values, in the following way:

               for (i = 0; i < 256; i++)
               {
                 pConvertTable[i] = (abLocalConvertTable[i >> 4] << 4) |
                                    (abLocalConvertTable[i & 0x0F]) ;
               }

CreateConvertTable_4ext_8int: creates a 16-entry table of bytes to convert 4bpp to 8bpp. After the memory allocation, the NearestRestrictedColourIndex (in CONVINT.C) is called, in order to map the entries in the standard VGA table (see Appendix A) to the nearest entry in the FullSizeDeviceDefaultPalette (the 256-entry XGA default palette defined in EDDDATA.C).
CreateConvertTable_4ext_16int: creates a 16-entry table of words to convert 4bpp to 16bpp. After the memory allocation, the RGB16FromPRGB2 (macro in CONVFUNS.H) is used to transform the VGA standard palette (see Appendix A) to 16 bpp values. Note that the VGA standard palette is in RGB2 format (see Appendix A). The entries are returned in Intel format:

              Bit7..Bit 0   Bit15..Bit8
                BYTE 0        BYTE 1

and then they must be converted to Motorola format:

              Bit15...Bit8   Bit7..Bit0
                 BYTE 0        BYTE 1
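
The Intel to Motorola conversion shown above is a byte swap of each 16-bit table entry. A minimal sketch follows; the swap_table_bytes name is hypothetical and the driver may well perform the swap as each entry is generated rather than in a separate pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit entry in the table, in place. */
void swap_table_bytes(uint16_t *table, size_t entries)
{
    size_t i;

    assert(table != NULL || entries == 0);

    for (i = 0; i < entries; i++)
        table[i] = (uint16_t)((table[i] << 8) | (table[i] >> 8));
}
```

Doing the swap once, when the table is built, keeps it out of the per-pel conversion loop.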

CreateConvertTable_8ext_16int: creates a 256-entry table of words to convert 8bpp external to 16bpp internal. After the memory allocation, RGB16FromPRGB2 (macro in CONVFUNS.H) is used to transform the XGA device standard palette FullSizeDeviceDefaultPalette to 16bpp values. The entries are returned in Intel format and must then be converted to Motorola format.

Then we check that the hardware is idle before we start using the PHUNK buffers by calling WaitForRealHWFunction in HWACCESS.ASM. Then the main loop starts:

: for next byte in the input buffer < last byte in the input buffer
: : check that the next rectangle in the buffer is valid, otherwise log
: : PMERR_INV_RECT and return 0.
: : calculate some data on the current rectangle (number of row, bytes per row,
: : etc)
: : if source data is with 8-bit data fields
: : // expand the rectangle and return the pointer to the next rectangle in the
: : // input buffer
: : : call ExpandRect8
: : otherwise // source data is with 16-bit data fields
: : // expand the rectangle and return the pointer to the next rectangle in the
: : // input buffer
: : : call ExpandRect16
: : endif
: : add the just decompressed rectangle to the passed region using
    GreCombineRectRegion (CRGN_OR)
: endfor
return 1

At the end, we exit the driver by ExitDriver, returning 1.

The ExpandRect8 and ExpandRect16 calls (in EXPAND.ASM) perform the decompression of the data and, if the internal Display Driver format is different from the data format, the conversion of the data takes place. The following routines (in DCAFCNV.ASM and EXPAND.ASM) are used:

convert_row_4pl_4pk (src 4bpp planar, dst 4bpp packed) changes only the planes to linear
convert_row_4pl_4pkint (src 4bpp planar, dst 4bpp packed) calling: convert_row_4pl_4pk; convert_row_4pkext_4pkint
convert_row_4pl_8pk (src 4bpp planar, dst 8bpp packed) calling: convert_row_4pl_4pk; convert_row_4pk_8pk
convert_row_4pl_16pk (src 4bpp planar, dst 16bpp packed) calling: convert_row_4pl_4pk; convert_row_4pk_16pk
convert_row_4pkext_4pkint (src 4bpp packed, dst 4bpp packed) using ConvertTable_4ext_4int
convert_row_4pk_8pk (src 4bpp packed, dst 8bpp packed) using ConvertTable_4ext_8int
convert_row_4pk_16pk (src 4bpp packed, dst 16bpp packed) using ConvertTable_4ext_16int
convert_row_8pk_16pk (src 8bpp packed, dst 16bpp packed) using ConvertTable_8ext_16int

Conversion between different data formats (e.g. 4bpp and 8bpp) is done independently from the data expansion. That is, if the source and destination formats differ, then the source data is first expanded (in an intermediate buffer), and then converted to the destination format. If the source and destination formats match then the data is decompressed directly from the source buffer to the destination internal bitmap. As in GetScreenBits, the data conversion routines use tables wherever possible to improve performance (see Section 4 for further details).

Appendix A - Default color palette values

Default VGA (4bpp) palette

Index    RRGGBB
0:       000000
1:       000080
2:       008000
3:       008080
4:       800000
5:       800080
6:       808000
7:       808080
8:       CCCCCC
9:       0000FF
10:      00FF00
11:      00FFFF
12:      FF0000
13:      FF00FF
14:      FFFF00
15:      FFFFFF

Default XGA (8bpp) palette

Index   RRGGBB

0:              000000
1:              800000
2:              009200
3:              808000
4:              0000AA
5:              800080
6:              0092AA
7:              C1C1C1
8:              AAFFAA
9:              AAB6FF
10:             0049AA
11:             0049FF
12:             006D00
13:             006D55
14:             006DAA
15:             006DFF
16:             002400
17:             009255
18:             0024AA
19:             0092FF
20:             00B600
21:             00B655
22:             00B6AA
23:             00B6FF
24:             00DB00
25:             00DB55
26:             00DBAA
27:             00DBFF
28:             FFDBAA
29:             00FF55
30:             00FFAA
31:             FFFFAA
32:             2B0000
33:             2B0055
34:             2B00AA
35:             2B00FF
36:             2B2400
37:             2B2455
38:             2B24AA
39:             2B24FF
40:             2B4900
41:             2B4955
42:             2B49AA
43:             2B49FF
44:             2B6D00
45:             2B6D55
46:             2B6DAA
47:             2B6DFF
48:             2B9200
49:             2B9255
50:             2B92AA
51:             2B92FF
52:             2BB600
53:             2BB655
54:             2BB6AA
55:             2BB6FF
56:             2BDB00
57:             2BDB55
58:             2BDBAA
59:             2BDBFF
60:             2BFF00
61:             2BFF55
62:             2BFFAA
63:             2BFFFF
64:             550000
65:             550055
66:             5500AA
67:             5500FF
68:             552400
69:             552455
70:             5524AA
71:             5524FF
72:             554900
73:             554955
74:             5549AA
75:             5549FF
76:             556D00
77:             556D55
78:             556DAA
79:             556DFF
80:             559200
81:             559255
82:             5592AA
83:             5592FF
84:             55B600
85:             55B655
86:             55B6AA
87:             55B6FF
88:             55DB00
89:             55DB55
90:             55DBAA
91:             55DBFF
92:             55FF00
93:             55FF55
94:             55FFAA
95:             55FFFF
96:             000055
97:             800055
98:             002455
99:             8000FF
100:            802400
101:            802455
102:            8024AA
103:            8024FF
104:            804900
105:            804955
106:            8049AA
107:            8049FF
108:            806D00
109:            806D55
110:            806DAA
111:            806DFF
112:            080808
113:            0F0F0F
114:            171717
115:            1F1F1F
116:            272727
117:            2E2E2E
118:            363636
119:            3E3E3E
120:            464646
121:            4D4D4D
122:            555555
123:            5D5D5D
124:            646464
125:            6C6C6C
126:            747474
127:            7C7C7C
128:            FFDB00
129:            8B8B8B
130:            939393
131:            9B9B9B
132:            FFB6FF
133:            AAAAAA
134:            B2B2B2
135:            B9B9B9
136:            0024FF
137:            CCCCCC
138:            D1D1D1
139:            D8D8D8
140:            FFB6AA
141:            E8E8E8
142:            F0F0F0
143:            F7F7F7
144:            FFDBFF
145:            809255
146:            8092AA
147:            8092FF
148:            80B600
149:            80B655
150:            80B6AA
151:            80B6FF
152:            80DB00
153:            80DB55
154:            80DBAA
155:            80DBFF
156:            80FF00
157:            80FF55
158:            80FFAA
159:            80FFFF
160:            AA0000
161:            AA0055
162:            AA00AA
163:            AA00FF
164:            AA2400
165:            AA2455
166:            AA24AA
167:            AA24FF
168:            AA4900
169:            AA4955
170:            AA49AA
171:            AA49FF
172:            AA6D00
173:            AA6D55
174:            AA6DAA
175:            AA6DFF
176:            AA9200
177:            AA9255
178:            AA92AA
179:            AA92FF
180:            AAB600
181:            AAB655
182:            AAB6AA
183:            004955
184:            AADB00
185:            AADB55
186:            AADBAA
187:            AADBFF
188:            AAFF00
189:            AAFF55
190:            004900
191:            AAFFFF
192:            D50000
193:            D50055
194:            D500AA
195:            D500FF
196:            D52400
197:            D52455
198:            D524AA
199:            D524FF
200:            D54900
201:            D54955
202:            D549AA
203:            D549FF
204:            D56D00
205:            D56D55
206:            D56DAA
207:            D56DFF
208:            D59200
209:            D59255
210:            D592AA
211:            D592FF
212:            D5B600
213:            D5B655
214:            D5B6AA
215:            D5B6FF
216:            D5DB00
217:            D5DB55
218:            D5DBAA
219:            D5DBFF
220:            D5FF00
221:            D5FF55
222:            D5FFAA
223:            D5FFFF
224:            FFDB55
225:            FF0055
226:            FF00AA
227:            FFFF55
228:            FF2400
229:            FF2455
230:            FF24AA
231:            FF24FF
232:            FF4900
233:            FF4955
234:            FF49AA
235:            FF49FF
236:            FF6D00
237:            FF6D55
238:            FF6DAA
239:            FF6DFF
240:            FF9200
241:            FF9255
242:            FF92AA
243:            FF92FF
244:            FFB600
245:            FFB655
246:            E0E0E0
247:            A2A2A2
248:            838383
249:            FF0000
250:            00FF00
251:            FFFF00
252:            0000FF
253:            FF00FF
254:            00FFFF
255:            FFFFFF

XGA 16bpp format

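This is a 5-6-5 format, i.e. each 16-bit pixel is laid out as follows, reading from the most significant bit to the least significant bit: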

rrrrrggggggbbbbb

r = red (5 bits)
g = green (6 bits)
b = blue (5 bits)
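
The bit layout above can be illustrated with a minimal C sketch. It assumes 8-bit source components and treats the pixel as a 16-bit value with red in the top five bits. The helper names pack_rgb565 and unpack_rgb565 are hypothetical and are not part of the driver. The byte order in which the adapter stores the 16-bit value in memory is not covered here.

```c
#include <stdint.h>

/* Hypothetical helper: pack 8-bit R, G, B into rrrrrggggggbbbbb
   by keeping the top 5, 6 and 5 bits of each component. */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Hypothetical helper: expand a 5-6-5 pixel back to 8-bit components.
   The high bits are replicated into the low bits so that the maximum
   field value maps to 0xFF rather than 0xF8 or 0xFC. */
static void unpack_rgb565(uint16_t pixel, uint8_t *r, uint8_t *g, uint8_t *b)
{
    uint8_t r5 = (uint8_t)((pixel >> 11) & 0x1F);
    uint8_t g6 = (uint8_t)((pixel >> 5) & 0x3F);
    uint8_t b5 = (uint8_t)(pixel & 0x1F);

    *r = (uint8_t)((r5 << 3) | (r5 >> 2));
    *g = (uint8_t)((g6 << 2) | (g6 >> 4));
    *b = (uint8_t)((b5 << 3) | (b5 >> 2));
}
```

Packing is lossy: the low 3 bits of red and blue and the low 2 bits of green are discarded, so a pack followed by an unpack does not in general reproduce the original 8-bit values.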

   typedef struct _RGB2      /* rgb2 */
   {
      BYTE bBlue;            /* Blue component of the color definition  */
      BYTE bGreen;           /* Green component of the color definition */
      BYTE bRed;             /* Red component of the color definition   */
      BYTE fcOptions;        /* Reserved, must be zero                  */
   } RGB2;
   typedef RGB2 *PRGB2;
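
As an illustration of the field order, the following sketch fills an RGB2 from a 24-bit value written as RRGGBB. It assumes the palette values listed in the table are in red, green, blue order. The function name rgb2_from_hex is hypothetical, and the BYTE typedef and the structure are repeated here only so that the fragment compiles without the OS/2 headers.

```c
#include <stdint.h>

typedef unsigned char BYTE;  /* stand-in for the OS/2 toolkit definition */

typedef struct _RGB2
{
    BYTE bBlue;
    BYTE bGreen;
    BYTE bRed;
    BYTE fcOptions;
} RGB2;

/* Hypothetical helper: split a 0xRRGGBB value into an RGB2.
   Note that blue is the first byte of the structure in memory. */
static RGB2 rgb2_from_hex(uint32_t rrggbb)
{
    RGB2 rgb;

    rgb.bRed      = (BYTE)((rrggbb >> 16) & 0xFF);
    rgb.bGreen    = (BYTE)((rrggbb >> 8) & 0xFF);
    rgb.bBlue     = (BYTE)(rrggbb & 0xFF);
    rgb.fcOptions = 0;       /* reserved, must be zero */
    return rgb;
}
```

For example, passing 0x2BB6AA yields bRed = 0x2B, bGreen = 0xB6 and bBlue = 0xAA.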