KEYBOARD.DCP File Format
Written by Martin Lafaix
Introduction
What's That?
Keyboard layouts are stored in KEYBOARD.DCP. These layouts are used when converting a raw scancode (that is, the value sent by the keyboard controller) to its corresponding value (that is, a symbol, like "A", or a virtual key code, like F1 or CapsLock).
Knowing this file format would allow us to create customized layouts, or new layouts for specific keyboards, or whatever. Wouldn't it be nice?
Another advantage would be the possibility for us to write our own keyboard handler; that is, something which converts a scancode to a key value. Naturally, we can already do that, but we have to define the keyboard layout, which is, er, boring? Reinventing the wheel is not always fun!
Why?
Well, to please our beloved editor. <grin>
Contents
This article describes the OS/2 2.x KEYBOARD.DCP file format. It has three main parts: the first describes the format itself, the second explains how to translate a raw scancode to an OS/2 key code, and the third describes a keyboard layout manipulation tool.
Credit
The first part of this paper is mainly based upon Ned Konz's SWAPDCP tool, available from your favorite ftp site.
KEYBOARD.DCP File Format
How is KEYBOARD.DCP Organized?
The first four bytes of KEYBOARD.DCP contain the index table offset (0-based), ito.
The first two bytes of the index table contain the index entry count, iec.
Following this index entry count are iec Index Entries. Each index entry is as follows:
typedef struct
{
  WORD    word1;
  BYTE    Country[2];       /* e.g. "US"  */
  BYTE    SubCountryID[4];  /* e.g. "153" */
  WORD    word2;
  WORD    XTableID;         /* e.g. 0x1b5 (437) */
  WORD    KbdType;
  ULONG   HeaderLocation;   /* of beginning of table (header) */
} IndexEntry;
Figure 1. The Index Entry structure.
| Field | Description | 
|---|---|
| word1 | unknown | 
| Country | The country name abbreviation ("US", "FR", ...). Note: The byte ordering is reversed. That is, the first character of the abbreviation is in Country[1], and the second character is in Country[0]. | 
| SubCountryID | The country's keyboard layout ID ("153 ", "189 ", "120 ",...). In some countries (UK, France, Italy, ...) there exist different "main keyboard" layouts. This field reflects this information. | 
| word2 | unknown | 
| XTableID | The keyboard's layout codepage (437, 850, ...). | 
| KbdType | The keyboard type (0 for an 89-key keyboard, 1 for a 101/102-key keyboard). | 
| HeaderLocation | The corresponding layout table offset (0-based). | 
So, to find a specific keyboard layout, we have to (1) read the first four bytes to find the index table and (2) locate the specified index entry (Country, SubCountryID, XTableID and keyboard type). If such an entry exists, its HeaderLocation field contains the Keyboard Layout Table Entry offset.
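The two lookup steps above can be sketched in C as follows. This is only a sketch under stated assumptions: the file is already loaded in memory, all multi-byte values are little-endian, and the Index Entry is packed on 18 bytes with no padding. The names find_layout, rd16 and rd32 are hypothetical helpers, not part of any OS/2 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { INDEX_ENTRY_SIZE = 18 };  /* 2+2+4+2+2+2+4 bytes, assuming no padding */

/* Read little-endian values byte by byte, so that host alignment and
   endianness do not matter (little-endian storage is an assumption). */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the HeaderLocation of the first matching entry, or 0 when
   nothing matches or the data is too short. country is given in natural
   order ("US"); sub is the four-character, space-padded ID ("153 "). */
static uint32_t find_layout(const uint8_t *file, size_t size,
                            const char *country, const char *sub,
                            uint16_t codepage, uint16_t kbd_type)
{
    assert(file != NULL && strlen(country) == 2 && strlen(sub) == 4);
    if (size < 4) return 0;
    uint32_t ito = rd32(file);                 /* index table offset */
    if (ito > size || size - ito < 2) return 0;
    uint16_t iec = rd16(file + ito);           /* index entry count */
    if ((size - ito - 2) / INDEX_ENTRY_SIZE < iec) return 0;
    const uint8_t *e = file + ito + 2;
    for (uint16_t i = 0; i < iec; i++, e += INDEX_ENTRY_SIZE) {
        /* Country is stored reversed: Country[1] holds the first letter. */
        if (e[3] == (uint8_t)country[0] && e[2] == (uint8_t)country[1] &&
            memcmp(e + 4, sub, 4) == 0 &&
            rd16(e + 10) == codepage && rd16(e + 12) == kbd_type)
            return rd32(e + 14);               /* HeaderLocation */
    }
    return 0;
}
```

A caller would pass the whole file contents and, for example, "US", "153 ", 437 and 1 to look for the US 101/102-key layout in codepage 437.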
The Keyboard Layout Table Entry
Each keyboard layout table entry contains a header, followed by key and accent definitions. The header is as follows:
/* code page header */
typedef struct XHeader
{
  WORD    XTableID;	  /* code page number */
  /* note: 32-bit wide field */
  struct
  {
    /* which shift key or key combo affects Char3 of each KeyDef */
    BITFIELD    ShiftAlt     :1; /* use shift-alt instead of ctrl-alt */
    BITFIELD    AltGrafL     :1; /* use left alt key as alt-graphics */
    BITFIELD    AltGrafR     :1; /* use right alt key as alt-graphics */
    /* other modifiers */
    BITFIELD    ShiftLock    :1; /* treat caps lock as shift lock */
    BITFIELD    DefaultTable :1; /* default table for the language */
    BITFIELD    ShiftToggle  :1; /* TRUE: toggle, FALSE: latch shiftlock */
    BITFIELD    AccentPass   :1; /* TRUE: pass on accent keys and beep,
                                    FALSE: just beep */
    BITFIELD    CapsShift    :1; /* caps-shift uses Char5 */
    BITFIELD    MachDep      :1; /* machine-dependent table */
    /* Bidirectional modifiers */
    BITFIELD    RTL          :1; /* Right-To-Left orientation */
    BITFIELD    LangSel      :1; /* TRUE: National language layout
                                    FALSE: English language layout */
    /* default layout indicator */
    BITFIELD    DefaultLayout:1; /* default layout for the country */
  } XTableFlags1;
  WORD    KbdType;		/* keyboard type */
  WORD    KbdSubType;		/* keyboard sub-type */
  WORD    XtableLen;		/* length of table */
  WORD    EntryCount;		/* number of KeyDef entries */
  WORD    EntryWidth;		/* width in bytes of KeyDef entries */
  BYTE    Country[2];		/* country ID, i.e. "US" */
  WORD    TableTypeID;		/* Table type, 0001=OS/2 */
  BYTE    SubCountryID[4];	/* sub-country ID, ASCII, i.e. "153 " */
  WORD    Reserved[8];
} XHeader;
Figure 2. The Table header structure.
| Field | Description | 
|---|---|
| XTableID | The keyboard layout codepage. | 
| XTableFlags1 | Layout's flags (see the Layout Flags subsection below). | 
| KbdType | The keyboard type (0 = 89 keys, 1 = 101/102 keys). | 
| KbdSubType | The keyboard subtype (??? 0). | 
| XtableLen | The table length. The length (in bytes) includes this header. | 
| EntryCount | The number of KeyDef entries. | 
| EntryWidth | The width in bytes of KeyDef entries. | 
| Country | The country ID (bytes reversed, that is, you get "SU" for US). | 
| TableTypeID | The table type (1 for OS/2). | 
| SubCountryID | The subcountry ID ("153", "189", "120", ...). | 
| Reserved | Unknown. | 
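As an illustration, here is a minimal C sketch that decodes the 40-byte header from a raw buffer. It assumes little-endian storage and a packed layout (2 bytes of XTableID, 4 bytes of flags, then the remaining fields in declaration order). LayoutHeader and parse_header are hypothetical names introduced for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { XHEADER_SIZE = 40 };   /* 2 + 4 + 5*2 + 2 + 2 + 4 + 8*2 bytes */

typedef struct {
    uint16_t codepage;        /* XTableID */
    uint32_t flags;           /* XTableFlags1, raw 32-bit value */
    uint16_t kbd_type, kbd_subtype;
    uint16_t table_len, entry_count, entry_width;
    char     country[3];      /* put back in natural order, NUL-terminated */
    uint16_t table_type;
    char     sub_country[5];  /* NUL-terminated copy */
} LayoutHeader;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* h must point to at least XHEADER_SIZE readable bytes. */
static void parse_header(const uint8_t *h, LayoutHeader *out)
{
    assert(h != NULL && out != NULL);
    out->codepage    = rd16(h + 0);
    out->flags       = rd32(h + 2);
    out->kbd_type    = rd16(h + 6);
    out->kbd_subtype = rd16(h + 8);
    out->table_len   = rd16(h + 10);
    out->entry_count = rd16(h + 12);
    out->entry_width = rd16(h + 14);
    out->country[0]  = (char)h[17];   /* bytes are stored reversed */
    out->country[1]  = (char)h[16];
    out->country[2]  = '\0';
    out->table_type  = rd16(h + 18);
    memcpy(out->sub_country, h + 20, 4);
    out->sub_country[4] = '\0';
    /* bytes 24..39 are the Reserved words and are ignored here */
}
```

Reading the fields one by one, rather than casting the buffer to a struct, avoids depending on the compiler's padding and bit-field rules.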
This table header is followed by EntryCount KeyDef entries. Each KeyDef entry is as follows:
/* Key definition, one per scan code in table */
typedef struct
{
  WORD    XlateOp;
  BYTE    Char1;
  BYTE    Char2;
  BYTE    Char3;
  BYTE    Char4;
  BYTE    Char5;
} KeyDef;
Figure 3. The KeyDef structure.
| Field | Description | 
|---|---|
| XlateOP | The 9 lower bits specify the key type (see the key type subsection below). The 7 higher bits specify the allowed accents. | 
| char1 | The "standard" value | 
| char2 | The "shifted" value | 
| char3 | The "Alted" value | 
| char4 | | 
| char5 | | 
The specific meaning of the charx fields depends on the XlateOP value, as explained in the key type subsection. The default value of the EntryWidth field (in the header) is 7; if it is bigger, there are additional charx fields in the KeyDef structure. (Namely, there are EntryWidth - sizeof(XlateOP) charx fields, with sizeof(XlateOP) being 2.)
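The following C sketch shows one way to pick a KeyDef out of the raw table while honoring EntryWidth. It assumes XlateOP is stored little-endian. KeyDefView and keydef_at are hypothetical names, and the function takes a plain entry index: how indices map to scancodes is not assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned       key_type;  /* 9 low bits of XlateOP */
    unsigned       accents;   /* 7 high bits of XlateOP */
    const uint8_t *chars;     /* char1, char2, ... */
    size_t         nchars;    /* EntryWidth - 2 */
} KeyDefView;

/* defs points to the first KeyDef, right after the 40-byte header.
   Returns 1 on success, 0 when index or entry_width is out of range. */
static int keydef_at(const uint8_t *defs, uint16_t entry_count,
                     uint16_t entry_width, unsigned index, KeyDefView *out)
{
    assert(defs != NULL && out != NULL);
    if (index >= entry_count || entry_width < 2) return 0;
    const uint8_t *p = defs + (size_t)index * entry_width;
    unsigned op = p[0] | ((unsigned)p[1] << 8);
    out->key_type = op & 0x1FF;
    out->accents  = op >> 9;
    out->chars    = p + 2;
    out->nchars   = (size_t)entry_width - 2;
    return 1;
}
```

With the default EntryWidth of 7, nchars is 5, which matches the char1..char5 fields of Figure 3.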
These EntryCount KeyDef entries are then followed by the Accent Table, which contains seven Accent Table Entries (one per possible accent - if there are more than seven accents, the seventh entry contains the additional entries).
Each Accent Table Entry is as follows:
/*  Accent Table Entry, one per accent, up to seven accents */
typedef struct
{
  CHAR charOrg;			/* The key's ASCII value, e.g. "a" */
  CHAR charRes;			/* The resulting ASCII value, e.g. "à" */
} TRANS;
typedef struct
{
  BYTE AccentGlyph;		/* What to show while waiting for a key */
  BYTE byte1;
  BYTE byte2;
  BYTE byte3;
  BYTE byte4;
  BYTE byte5;
  TRANS aTrans[20];		/* The allowed substitutions */
} AccentTableEntry;
Figure 4. The AccentTableEntry structure.
The seventh entry has a slightly different format. If there are more than six accents, its first byte contains the length of the seventh Accent Table Entry. This entry is then followed by a byte whose content is the length of the eighth entry, and so on:
Figure 5. The seventh Accent Table Entry structure.
There's no "end of entry" indicator. Use the XtableLen field to check it:
AccentTableEntryLen = XtableLen - sizeof(XHeader) - EntryCount * EntryWidth.
The first six entries take 6*sizeof(AccentTableEntry) = 276 bytes. The remaining Accent Table entries fit in AccentTableEntryLen-276 bytes. When the sum of the size of the additional entries (that is, l7 + l8 + ...) reaches this value, it's done.
If there are fewer than seven accents, the first byte of the seventh entry is 0x00.
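The size arithmetic above can be written down as a small C sketch. The constants 40 and 46 come from the structure sizes given earlier. The walk over the additional entries assumes that each length byte counts the whole entry, itself included, which is how the "l7 + l8 + ... reaches this value" rule reads; that interpretation has not been checked against a real file. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { XHEADER_SIZE = 40, ACCENT_ENTRY_SIZE = 46, FIXED_ACCENT_ENTRIES = 6 };

/* AccentTableEntryLen = XtableLen - sizeof(XHeader) - EntryCount * EntryWidth */
static long accent_table_len(uint16_t table_len, uint16_t entry_count,
                             uint16_t entry_width)
{
    return (long)table_len - XHEADER_SIZE - (long)entry_count * entry_width;
}

/* Bytes left for the seventh and following entries. */
static long extra_accent_bytes(uint16_t table_len, uint16_t entry_count,
                               uint16_t entry_width)
{
    return accent_table_len(table_len, entry_count, entry_width)
           - FIXED_ACCENT_ENTRIES * ACCENT_ENTRY_SIZE;
}

/* Counts the additional (seventh, eighth, ...) entries. Assumption: each
   entry starts with a length byte that covers the entire entry. A zero
   length means "no more entries". */
static int count_extra_accents(const uint8_t *extra, long extra_len)
{
    assert(extra != NULL);
    int count = 0;
    long pos = 0;
    while (pos < extra_len) {
        uint8_t len = extra[pos];
        if (len == 0 || len > extra_len - pos) break;
        count++;
        pos += len;
    }
    return count;
}
```

For a table with 2 KeyDefs of width 7 and an XtableLen of 335, the accent area is 281 bytes long, of which 5 bytes remain after the six fixed entries.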
So, the accent-key+ASCII-key translation process is quite easy: when an accent key is pressed, just remember the accent code, and optionally display the corresponding glyph (AccentGlyph). Then, wait for another key to be pressed. If this key accepts the remembered accent (that is, the corresponding bit in the 7 high bits of XlateOP is set), locate the corresponding charRes in the aTrans array of the Accent Table Entry (yes, you'll have to browse this array until you find the right charOrg!). If the pressed key does not accept the remembered accent (or if you can't find the corresponding charOrg in aTrans), just beep. You're done!
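A minimal C sketch of this lookup follows, restricted to the six fixed-size entries. It makes one assumption that the text does not settle: accent number n is taken to correspond to the n-th of the 7 high bits of XlateOP (accent 1 is bit 9). apply_accent is a hypothetical name, and a return value of 0 stands for "just beep".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ACCENT_ENTRY_SIZE = 46, TRANS_OFFSET = 6, TRANS_COUNT = 20 };

/* accent_table points to the first Accent Table Entry.
   accent is the remembered accent code (1..6 here).
   xlate_op is the XlateOP word of the key pressed after the accent key.
   ch is the value that key would normally produce.
   Returns charRes, or 0 when the caller should beep. */
static uint8_t apply_accent(const uint8_t *accent_table, unsigned accent,
                            uint16_t xlate_op, uint8_t ch)
{
    assert(accent_table != NULL && accent >= 1 && accent <= 6 && ch != 0);
    /* Assumed mapping: accent n <-> bit (9 + n - 1) of XlateOP. */
    if (((xlate_op >> 9) & (1u << (accent - 1))) == 0)
        return 0;
    const uint8_t *t = accent_table + (accent - 1) * ACCENT_ENTRY_SIZE
                       + TRANS_OFFSET;
    for (int i = 0; i < TRANS_COUNT; i++) {
        if (t[2 * i] == ch)          /* charOrg matches */
            return t[2 * i + 1];     /* charRes */
    }
    return 0;
}
```

Handling a seventh or later accent would additionally require walking the variable-length entries, which this sketch leaves out.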
Layout Flags
Various flags describe the layout's behavior.
| Flag | Description | 
|---|---|
| ShiftAlt | When this flag is 1, it allows you to use "Shift+Alt" instead of "Ctrl+Alt" when accessing the third glyph of a key. (With 89-key keyboards.) | 
| AltGrafL | When this flag is 1, the left "Alt" key is used for "AltGr". (With 101/102-key keyboards.) | 
| AltGrafR | When this flag is 1, the right "Alt" key is used for "AltGr". (With 101/102-key keyboards.) | 
| ShiftLock | When this flag is 1, "CapsLock" acts as a "ShiftLock" key. That is, when CapsLock is ON, pressing a "Shift" key unsets it. When this flag is 0, pressing a "Shift" key temporarily toggles the CapsLock state, but it is restored when the "Shift" key is released. | 
| DefaultTable | When this flag is 1, the layout uses the default country codepage. | 
| ShiftToggle | With 89-keys keyboards, set this flag in conjunction with ShiftLock. That is, when ShiftLock is 1, set ShiftToggle to 1, and when ShiftLock is 0, set ShiftToggle to 0. On 101/102-keys keyboards, set this flag to 0. | 
| AccentPass | When this flag is set to 1, accent keys (a.k.a. dead keys) are allowed. | 
| CapsShift | Unknown. It's 1 for all Swiss keyboards, 0 otherwise. | 
| MachDep | Set this flag to 1 when there's more than one physical layout sharing the same country code. See the DefaultLayout flag below. | 
| RTL | When this flag is 1, the layout uses the Bidirectional Languages support. (RTL stands for Right-to-Left.) | 
| LangSel | When this flag is 1, the layout is a National one (Arabic or Hebrew, usually). When this flag is 0, the layout is an English one. Note: When RTL is 0, this flag is not used - set it to 0. | 
| DefaultLayout | When MachDep is 1, set this flag to 1 to denote the fact that this layout is the default one. Otherwise, set it to 0. | 
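For those who prefer masks to bit-fields, here is a C sketch of the flags as bit masks, together with a small checker for the "set this flag to..." advice given in the table. The bit positions assume that the first declared bit-field sits in the least significant bit, which is usual for x86 compilers but is not confirmed by the file itself. The XF_ names and flags_follow_guidance are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed positions: declaration order, starting at bit 0. */
enum {
    XF_SHIFT_ALT      = 1 << 0,
    XF_ALT_GRAF_L     = 1 << 1,
    XF_ALT_GRAF_R     = 1 << 2,
    XF_SHIFT_LOCK     = 1 << 3,
    XF_DEFAULT_TABLE  = 1 << 4,
    XF_SHIFT_TOGGLE   = 1 << 5,
    XF_ACCENT_PASS    = 1 << 6,
    XF_CAPS_SHIFT     = 1 << 7,
    XF_MACH_DEP       = 1 << 8,
    XF_RTL            = 1 << 9,
    XF_LANG_SEL       = 1 << 10,
    XF_DEFAULT_LAYOUT = 1 << 11
};

/* Returns 1 when the flags follow the advice of the table above:
   - 89-key keyboards (type 0): ShiftToggle equals ShiftLock;
   - 101/102-key keyboards (type 1): ShiftToggle is 0;
   - LangSel is 0 when RTL is 0;
   - DefaultLayout is 0 when MachDep is 0. */
static int flags_follow_guidance(uint32_t flags, uint16_t kbd_type)
{
    assert(kbd_type <= 1);
    int lock   = (flags & XF_SHIFT_LOCK) != 0;
    int toggle = (flags & XF_SHIFT_TOGGLE) != 0;
    if (kbd_type == 0 && toggle != lock) return 0;
    if (kbd_type == 1 && toggle) return 0;
    if (!(flags & XF_RTL) && (flags & XF_LANG_SEL)) return 0;
    if (!(flags & XF_MACH_DEP) && (flags & XF_DEFAULT_LAYOUT)) return 0;
    return 1;
}
```

Such a check is only a sanity aid when building a new layout; nothing here says that existing layouts always obey these rules.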
Key Type
The "Key type" value specifies the meaning of the charx fields in the KeyDef entries.
Note: In the following table, "xxx'ed" means holding down xxx while pressing the key. And, if an entry contains "???", well, its meaning is not completely known...
| Value | Meaning | 
|---|---|
| 0 | An empty entry. That is, no key produces this scan code. | 
| 0x01 | AlphaKey. This is an alphabetical key. The char1 field contains the unshifted key value. The char2 field contains the shifted key value. If the "accent" bits are not null, they specify the allowed accents. Each AlphaKey can generate a "Ctrl'ed" value, when used in conjunction with a "Ctrl" key. In this case, the generated value is char1-96. | 
| 0x02 | SpecKey. This key generates an unshifted value (char1) and a shifted value (char2). It does not generate an "Alted", "AltGr'ed" or "Ctrl'ed" value. | 
| 0x03 | SpecKeyC. This key can generate a value when "AltGr'ed". char3 contains this (optional) value: if char3 is non-null, it's the value. If char3 is less than 32, the value is an accent. Otherwise, it's a "normal" symbol. char1 and char2 contain the unshifted and shifted key values. When CapsLock is ON, the order is reversed. That is, the unshifted value is char2 while the shifted value is char1. | 
| 0x04 | SpecKeyA. This key can generate a value (char3) when "AltGr'ed". char1 and char2 contain the unshifted and the shifted key value, respectively. It does not depend on the CapsLock value (that is, char1 is always the unshifted value while char2 is always the shifted value, whether CapsLock is ON or not). | 
| 0x05 | SpecKeyCA ??? | 
| 0x06 | FuncKey. The char1 field contains the function key number. All other fields contain 0. | 
| 0x07 | PadKey. This is a "NumPad" key. The char1 field contains the padkey index (0 = "7", 1 = "8", 2 = "9", 3 = "-", 4 = "4", 5 = "5", 6 = "6", 7 = "+", 8 = "1", 9 = "2", 10 = "3", 11 = "0" and 12 = "."). | 
| 0x08 | SpecCtlKey. This key generates a "control" code (that is, an ASCII code in range 0..31). char1 contains the unshifted control code, while char2 contains the shifted control code. All other fields contain 0. | 
| 0x09 | The PrtSc key. | 
| 0x0a | The SysReq key. | 
| 0x0b | AccentKey. This key generates an "accent" code. char1 is the unshifted accent (in range 1..7) and char2 is the shifted accent (also in range 1..7). char5 has the value of char1. If char3 is not null, it's the value generated when "Alted". char4 is 0. | 
| 0x0c | ShiftKey. A shift or control key. If char1 is 0x1 it's the right "Shift" key. It's the left one if char1 is 0x2. If char1 is 0x4, it's a "Ctrl" key. In this case, char2 is 0x1 and char3 is 0x4, and char4 & char5 are 0. Otherwise, char2..char5 are 0. | 
| 0x0d | ToggleKey ??? | 
| 0x0e | The Alt key. | 
| 0x0f | The NumLock key. | 
| 0x10 | The CapsLock key. | 
| 0x11 | The ScrollLock key. | 
| 0x12 | XShiftKey ??? | 
| 0x13 | XToggleKey ??? | 
| 0x14 | SpecKeyCS. When CapsLock is OFF, this key generates char1 when unshifted and char2 when shifted. When CapsLock is ON, this key generates char4 when unshifted and char5 when shifted. When used in conjunction with the "AltGr" key, this key generates char3 (whether CapsLock is ON or not). If char3 is less than 32, it's an accent. | 
| 0x15 | SpecKeyAS ??? (my guess: it's like SpecKeyCS, except that char3 is a "control" code, that is, an ASCII value in range 1..31). | 
| 0x1a | ExtExtKey. The new "cursor" keys. That is, the keys which are missing on an 89-key keyboard. | 
| <other> | unknown | 
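The table above translates naturally into a lookup table. The C sketch below maps a key type to its name, a PadKey index to its key cap, and an AlphaKey char1 to its "Ctrl'ed" value. The names used for the entries that the table only describes in words (PrtSc, SysReq, Alt, and so on) are labels chosen for this example, and the functions are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

static const char *const key_type_names[] = {
    "Empty", "AlphaKey", "SpecKey", "SpecKeyC", "SpecKeyA", "SpecKeyCA",
    "FuncKey", "PadKey", "SpecCtlKey", "PrtSc", "SysReq", "AccentKey",
    "ShiftKey", "ToggleKey", "Alt", "NumLock", "CapsLock", "ScrollLock",
    "XShiftKey", "XToggleKey", "SpecKeyCS", "SpecKeyAS",
    NULL, NULL, NULL, NULL,        /* 0x16..0x19: not listed in the table */
    "ExtExtKey"                    /* 0x1a */
};

/* type is the 9 low bits of XlateOP. */
static const char *key_type_name(unsigned type)
{
    size_t n = sizeof key_type_names / sizeof key_type_names[0];
    if (type >= n || key_type_names[type] == NULL)
        return "unknown";
    return key_type_names[type];
}

/* PadKey: char1 is an index into the numeric pad, 0 = "7" ... 12 = ".".
   Returns 0 for an index outside 0..12. */
static int pad_key_char(unsigned index)
{
    static const char pad[] = "789-456+1230.";
    return index < sizeof pad - 1 ? pad[index] : 0;
}

/* AlphaKey: the "Ctrl'ed" value is char1 - 96. */
static int alpha_ctrl_value(unsigned char char1)
{
    assert(char1 > 96);
    return char1 - 96;
}
```

For instance, an AlphaKey whose char1 is "a" (97) yields 1 when "Ctrl'ed", and "z" (122) yields 26.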
ScanCode to Key Value Conversion
In this section, we'll describe the keyboard scancode to ASCII char conversion scheme. Doing such a conversion is required when you want to write your own keyboard handler, or when, for any other reason, you have to deal directly with scancodes.
Portions of the code are given in REXX. Please refer to SHOWDCP.CMD for missing functions.
And, it's just a scheme, it's not a complete and fully-functional scancode to key value converter. <grin>
Determining the Required Keyboard Layout
The first thing to do is to load the correct keyboard layout. We first have to find the current Country/CodePage value by using the DosQueryCp/DosQueryCtryInfo functions (Refer to Control Program Guide and Reference for more information on those two functions).
We then have to find the current keyboard type - that is, an old (89 keys) one or a "new" one (101/102 keys). If you know how to do this, please, let me know!
The last step is to find the user's desired keyboard layout. The easiest way to do this is probably to scan the CONFIG.SYS, or to provide a command parameter. (We need both country abbrev and subcountry code.)
We can then load the corresponding keyboard layout.
/* Loading the keyboard layout
**
** rcp is current codepage
** rcn is current country abbrev (US, FR, ...)
** rss is current subcountry (153, 189, 120)
** rty is current keyboard type
*/
ito = readl()
call charin infile,ito
iec = readw()
do iec
   call getindex
   if (country = rcn) & (rss = subcntr) & (rcp = cp) & (rty = type) then
      leave
end /* do */
if (country \= rcn) | (rss \= subcntr) | (rcp \= cp) | (rty \= type) then
   do
      say "Keyboard layout not found!"
      exit
   end
call getentry offset
Having read the layout header, we then have to read the corresponding KeyDefs:
do i = 1 to EntryCount
   call getkeydef
   /* here, we have to store it somewhere... */
   ...
end
And then comes the last initialization step:
Determining the Accent Key Conversion Table
free = tablelen - 40 - entrycount * entrywidth
j = 1
empty = 1
do while free > 0
   call getaccententry j
   /* We here have to store it somewhere ... */
   ...
   j = j + 1
   free = free - len
end /* do */
We are now ready...
Converting a Scancode to a Key Value
The first thing to do is to maintain some Boolean values containing various special keys status (CapsLock, NumLock, ScrollLock and Alt/Shift/Ctrl). We have to remember the last accent key pressed, too.
We then have to handle each key type.
/* scan is the current key scancode */
type = key.scan.keytype
select
   /* One of the many "toggle" key */
   when type = 'CAPSLOCK' then
      CapsLock = \CapsLock
   ...
   when type = 'ALPHAKEY' then
      do
         if \PendingAccent then
            select
               when CtrlPressed then code = key.scan.char1 - 96
               when AltPressed then code = '00'x||scan
               when ShiftPressed & \CapsLock then code = key.scan.char2
               when ShiftPressed then code = key.scan.char1
               when CapsLock then code = key.scan.char2
               otherwise code = key.scan.char1
            end /* select */
         else
            select
               when \allowedAccent() | CtrlPressed | AltPressed then call BEEP
               when ShiftPressed & \CapsLock then code = addAccent(key.scan.char2)
               when ShiftPressed then code = addAccent(key.scan.char1)
               when CapsLock then code = addAccent(key.scan.char2)
               otherwise code = addAccent(key.scan.char1)
            end
      end
   ...
   otherwise
      /* Not a known type! */
      say 'Type 'type' unknown!'
end
The complete implementation is, err, left as an exercise. <grin>
Using Keyboard Layouts
A very important note: be sure to have a safe copy of your original KEYBOARD.DCP before any experimentation! You've been warned. <grin>
In SHOWDCP.ZIP is included a REXX script which allows you to explore and modify keyboard layouts. Its usage is as follows:
Usage: showdcp [param] file [country [subcountry [cp [type]]]] [file2]
  -h       - Access Help ;
  -v[n]    - View matching layouts. n is the detail level ;
  -x       - Extract matching layouts ;
  -a       - Add layout to file ;
  -ds,t,c  - Define a key ;
  -sk1,k2  - Swap key k1 and key k2.
  country    = country code (US, FR, ...) or *
  subcountry = subcountry code (189, 120, ...) or *
  cp         = code page (437, 850, ...) or *
  type       = keyboard type (0 = 89 keys, 1 = 101/102 keys) or *
Figure 6. The showdcp command usage
| Param | Description | 
|---|---|
| -h | Shows usage information (see Figure 6); | 
| -v[n] | Displays keyboard layouts which match the given specification. n is the detail level, in range 0-4 (1 is the default); | 
| -x | Extracts the matching layouts in file2; | 
| -a | Adds the layouts contained in file2 to file (if file does not exist, it's created); | 
| -ds,t,d | Defines the key associated with scancode s. t is the key type and d is the definition. It's a hexadecimal string (see Example 3 below); | 
| -sk1,k2 | Swaps key definitions. k1 and k2 are the scancodes to swap. | 
Example 1
If you want to create a restricted KEYBOARD.DCP file which contains all US layouts, but nothing else, enter the following commands:
showdcp -x c:\os2\keyboard.dcp US * * * dummy
showdcp -a mylayout.dcp dummy
And then, replace the DEVINFO=KBD... line in your CONFIG.SYS with:
DEVINFO=KBD,US,D:\TMP\MYLAYOUT.DCP
Example 2
If you want to find all layouts which use the 863 (Canadian French) codepage, enter:
showdcp -v c:\os2\keyboard.dcp * * 863
You'll get something like:
Operating System/2 Keyboard.dcp file viewer
Version 1.05.000 Jan 25 1995
(C) Copyright Martin Lafaix 1994, 1995
All rights reserved.
Example 3
If you want to change the definition of the "A" key in the standard French layout so that the key caps are reversed, enter:
copy c:\os2\keyboard.dcp mykbd.dcp
showdcp -d16,1E05,4161000000 MYKBD.DCP FR 189 * 1
If you want to try the newly defined layout, and assuming your boot drive is "C:", enter:
copy c:\os2\keyboard.dcp keyboard.org
copy mykbd.dcp c:\os2\keyboard.dcp
keyb fr189
Then, experiment with it (with the French layout, the "A" key is on the US "Q" key). And, after that, restore your initial configuration:
keyb us
copy keyboard.org c:\os2\keyboard.dcp
The "-d" parameter revisited
The "-d" parameter is immediately followed by the key scancode. It's a decimal number. It's then followed by a comma. The key type comes next. It's a 16-bit hexadecimal value. Its 9 low bits contain the key type properly speaking, while the 7 high bits contain the allowed accents. The key type is followed by another comma, which is followed by the key definition. It's a hexadecimal string. The first two hexadecimal digits correspond to the char1 field, and so on. In the previous example, we are assigning 0x41 to the char1 field ("A"), 0x61 ("a") to char2, and 0x00 to all remaining fields (char3, char4 and char5). If the key definition string does not define all fields, the value of the non-specified fields is not modified. In the previous example, we could have used "4161" instead of "4161000000".
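To make the syntax concrete, here is a C sketch that parses the text following "-d" according to the description above. It is not the code used by SHOWDCP (which is a REXX script); KeyDefinition and parse_define_arg are hypothetical names, and the 16-byte limit on the definition is an arbitrary choice for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    unsigned scancode;   /* decimal number before the first comma */
    unsigned key_type;   /* 9 low bits of the 16-bit hexadecimal value */
    unsigned accents;    /* 7 high bits of the 16-bit hexadecimal value */
    uint8_t  chars[16];  /* char1, char2, ... as given */
    size_t   nchars;     /* how many charx fields were specified */
} KeyDefinition;

/* arg is the text after "-d", e.g. "16,1E05,4161000000".
   Returns 1 on success, 0 on a syntax error. */
static int parse_define_arg(const char *arg, KeyDefinition *out)
{
    assert(arg != NULL && out != NULL);
    char *end;
    unsigned long scan = strtoul(arg, &end, 10);
    if (end == arg || *end != ',') return 0;
    const char *p = end + 1;
    unsigned long op = strtoul(p, &end, 16);
    if (end == p || *end != ',' || op > 0xFFFF) return 0;
    out->scancode = (unsigned)scan;
    out->key_type = (unsigned)(op & 0x1FF);
    out->accents  = (unsigned)(op >> 9);
    out->nchars   = 0;
    /* Two hexadecimal digits per charx field. */
    for (p = end + 1; *p != '\0'; p += 2) {
        if (!isxdigit((unsigned char)p[0]) || !isxdigit((unsigned char)p[1]) ||
            out->nchars == sizeof out->chars)
            return 0;
        char pair[3] = { p[0], p[1], '\0' };
        out->chars[out->nchars++] = (uint8_t)strtoul(pair, NULL, 16);
    }
    return 1;
}
```

Applied to "16,1E05,4161000000", this yields scancode 16, five charx values starting with 0x41 and 0x61, and splits 0x1E05 into 0x05 for the 9 low bits and 0x0F for the 7 high bits. With "4161" as the definition, only two charx values are reported, so a caller can leave the remaining fields untouched.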
Be really careful when using the "-d" parameter.
Summary
Please tell me what you think!
I hope you find this article useful and informative. If you like what I have done, please let me know; if not, please tell me why. I will use your comments to make upcoming papers better.
Thank you!

