EDM/2 - KEYBOARD.DCP File Format


Introduction
What's That?
Keyboard layouts are stored in KEYBOARD.DCP. These layouts are used when converting a raw scancode (that is, the value sent by the keyboard controller) to its corresponding value (that is, a symbol, like "A", or a virtual key code, like F1 or CapsLock).
Knowing this file format would allow us to create customized layouts, or new layouts for specific keyboards, or whatever. Wouldn't it be nice?
Another advantage would be the possibility for us to write our own keyboard handler; that is, something which converts a scancode to a key value. Naturally, we can already do that, but we have to define the keyboard layout, which is, er, boring? Reinventing the wheel is not always fun!
Why? Well, to please our beloved editor. <grin>
Contents This article describes the OS/2 2.x KEYBOARD.DCP file format. It contains three main parts: the first describes the format itself, the second explains how to translate a raw scancode to an OS/2 key code, and the third describes a keyboard layout manipulation tool.
Credit The first part of this paper is mainly based upon Ned Konz's SWAPDCP tool, available from your favorite ftp site.
KEYBOARD.DCP File Format
How is KEYBOARD.DCP Organized?
The first four bytes of KEYBOARD.DCP contain the index table offset (0-based), ito.
The first two bytes of the index table contain the index entry count, iec. Following this index entry count are iec Index Entries. Each index entry is as follows:
typedef struct {
    WORD  word1;
    BYTE  Country[2];        /* i.e. "US" */
    BYTE  SubCountryID[4];   /* i.e. "153 " */
    WORD  word2;
    WORD  XTableID;          /* i.e. 0x1b5 (437) */
    WORD  KbdType;
    ULONG HeaderLocation;    /* of beginning of table (header) */
} IndexEntry;
Figure 1. The Index Entry structure.

Field Description

word1 unknown

Country The country name abbreviation ("US", "FR", ...). Note: The byte ordering is reversed. That is, the first character of the abbreviation is in Country[1], and the second character is in Country[0].

SubCountryID The country's keyboard layout ID ("153 ", "189 ", "120 ", ...). In some countries (UK, France, Italy, ...) there exist different "main keyboard" layouts. This field reflects this information.

word2 unknown

XTableID The keyboard's layout codepage (437, 850, ...).

KbdType The keyboard type (0 for an 89-key keyboard, 1 for a 101/102-key keyboard).

HeaderLocation The corresponding layout table offset (0-based).

So, to find a specific keyboard layout, we have to (1) read the first four bytes to find the index table and (2) locate the specified index entry (Country, SubCountryID, XTableID and keyboard type). If such an entry exists, its HeaderLocation field contains the Keyboard Layout Table Entry offset.
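The two lookup steps above can be sketched in C as follows. This is only a sketch, under a few assumptions that are mine and not confirmed by the format description: the whole file has been read into memory, the fields are stored back to back without padding (18 bytes per Index Entry), and multi-byte values are little-endian. The names find_layout, rd16 and rd32 are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: WORD and ULONG values are stored low byte first. */
uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumption: packed layout, 2 + 2 + 4 + 2 + 2 + 2 + 4 bytes. */
#define INDEX_ENTRY_SIZE 18

/* Returns the HeaderLocation of the first matching Index Entry, or 0 if
 * no entry matches. country is given in reading order ("US");
 * subcountry is exactly four characters ("153 "). */
uint32_t find_layout(const uint8_t *file, size_t size,
                     const char *country, const char *subcountry,
                     uint16_t codepage, uint16_t kbd_type)
{
    if (size < 4) return 0;
    uint32_t ito = rd32(file);                 /* index table offset */
    if (ito > size || size - ito < 2) return 0;
    uint16_t iec = rd16(file + ito);           /* index entry count */
    const uint8_t *e = file + ito + 2;

    for (uint16_t i = 0; i < iec; i++, e += INDEX_ENTRY_SIZE) {
        if ((size_t)(e - file) + INDEX_ENTRY_SIZE > size) return 0;
        /* Country bytes are stored reversed: Country[0] is the 2nd letter. */
        if (e[2] == (uint8_t)country[1] && e[3] == (uint8_t)country[0] &&
            memcmp(e + 4, subcountry, 4) == 0 &&
            rd16(e + 10) == codepage &&        /* XTableID */
            rd16(e + 12) == kbd_type)          /* KbdType */
            return rd32(e + 14);               /* HeaderLocation */
    }
    return 0;
}
```

Returning 0 for "not found" is safe here because offset 0 holds the index table offset itself, so no layout table can start there.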
The Keyboard Layout Table Entry
Each keyboard layout table entry contains a header, followed by key and accent definitions. The header is as follows:
/* code page header */
typedef struct XHeader {
    WORD XTableID;                   /* code page number */
    /* note: 32-bit wide field */
    struct {
        /* which shift key or key combo affects Char3 of each KeyDef */
        BITFIELD ShiftAlt     :1;    /* use shift-alt instead of ctrl-alt */
        BITFIELD AltGrafL     :1;    /* use left alt key as alt-graphics */
        BITFIELD AltGrafR     :1;    /* use right alt key as alt-graphics */
        /* other modifiers */
        BITFIELD ShiftLock    :1;    /* treat caps lock as shift lock */
        BITFIELD DefaultTable :1;    /* default table for the language */
        BITFIELD ShiftToggle  :1;    /* TRUE: toggle, FALSE: latch shiftlock */
        BITFIELD AccentPass   :1;    /* TRUE: pass on accent keys and beep,
                                        FALSE: just beep */
        BITFIELD CapsShift    :1;    /* caps-shift uses Char5 */
        BITFIELD MachDep      :1;    /* machine-dependent table */
        /* Bidirectional modifiers */
        BITFIELD RTL          :1;    /* Right-To-Left orientation */
        BITFIELD LangSel      :1;    /* TRUE: National language layout
                                        FALSE: English language layout */
        /* default layout indicator */
        BITFIELD DefaultLayout:1;    /* default layout for the country */
    } XTableFlags1;
    WORD KbdType;                    /* keyboard type */
    WORD KbdSubType;                 /* keyboard sub-type */
    WORD XtableLen;                  /* length of table */
    WORD EntryCount;                 /* number of KeyDef entries */
    WORD EntryWidth;                 /* width in bytes of KeyDef entries */
    BYTE Country[2];                 /* country ID, i.e. "US" */
    WORD TableTypeID;                /* Table type, 0001=OS/2 */
    BYTE SubCountryID[4];            /* sub-country ID, ASCII, i.e. "153 " */
    WORD Reserved[8];
} XHeader;
Figure 2. The Table header structure.

Field Description

XTableID The keyboard layout codepage.

XTableFlags1 The layout's flags (see the Layout Flags subsection below).

KbdType The keyboard type (0 = 89 keys, 1 = 101/102 keys).

KbdSubType The keyboard subtype (??? 0).

XtableLen The table length. The length (in bytes) includes this header.

EntryCount The number of KeyDef entries.

EntryWidth The width in bytes of KeyDef entries.

Country The country ID (bytes reversed, that is, you get "SU" for US).

TableTypeID The table type (1 for OS/2).

SubCountryID The subcountry ID ("153 ", "189 ", "120 ", ...).

Reserved Unknown.

This table header is followed by EntryCount KeyDef entries. Each KeyDef entry is as follows:
/* Key definition, one per scan code in table */ typedef struct { WORD XlateOp; BYTE Char1; BYTE Char2; BYTE Char3; BYTE Char4; BYTE Char5; } KeyDef;
Figure 3. The KeyDef structure.

Field Description

XlateOP The 9 low bits specify the key type (see the Key Type subsection below).
The 7 high bits specify which accent keys are allowed.
Note: if there are more than seven accent keys, and an accent key with an ID greater than 7 is allowed, the seventh of the 7 high bits is set, and we have to check the corresponding Accent Table Entry to find out whether the combination is valid.

char1 The "standard" value

char2 The "shifted" value

char3 The "Alted" value

char4

char5

The specific meaning of the charx fields depends on the XlateOP value, as explained in the Key Type subsection. The default value of the EntryWidth field (in the header) is 7. If this value is bigger, there are additional charx fields in the KeyDef structure. (Namely, you have EntryWidth - sizeof(XlateOP) charx fields, with sizeof(XlateOP) being 2.)
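As a small illustration of the bit split and of the EntryWidth arithmetic, here is a hedged C sketch. The function names are mine, not part of any OS/2 header, and the code only shows how to separate the two parts of the 16-bit value; it says nothing about which high bit belongs to which accent.

```c
#include <stdint.h>

#define KEYTYPE_MASK 0x01FFu   /* the 9 low bits of XlateOp */
#define ACCENT_SHIFT 9         /* the 7 high bits start at bit 9 */

/* Key type proper (0x01 = AlphaKey, 0x02 = SpecKey, ...). */
unsigned key_type(uint16_t xlate_op)
{
    return xlate_op & KEYTYPE_MASK;
}

/* The 7 "allowed accents" bits, moved down to bits 0..6. */
unsigned accent_bits(uint16_t xlate_op)
{
    return (unsigned)xlate_op >> ACCENT_SHIFT;
}

/* Number of charx fields in each KeyDef: EntryWidth minus the 2-byte
 * XlateOp. With the default EntryWidth of 7 this gives 5. */
unsigned char_field_count(uint16_t entry_width)
{
    return entry_width - 2u;
}
```

For instance, the type word 0x1E05 splits into key type 0x05 and accent bits 0x0F.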
These EntryCount KeyDef entries are then followed by the Accent Table, which contains seven Accent Table Entries (one per possible accent; if there are more than seven accents, the seventh entry contains the additional entries).
Each Accent Table Entry is as follows:
/* Accent Table Entry, one per accent, up to seven accent */
typedef struct {
    CHAR charOrg;        /* The key's ASCII value, i.e. "a" */
    CHAR charRes;        /* The resulting ASCII value, i.e. "à" */
} TRANS;

typedef struct {
    BYTE  AccentGlyph;   /* What to show while waiting for a key */
    BYTE  byte1;
    BYTE  byte2;
    BYTE  byte3;
    BYTE  byte4;
    BYTE  byte5;
    TRANS aTrans[20];    /* The allowed substitutions */
} AccentTableEntry;
Figure 4. The AccentTableEntry structure.
The seventh entry has a slightly different format. If there are more than six accents, its first byte contains the length of the seventh Accent Table Entry. This entry is then followed by a byte whose content is the length of the eighth entry, and so on:

Figure 5. The seventh Accent Table Entry structure.
There's no "end of entry" indicator. Use the XTableLen field to check it:
AccentTableEntryLen = XTableLen - sizeof(XHeader) - EntryCount * EntryWidth.

The first six entries take 6*sizeof(AccentTableEntry) = 276 bytes. The remaining Accent Table entries fit in AccentTableEntryLen-276 bytes. When the sum of the size of the additional entries (that is, l7 + l8 + ...) reaches this value, it's done.
If there are fewer than seven accents, the first byte of the seventh entry is 0x00.
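The length arithmetic above is easy to get wrong by a few bytes, so here is a minimal C sketch of it. It assumes the packed sizes used in this article (40 bytes for XHeader, 46 bytes for AccentTableEntry); the function names and the sample numbers in the comment are made up for illustration.

```c
#include <stdint.h>

#define XHEADER_SIZE       40  /* packed size of XHeader */
#define ACCENT_ENTRY_SIZE  46  /* 6 leading bytes + 20 TRANS pairs of 2 bytes */
#define FIXED_ACCENT_BYTES (6 * ACCENT_ENTRY_SIZE)   /* first six entries: 276 */

/* Bytes occupied by the whole Accent Table, or -1 if the header values
 * cannot be right (KeyDefs alone would overflow the table). */
long accent_table_len(uint16_t xtable_len, uint16_t entry_count,
                      uint16_t entry_width)
{
    long len = (long)xtable_len - XHEADER_SIZE
             - (long)entry_count * (long)entry_width;
    return len < 0 ? -1 : len;
}

/* Bytes available for the seventh and following entries, or -1 if the
 * table is too short to hold the six fixed-size entries.
 * Example with made-up numbers: XtableLen 1206, 127 KeyDefs of width 7
 * leave 277 bytes of accent table, so 1 byte for the seventh entry. */
long extra_accent_bytes(uint16_t xtable_len, uint16_t entry_count,
                        uint16_t entry_width)
{
    long len = accent_table_len(xtable_len, entry_count, entry_width);
    return len < FIXED_ACCENT_BYTES ? -1 : len - FIXED_ACCENT_BYTES;
}
```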
So, the accent-key+ASCII-key translation process is quite easy: when an accent key is pressed, just remember the accent code, and optionally display the corresponding glyph (AccentGlyph). Then, wait for another key to be pressed. If this key accepts the remembered accent (that is, the corresponding bit in the 7 high bits of XlateOP is set), locate the corresponding charRes in the aTrans array of the Accent Table Entry (yes, you'll have to browse this array until you find the right charOrg!). If the pressed key does not accept the remembered accent (or if you can't find the corresponding charOrg in aTrans), just beep. You're done!
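The aTrans browsing step could look like the C sketch below. The structure mirrors Figure 4 with fixed-width types; the function name apply_accent is hypothetical, and I am assuming that unused aTrans slots are zero-filled, which is why a zero key value never matches.

```c
#include <stdint.h>

typedef struct {
    uint8_t charOrg;       /* the key's value, e.g. 'a' */
    uint8_t charRes;       /* the accented result */
} TRANS;

typedef struct {
    uint8_t AccentGlyph;   /* what to show while waiting for a key */
    uint8_t unknown[5];    /* byte1..byte5, meaning unknown */
    TRANS   aTrans[20];    /* the allowed substitutions */
} AccentTableEntry;

/* Returns the accented character for key_char, or 0 when the accent
 * table has no substitution for it (the caller should then beep). */
uint8_t apply_accent(const AccentTableEntry *entry, uint8_t key_char)
{
    if (key_char == 0)
        return 0;          /* assumption: 0 marks an unused slot */
    for (int i = 0; i < 20; i++) {
        if (entry->aTrans[i].charOrg == key_char)
            return entry->aTrans[i].charRes;
    }
    return 0;
}
```

The check on the 7 high bits of XlateOP has to be done before calling such a function; it is left out here.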
Layout Flags
Various flags describe the layout's behavior.

Flag Description

ShiftAlt When this flag is 1, it allows you to use "Shift+Alt" instead of "Ctrl+Alt" when accessing the third glyph of a key. (With 89-key keyboards.)

AltGrafL When this flag is 1, the left "Alt" key is used for "AltGr". (With 101/102-key keyboards.)

AltGrafR When this flag is 1, the right "Alt" key is used for "AltGr". (With 101/102-key keyboards.)

ShiftLock When this flag is 1, "CapsLock" acts as a "ShiftLock" key. That is, when CapsLock is ON, pressing a "Shift" key unsets it. When this flag is 0, pressing a "Shift" key temporarily toggles the CapsLock state, but it is restored when the "Shift" key is released.

DefaultTable When this flag is 1, the layout uses the default country codepage.

ShiftToggle With 89-keys keyboards, set this flag in conjunction with ShiftLock. That is, when ShiftLock is 1, set ShiftToggle to 1, and when ShiftLock is 0, set ShiftToggle to 0. On 101/102-keys keyboards, set this flag to 0.

AccentPass When this flag is set to 1, accent keys (a.k.a. dead keys) are allowed.

CapsShift Unknown. It's 1 for all Swiss keyboards, 0 otherwise.

MachDep Set this flag to 1 when there's more than one physical layout sharing the same country code. See the DefaultLayout flag below.

RTL When this flag is 1, the layout uses the Bidirectional Languages support. (RTL stands for Right-to-Left.)

LangSel When this flag is 1, the layout is a National one (Arabic or Hebrew, usually). When this flag is 0, the layout is an English one. Note: When RTL is 0, this flag is not used - set it to 0.

DefaultLayout When MachDep is 1, set this flag to 1 to denote the fact that this layout is the default one. Otherwise, set it to 0.
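For those who prefer masks to bit-fields, the flags can be sketched in C as below. Beware: the bit positions are an assumption of mine. I assume the compiler allocates bit-fields starting from the least significant bit, so that ShiftAlt is bit 0 and DefaultLayout is bit 11; check this against a real table before relying on it. The XF_ names and both functions are hypothetical.

```c
#include <stdint.h>

/* Assumption: bit-fields are allocated LSB first, in declaration order. */
enum {
    XF_SHIFT_ALT      = 1 << 0,
    XF_ALTGRAF_L      = 1 << 1,
    XF_ALTGRAF_R      = 1 << 2,
    XF_SHIFT_LOCK     = 1 << 3,
    XF_DEFAULT_TABLE  = 1 << 4,
    XF_SHIFT_TOGGLE   = 1 << 5,
    XF_ACCENT_PASS    = 1 << 6,
    XF_CAPS_SHIFT     = 1 << 7,
    XF_MACH_DEP       = 1 << 8,
    XF_RTL            = 1 << 9,
    XF_LANG_SEL       = 1 << 10,
    XF_DEFAULT_LAYOUT = 1 << 11
};

/* True when the layout has an AltGr key on either side. */
int uses_altgr(uint32_t flags)
{
    return (flags & (XF_ALTGRAF_L | XF_ALTGRAF_R)) != 0;
}

/* Checks the setting rules given in the table above.
 * kbd_type is 0 for 89 keys and 1 for 101/102 keys. */
int flags_follow_guidelines(uint32_t flags, unsigned kbd_type)
{
    int shift_lock   = (flags & XF_SHIFT_LOCK) != 0;
    int shift_toggle = (flags & XF_SHIFT_TOGGLE) != 0;

    if (kbd_type == 0 && shift_toggle != shift_lock) return 0;
    if (kbd_type == 1 && shift_toggle) return 0;
    /* LangSel is only meaningful when RTL is set. */
    if (!(flags & XF_RTL) && (flags & XF_LANG_SEL)) return 0;
    return 1;
}
```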

Key Type
The "Key type" value specifies the meaning of the charx fields in the KeyDef entries.
Note: In the following table, "xxx'ed" means holding down xxx while pressing the key. And, if an entry contains "???", well, its meaning is not completely known... Otherwise, and if char3 is not null (in which case nothing is produced), it's a "normal" symbol.

Value Signification

0 An empty entry. That is, no key produces this scan code.

0x01 AlphaKey. This is an alphabetical key. The char1 field contains the unshifted key value. The char2 field contains the shifted key value. If the "accent" bits are not null, they specify the allowed accents.
Each AlphaKey can generate a "Ctrl'ed" value, when used in conjunction with a "Ctrl" key. In this case, the generated value is char1-96.

0x02 SpecKey. This key generates an unshifted value (char1) and a shifted value (char2). It does not generate an "Alted", "AltGr'ed" or "Ctrl'ed" value.

0x03 SpecKeyC. This key can generate a value when "AltGr'ed". char3 contains this (optional) value. If char3 is non-null, it's the value. If char3 is less than 32, the value is an accent. Otherwise, it's a "normal" symbol.
char1 and char2 contain the unshifted and shifted key value. When CapsLock is ON, the order is reversed. That is, the unshifted value is char2 while the shifted value is char1.

0x04 SpecKeyA. This key can generate a value (char3) when "AltGr'ed". char1 and char2 contain the unshifted and the shifted key value, respectively. It does not depend on the CapsLock value (that is, char1 is always the unshifted value while char2 is always the shifted value, whether CapsLock is ON or not).
If char3 is less than 32, the value is a "control" code. It's not an accent (compare with the previous key type, SpecKeyC).

0x05 SpecKeyCA ???

0x06 FuncKey. The char1 field contains the function key number. All other fields contain 0.

0x07 PadKey. This is a "NumPad" key. The char1 field contains the padkey indices (0 = "7", 1="8", 2="9", 3="-", 4="4", 5="5", 6="6", 7="+", 8="1", 9="2", 10="3", 11="0" and 12 = ".").
Note: This follows the "old" keyboard (89 keys) layout.
The char2 field contains the ASCII character. All other fields contain 0.

0x08 SpecCtlKey. This key generates a "control" code (that is, an ASCII code in range 0..31). char1 contains the unshifted control code, while char2 contains the shifted control code. All other fields contain 0.

0x09 The PrtSc key.

0x0a The SysReq key.

0x0b AccentKey. This key generates an "accent" code. char1 is the unshifted accent (in range 1..7) and char2 is the shifted accent (also in range 1..7). char5 has the value of char1. If char3 is not null, it's the value generated when "Alted". char4 is 0.

0x0c ShiftKey. A shift or control key. If char1 is 0x1 it's the right "Shift" key. It's the left one if char1 is 0x2. If char1 is 0x4, it's a "Ctrl" key. In this case, char2 is 0x1 and char3 is 0x4, and char4 & char5 are 0. Otherwise, char2..char5 are 0.

0x0d ToggleKey ???

0x0e The Alt key.

0x0f The NumLock key.

0x10 The CapsLock key.

0x11 The ScrollLock key.

0x12 XShiftKey ???

0x13 XToggleKey ???

0x14

0x15 SpecKeyAS ??? (my guess: it's like SpecKeyCS, except that char3 is a "control" code, that is, an ASCII value in range 1..31).

0x1a ExtExtKey. The new "cursor" keys. That is, the keys which are missing on an 89-key keyboard.

<other> unknown

ScanCode to Key Value Conversion

In this section, we'll describe the keyboard scancode to ASCII char conversion scheme. Doing such a conversion is required when you want to write your own keyboard handler, or when, for any other reason, you have to deal directly with scancodes.
Portions of the code are given in REXX. Please refer to SHOWDCP.CMD for missing functions.
And, it's just a scheme, it's not a complete and fully-functional scancode to key value converter. <grin>
Determining the Required Keyboard Layout
The first thing to do is to load the correct keyboard layout. We first have to find the current Country/CodePage value by using the DosQueryCp/DosQueryCtryInfo functions (Refer to Control Program Guide and Reference for more information on those two functions).
We then have to find the current keyboard type - that is, an old (89 keys) one or a "new" one (101/102 keys). If you know how to do this, please, let me know!
The last step is to find the user's desired keyboard layout. The easiest way to do this is probably to scan the CONFIG.SYS, or to provide a command parameter. (We need both country abbrev and subcountry code.)
We could then load the corresponding keyboard layout.
/* Loading the keyboard layout
**
** rcp is current codepage
** rcn is current country abbrev (US, FR, ...)
** rss is current subcountry (153, 189, 120)
** rty is current keyboard type
*/
ito = readl()
call charin infile,ito
iec = readw()
do iec
   call getindex
   if (country = rcn) & (rss = subcntr) & (rcp = cp) & (rty = type) then
      leave
end /* do */
if (country \= rcn) | (rss \= subcntr) | (rcp \= cp) | (rty \= type) then
   do
      say "Keyboard layout not found!"
      exit
   end
call getentry offset

Having read the layout header, we then have to read the corresponding KeyDefs:
do i = 1 to EntryCount
   call getkeydef
   /* here, we have to store it somewhere... */
   ...
end

And then comes the last initialization step:
Determining the Accent Key Conversion Table
free = tablelen - 40 - entrycount * entrywidth
j = 1
empty = 1
do while free > 0
   call getaccententry j
   /* We here have to store it somewhere ... */
   ...
   j = j + 1
   free = free - len
end /* do */

We are now ready...
Converting a Scancode to a Key Value
The first thing to do is to maintain some Boolean values containing various special keys status (CapsLock, NumLock, ScrollLock and Alt/Shift/Ctrl). We have to remember the last accent key pressed, too.
We then have to handle each key type.
/* scan is the current key scancode */
type = key.scan.keytype
select
   /* One of the many "toggle" key */
   when type = 'CAPSLOCK' then
      CapsLock = \CapsLock
   ...
   when type = 'ALPHAKEY' then
      do
         if \PendingAccent then
            select
               when CtrlPressed then code = key.scan.char1 - 96
               when AltPressed then code = '00'x||scan
               when ShiftPressed & \CapsLock then code = key.scan.char2
               when ShiftPressed then code = key.scan.char1
               when CapsLock then code = key.scan.char2
            otherwise
               code = key.scan.char1
            end /* select */
         else
            select
               when \allowedAccent() | CtrlPressed | AltPressed then call BEEP
               when ShiftPressed & \CapsLock then code = addAccent(key.scan.char2)
               when ShiftPressed then code = addAccent(key.scan.char1)
               when CapsLock then code = addAccent(key.scan.char2)
            otherwise
               code = addAccent(key.scan.char1)
            end
      end
   ...
otherwise
   /* Not a known type! */
   say 'Type 'type' unknown!'
end
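For readers more at ease with C than with REXX, the AlphaKey branch without a pending accent boils down to the sketch below. It is a simplified restatement, not a replacement: the Alt case (which yields an extended code built from the scancode) and the accent case are left out, and the function name alpha_key_value is hypothetical.

```c
#include <stdint.h>

/* Value produced by an AlphaKey when no accent is pending.
 * char1 is the unshifted value, char2 the shifted one.
 * shift, caps_lock and ctrl are booleans (0 or non-zero). */
uint8_t alpha_key_value(uint8_t char1, uint8_t char2,
                        int shift, int caps_lock, int ctrl)
{
    if (ctrl)
        return (uint8_t)(char1 - 96);   /* 'a' (97) gives 1, i.e. Ctrl-A */

    /* Shift and CapsLock cancel each other: the shifted value is used
     * when exactly one of them is active. */
    return (!shift != !caps_lock) ? char2 : char1;
}
```

The four Shift/CapsLock `when` clauses of the REXX version collapse into a single exclusive-or test here; the results are the same.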

The complete implementation is, err, left as an exercise. <grin>
Using Keyboard Layouts

A very important note: be sure to have a safe copy of your original KEYBOARD.DCP before any experimentation! You've been warned. <grin>
SHOWDCP.ZIP includes a REXX script which allows you to explore and modify keyboard layouts. Its usage is as follows:
Usage: showdcp [param] file [country [subcountry [cp [type]]]] [file2]

   -h      - Access Help ;
   -v[n]   - View matching layouts. n is the detail level ;
   -x      - Extract matching layouts ;
   -a      - Add layout to file ;
   -ds,t,c - Define a key ;
   -sk1,k2 - Swap key k1 and key k2.

   country    = country code (US, FR, ...) or *
   subcountry = subcountry code (189, 120, ...) or *
   cp         = code page (437, 850, ...) or *
   type       = keyboard type (0 = 89 keys, 1 = 101/102 keys) or *
Figure 6. The showdcp command usage

Param Description

-h Shows usage information (see Figure 6);

-v[n] Displays keyboard layouts which match the given specification. n is the detail level, in range 0-4 (1 is the default):
0 displays matching entry count,
1 adds Index Entry information,
2 adds Table Entry information,
3 adds KeyDefs definitions,
4 adds AccentTable definitions;

-x Extracts the matching layouts in file2;

-a Adds the layouts found in file2 to file;

-ds,t,d Defines the key associated with scancode s. t is the key type and d is the definition, a hexadecimal string (see Example 3 below);

-sk1,k2 Swaps two key definitions. k1 and k2 are the scancodes to swap.

Example 1
If you want to create a restricted KEYBOARD.DCP file which contains all US layouts, but nothing else, enter the following commands:
showdcp -x c:\os2\keyboard.dcp US * * * dummy
showdcp -a mylayout.dcp dummy

And then, replace the DEVINFO=KBD... line in your CONFIG.SYS with:
DEVINFO=KBD,US,D:\TMP\MYLAYOUT.DCP

Example 2
If you want to find all layouts which use the 863 (Canadian French) codepage, enter:
showdcp -v c:\os2\keyboard.dcp * * 863

You'll get something like:
Operating System/2 Keyboard.dcp file viewer
Version 1.05.000 Jan 25 1995
(C) Copyright Martin Lafaix 1994, 1995
All rights reserved.

Example 3
If you want to change the definition of the "A" key in the standard French layout so that the key caps are reversed, enter:
copy c:\os2\keyboard.dcp mykbd.dcp
showdcp -d16,1E05,4161000000 MYKBD.DCP FR 189 * 1

If you want to try the newly defined layout, and assuming your boot drive is "C:", enter:
copy c:\os2\keyboard.dcp keyboard.org
copy mykbd.dcp c:\os2\keyboard.dcp
keyb fr189

Then, experiment with it (with the French layout, the "A" key is on the US "Q" key). And, after that, restore your initial configuration:
keyb us
copy keyboard.org c:\os2\keyboard.dcp

The "-d" parameter revisited
The "-d" parameter is immediately followed by the key scancode, a decimal number, and then a comma. The key type comes next. It's a 16-bit hexadecimal value: its 9 low bits contain the key type proper, while the 7 high bits contain the allowed accents. The key type is followed by another comma, which is followed by the key definition, a hexadecimal string. The first two hexadecimal digits correspond to the char1 field, the next two to char2, and so on. In the previous example, we are assigning 0x41 ("A") to the char1 field, 0x61 ("a") to char2, and 0x00 to all remaining fields (char3, char4 and char5). If the key definition string does not define all fields, the values of the non-specified fields are not modified. In the previous example, we could have used "4161" instead of "4161000000".
Be really careful when using the "-d" parameter.
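To make the argument syntax concrete, here is a hedged C sketch of a parser for the text that follows "-d". It is not how SHOWDCP.CMD does it (that script is written in REXX); the KeyDefinition structure and the parse_define function are hypothetical, and the sketch only handles the default five charx fields.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned scancode;   /* decimal scancode */
    unsigned xlate_op;   /* 16-bit key type word, given in hexadecimal */
    uint8_t  chars[5];   /* char1..char5, only the first nchars are set */
    int      nchars;     /* number of charx fields the string defines */
} KeyDefinition;

/* Parses "s,t,d" (for example "16,1E05,4161000000").
 * Returns 0 on success and -1 on a malformed argument. */
int parse_define(const char *arg, KeyDefinition *out)
{
    char hex[11];        /* at most 5 fields of 2 hex digits, plus NUL */

    if (sscanf(arg, "%u,%x,%10[0-9A-Fa-f]",
               &out->scancode, &out->xlate_op, hex) != 3)
        return -1;

    size_t n = strlen(hex);
    if (n % 2 != 0 || out->xlate_op > 0xFFFF)
        return -1;

    out->nchars = (int)(n / 2);
    for (int i = 0; i < out->nchars; i++) {
        unsigned v;
        if (sscanf(hex + 2 * i, "%2x", &v) != 1)
            return -1;
        out->chars[i] = (uint8_t)v;   /* first pair is char1, and so on */
    }
    return 0;
}
```

Keeping nchars separate from the values is what allows the "non-specified fields are not modified" behaviour: the caller overwrites only the first nchars fields of the existing KeyDef.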
Summary

Please tell me what you think!
I hope you find this article useful and informative. If you like what I have done, please let me know; if not, please tell me why. I will use your comments to make upcoming papers better.
Thank you!