ULSConvertCodepage

Latest revision as of 14:23, 14 August 2017

Converts a string from one codepage to another, including the Unicode UCS-2 encoding. (To convert to UCS-2, simply specify a target codepage of 1200; to convert from UCS-2, use a source codepage of 1200.)

A partial list of OS/2 codepages is at the bottom of this document.

Arguments

ULSConvertCodepage( string [, sourcecp][, targetcp][, subchar][, controls][, path] )

Parameters

string

The string to be converted (required).

sourcecp

The source codepage (a positive integer). This is the codepage with which <string> is encoded (i.e. under which it would display correctly). The default is the current process codepage.

targetcp

The target codepage (a positive integer). This is the codepage under which the returned string is to be encoded. The default is the current process codepage.

subchar

The substitution character for the target codepage. This is a two-digit hexadecimal value between 00 and FF that identifies the character in the target codepage used in place of characters the target codepage does not support. The default value depends on the codepage; for most single-byte codepages it is 0x7F.

NOTE: Not all codepages appear to honour this setting!

controls

The control-byte mapping flag. This specifies how to convert those byte values which can represent either control codes or glyphs depending on the context: specifically, 0x00-0x1F and 0x7F. Only the first character is significant, and (if specified) must be one of the following values:

D data/control bytes: leave values unchanged; this is the default

G displayable glyphs: convert according to codepage like any other character

C control bytes: convert using standard IBM control mapping

L treat linebreaks (CR and LF) as control bytes, but all others as displayable glyphs
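As an illustration of the controls flag, the following is a minimal sketch only. It assumes the RxULS functions are already registered and that omitted optional arguments may be skipped with empty commas, as is usual for REXX function calls. The sample text and the choice of codepages are illustrative, not taken from the library's documentation.

```rexx
/* Sketch: keep CR and LF as control bytes ('L') while all other      */
/* ambiguous byte values are converted as displayable glyphs.         */
/* Assumes the RxULS functions have already been registered.          */
crlf = '0D0A'x
text = 'First line' || crlf || 'Second line'

/* subchar is left empty so that its default applies */
result = ULSConvertCodepage( text, 850, 1200, , 'L' )
IF ULSERR \= '0' THEN
    SAY 'Conversion failed, ULSERR =' ULSERR
ELSE
    SAY 'Converted bytes (hex):' C2X( result )
```

C2X is used for the output because a UCS-2 string contains byte values that do not display usefully on a text console.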

path

The path conversion flag. This only applies to DBCS codepages, and indicates whether or not <string> should be assumed to contain a path specification. Only the first character is significant, and (if specified) must be one of the following values:

Y yes, assume string contains a path; this is the default

N no, assume string doesn't contain a path
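The path flag might be used as in the sketch below. Codepage 932 is an assumption here, chosen only as an example of a DBCS codepage that may or may not be installed on a given system. The sample text is likewise hypothetical.

```rexx
/* Sketch: convert ordinary text (not a file name) to a DBCS codepage, */
/* telling the function not to treat the string as a path ('N').       */
/* Codepage 932 is an assumed example of an installed DBCS codepage.   */
/* Assumes the RxULS functions have already been registered.           */
text = 'Plain text, not a path'

/* subchar and controls are left empty so that their defaults apply */
result = ULSConvertCodepage( text, 850, 932, , , 'N' )
IF ULSERR \= '0' THEN
    SAY 'Conversion failed, ULSERR =' ULSERR
ELSE
    SAY 'Converted text:' result
```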

Returns

The converted string. If an error occurs during conversion, an empty string ("") is returned and the global ULSERR variable will be set to a non-zero value.

Example

Code

/* Input string (encoded for codepage 850) */
string = 'We had lunch at a café in Reykjavík.'
SAY '[Codepage 850]:' string

/* Convert it to codepage 862, using '?' for unsupported characters */
string2 = ULSConvertCodepage( string, 850, 862, '3f' )
IF ULSERR \= '0' THEN
    SAY ULSERR
ELSE
    SAY '[Codepage 862]:' string2

/* Convert it to codepage 1200 (UCS-2) */
string3 = ULSConvertCodepage( string, 850, 1200 )
IF ULSERR \= '0' THEN
    SAY ULSERR
ELSE
    SAY '[UCS-2]:       ' string3

Output

[Codepage 850]: We had lunch at a café in Reykjavík.
[Codepage 862]: We had lunch at a caf? in Reykjav?k.
[UCS-2]:         W e   h a d   l u n c h   a t   a   c a f é   i n   R e y k j a v í k .
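
Because codepage 1200 can be given as either the source or the target, a string can be converted to UCS-2 and back again. The following is a minimal sketch of such a round trip, assuming the RxULS functions are already registered. The variable names are illustrative.

```rexx
/* Sketch: convert to UCS-2 (codepage 1200) and back to codepage 850, */
/* then check that the original string was recovered.                 */
original = 'We had lunch at a cafe.'

ucs = ULSConvertCodepage( original, 850, 1200 )
IF ULSERR \= '0' THEN DO
    SAY 'Conversion to UCS-2 failed, ULSERR =' ULSERR
    EXIT 1
END

back = ULSConvertCodepage( ucs, 1200, 850 )
IF ULSERR \= '0' THEN DO
    SAY 'Conversion from UCS-2 failed, ULSERR =' ULSERR
    EXIT 1
END

/* Strict comparison (==) so that blanks are not ignored */
IF back == original THEN
    SAY 'Round trip preserved the string.'
ELSE
    SAY 'Round trip changed the string.'
```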
