ULSConvertCodepage

From EDM2
Latest revision as of 14:23, 14 August 2017

Converts a string from one codepage to another, including the Unicode UCS-2 encoding. (To convert to UCS-2, simply specify a target codepage of 1200; to convert from UCS-2, use a source codepage of 1200.)

A partial list of OS/2 codepages is at the bottom of this document.
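The use of codepage 1200 in both directions can be sketched as a simple round trip. This is a minimal illustration and not taken from the original documentation; it assumes the RXULS library is installed and that its functions are registered through ULSLoadFuncs. The variable names are arbitrary.

```rexx
/* Assumption: RXULS is available and ULSLoadFuncs registers its functions */
CALL RxFuncAdd 'ULSLoadFuncs', 'RXULS', 'ULSLoadFuncs'
CALL ULSLoadFuncs

original = 'Hello'

/* Codepage 850 -> UCS-2: target codepage 1200 */
ucs = ULSConvertCodepage( original, 850, 1200 )
SAY 'UCS-2 bytes (hex):' C2X( ucs )

/* UCS-2 -> codepage 850: source codepage 1200 */
back = ULSConvertCodepage( ucs, 1200, 850 )

IF back == original THEN
    SAY 'Round trip OK'
ELSE
    SAY 'Round trip failed, ULSERR =' ULSERR
```

C2X is used for display because UCS-2 text generally does not print legibly under a single-byte codepage.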

Arguments

ULSConvertCodepage( string [, sourcecp][, targetcp][, subchar][, controls][, path] )

Parameters

string
The string to be converted (required).
sourcecp
The source codepage (a positive integer). This is the codepage with which <string> is encoded (i.e. under which it would display correctly). The default is the current process codepage.
targetcp
The target codepage (a positive integer). This is the codepage under which the returned string is to be encoded. The default is the current process codepage.
subchar
The substitution character for the target codepage. This is a two-digit hexadecimal value between 00 and FF identifying the character in the target codepage that will replace characters the target codepage cannot represent. The default value depends on the codepage; for most single-byte codepages it is 0x7F.
NOTE: Not all codepages appear to honour this setting!
controls
The control-byte mapping flag. This specifies how to convert those byte values which can represent either control codes or glyphs depending on the context: specifically, 0x00-0x19 and 0x7F. Only the first character is significant, and (if specified) must be one of the following values:
    D  data/control bytes: leave values unchanged; this is the default
    G  displayable glyphs: convert according to codepage like any other character
    C  control bytes: convert using standard IBM control mapping
    L  treat linebreaks (CR and LF) as control bytes, but all others as displayable glyphs
path
The path conversion flag. This only applies to DBCS codepages, and indicates whether or not <string> should be assumed to contain a path specification. Only the first character is significant, and (if specified) must be one of the following values:
    Y  yes, assume string contains a path; this is the default
    N  no, assume string doesn't contain a path
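The optional parameters above are positional, so an earlier one can presumably be left at its default by leaving its slot empty between commas, as REXX allows for any function call. The following sketch assumes ULSConvertCodepage accepts omitted arguments this way and that RXULS has already been loaded. The sample strings are made up for illustration.

```rexx
/* A line of text ending in CR+LF */
line = 'Total: 42' || '0D0A'x

/* sourcecp and subchar omitted (defaults apply); controls = 'L' */
/* so CR and LF are treated as control bytes                     */
ucs = ULSConvertCodepage( line, , 1200, , 'L' )
IF ULSERR \= '0' THEN
    SAY 'Conversion failed:' ULSERR
ELSE
    SAY C2X( ucs )

/* DBCS source codepage 932 (Japanese), path flag = 'N':         */
/* the string is ordinary text, not a path specification         */
text = 'plain text, not a path'
ucs2 = ULSConvertCodepage( text, 932, 1200, , , 'N' )
IF ULSERR \= '0' THEN
    SAY 'Conversion failed:' ULSERR
ELSE
    SAY C2X( ucs2 )
```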

Returns

The converted string. If an error occurs during conversion, an empty string ("") is returned and the global ULSERR variable will be set to a non-zero value.
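Since ULSERR is the documented error signal, a caller may prefer to test it and not the returned string. The sketch below wraps that check in a helper; SafeConvert and its fallback argument are hypothetical names introduced here and are not part of RXULS. It assumes ULSLoadFuncs is the library's registration function.

```rexx
/* Assumption: RXULS is available and ULSLoadFuncs registers its functions */
CALL RxFuncAdd 'ULSLoadFuncs', 'RXULS', 'ULSLoadFuncs'
CALL ULSLoadFuncs

SAY SafeConvert( 'sample text', 850, 862, '<conversion failed>' )
EXIT

/* Hypothetical helper: returns the converted text, or 'fallback' on error */
SafeConvert: PROCEDURE
    PARSE ARG text, fromcp, tocp, fallback
    converted = ULSConvertCodepage( text, fromcp, tocp )
    /* Test ULSERR, the documented error indicator, not the result itself */
    IF ULSERR \= '0' THEN
        RETURN fallback
    RETURN converted
```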

Example

Code

/* Input string (encoded for codepage 850) */
string = 'We had lunch at a café in Reykjavík.'
SAY '[Codepage 850]:' string

/* Convert it to codepage 862, using '?' for unsupported characters */
string2 = ULSConvertCodepage( string, 850, 862, '3f' )
IF ULSERR \= '0' THEN
    SAY ULSERR
ELSE
    SAY '[Codepage 862]:' string2

/* Convert it to codepage 1200 (UCS-2) */
string3 = ULSConvertCodepage( string, 850, 1200 )
IF ULSERR \= '0' THEN
    SAY ULSERR
ELSE
    SAY '[UCS-2]:       ' string3

Output

[Codepage 850]: We had lunch at a café in Reykjavík.
[Codepage 862]: We had lunch at a caf? in Reykjav?k.
[UCS-2]:         W e   h a d   l u n c h   a t   a   c a f é   i n   R e y k j a v í k .