Internationalize ABL Applications

Default word-break behavior of characters in multi-byte code pages

Last Updated: March 30, 2020

OpenEdge
Version 12.2
Documentation

The following table describes the default word-break behavior of characters in multi-byte code pages. The table assumes word-break tables are Version 9 Type 3. For more information on word-break tables, see Word-break tables.

Note: The default word-break behavior can be changed only for single-byte characters.

Table 1. Default word-break behavior of characters in multi-byte code pages
If the code page is...	And the characters are...	The characters behave (by default)...
Double byte	Single byte	Depending on whether they are alphabetic or nonalphabetic. This is specified in the code page's character-attribute table. To change the default word-break behavior, supply a word-break table input file.
Double byte	Double byte	As separate words.
UTF-8	Single byte	Depending on whether they are alphabetic or nonalphabetic. This is specified in the code page's character-attribute table. To change the default word-break behavior, supply a word-break table input file.
UTF-8	Two-byte UTF-8	Corresponding to the `USE_IT` word-delimiter attribute.
UTF-8	Three- and four-byte UTF-8	As separate words.
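
The UTF-8 rows of the table can be read as a lookup keyed on the encoded byte length of each character. The following Python sketch is illustrative only and is not OpenEdge code: the function name `default_word_break_utf8` and its return labels are hypothetical, and it assumes the defaults listed in the table with no custom word-break table applied.

```python
def default_word_break_utf8(ch: str) -> str:
    """Map one character to its default behavior in a UTF-8 code page.

    Hypothetical helper that mirrors the UTF-8 rows of Table 1;
    it is not part of any OpenEdge API.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")

    # Number of bytes the character occupies when encoded as UTF-8 (1 to 4).
    width = len(ch.encode("utf-8"))

    if width == 1:
        # Single byte: alphabetic or nonalphabetic, as specified in the
        # code page's character-attribute table (the only overridable case).
        return "per character-attribute table"
    if width == 2:
        # Two-byte UTF-8: corresponds to the USE_IT word-delimiter attribute.
        return "USE_IT"
    # Three- and four-byte UTF-8: each character is a separate word.
    return "separate word"
```

Under these assumptions, `é` (two bytes in UTF-8) maps to `USE_IT`, while `日` (three bytes) is treated as a separate word.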

For more information on character-attribute tables, see Character attribute tables. For more information on modifying word-break tables, see Create and modify word-break tables. For more information on word-delimiter attributes, see Word-delimiter attributes.