Default word-break behavior of characters in multi-byte code pages
- Last Updated: March 30, 2020
- OpenEdge
- Version 12.2
The following table describes the default word-break behavior of characters in multi-byte code pages. The table assumes word-break tables are Version 9 Type 3. For more information on word-break tables, see Word-break tables.
Note: The default word-break behavior can be changed only for single-byte characters.
| If the code page is... | And the characters are... | The characters behave (by default)... |
|---|---|---|
| Double byte | Single byte | According to whether they are alphabetic or nonalphabetic, as specified in the code page's character-attribute table. To change the default word-break behavior, supply a word-break table input file. |
| Double byte | Double byte | As separate words. |
| UTF-8 | Single byte | According to whether they are alphabetic or nonalphabetic, as specified in the code page's character-attribute table. To change the default word-break behavior, supply a word-break table input file. |
| UTF-8 | Two-byte UTF-8 | Corresponding to the USE_IT word-delimiter attribute. |
| UTF-8 | Three- and four-byte UTF-8 | As separate words. |
For more information on character-attribute tables, see Character attribute tables. For more information on modifying word-break tables, see Create and modify word-break tables. For more information on word-delimiter attributes, see Word-delimiter attributes.
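
The rules in the table above can be read as a lookup keyed on the code page type and the encoded byte length of a character. The following Python sketch is illustrative only and is not part of OpenEdge: the function names (`default_word_break`, `utf8_default`) and the returned labels are hypothetical, chosen here to mirror the table rows. It does not consult a real character-attribute table or word-break table.

```python
def default_word_break(code_page: str, char_bytes: int) -> str:
    """Return a label for the default word-break behavior in the table.

    code_page  -- "double-byte" or "utf-8" (hypothetical labels for the
                  first table column)
    char_bytes -- number of bytes the character occupies in that code page
    """
    if code_page not in ("double-byte", "utf-8"):
        raise ValueError(f"unknown code page type: {code_page!r}")

    # Single-byte characters: behavior follows the alphabetic/nonalphabetic
    # flag in the code page's character-attribute table. This is the only
    # case the documentation says can be overridden.
    if char_bytes == 1:
        return "character-attribute-table"

    if code_page == "double-byte":
        if char_bytes == 2:
            return "separate-word"
        raise ValueError("double-byte code pages use 1 or 2 bytes per character")

    # UTF-8 code page from here on.
    if char_bytes == 2:
        return "USE_IT"
    if char_bytes in (3, 4):
        return "separate-word"
    raise ValueError("UTF-8 characters occupy 1 to 4 bytes")


def utf8_default(ch: str) -> str:
    """Classify a single character by its UTF-8 encoded length."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return default_word_break("utf-8", len(ch.encode("utf-8")))


if __name__ == "__main__":
    # One sample per UTF-8 row: 1, 2, 3, and 4 encoded bytes.
    for sample in ("a", "\u00e9", "\u8a9e", "\U0001F600"):
        print(repr(sample), len(sample.encode("utf-8")), utf8_default(sample))
```

Under these assumptions, a Latin letter with a diacritic such as `é` (two bytes in UTF-8) falls in the `USE_IT` row, while a CJK ideograph (three bytes) is treated as a separate word by default.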