Unicode overview
- Last Updated: March 30, 2020
- OpenEdge
- Version 12.2
Unicode is an evolving standard that defines a single code page covering most symbols from most of the languages of the world: letters, ideograms, syllabics (such as the Japanese Kana symbols), punctuation, diacritics, mathematical symbols, technical symbols, and so on. It assigns each symbol a numeric value, which was originally a number between zero and 65,535, the range of an unsigned 16-bit integer.
Unicode's original limit of 65,536 symbols proved too small, and
the limit was extended to 1,114,112 numeric values (U+0000 through
U+10FFFF). Several ways of encoding each symbol were defined,
including UTF-8, UTF-16, and UTF-32, and the encodings were designed
so that you can convert from one to another any number of times
without losing any information. For more information on the
algorithms for converting between encodings, see the Unicode Web
site, http://www.unicode.org.
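The lossless conversion between encodings described above can be illustrated with a short sketch. Python is used here only for illustration; it is not OpenEdge ABL. The sample string, the helper name round_trip, and the order of encodings are assumptions chosen for the example. The sample includes one symbol whose numeric value lies above the original 16-bit limit.

```python
# Illustrative sample (an assumption for this sketch): an ASCII letter,
# an accented letter, a Japanese Kana symbol, and U+1D11E, whose value
# is above the original 16-bit limit of 65,535.
sample = "A\u00e9\u30ab\U0001D11E"


def round_trip(text, encodings):
    """Encode and decode text through each named encoding in turn."""
    current = text
    for name in encodings:
        encoded = current.encode(name)   # numeric values -> bytes
        current = encoded.decode(name)   # bytes -> numeric values
    return current


# Convert through several Unicode encodings, one after another.
chain = ["utf-8", "utf-16", "utf-32", "utf-8"]
result = round_trip(sample, chain)

# The numeric value assigned to each symbol is the same in every encoding.
code_points = [ord(ch) for ch in sample]

# Only the number of bytes used to store the symbols differs.
sizes = {name: len(sample.encode(name))
         for name in ("utf-8", "utf-16-le", "utf-32-le")}

print(result == sample)
print([hex(cp) for cp in code_points])
print(sizes)
```

The string survives the whole chain unchanged because each encoding is only a different byte representation of the same numeric values.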
OpenEdge supports Unicode's UTF-8 encoding. In addition, all varieties
of UTF-16 and UTF-32 are supported for input and output and for LONGCHARs
and CLOBs.
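As a rough illustration of what "varieties" of UTF-16 means, the sketch below encodes one symbol in big-endian and little-endian byte order, and in a form that starts with a byte order mark (BOM). It uses Python's standard codecs, not OpenEdge ABL, and says nothing about how OpenEdge itself is configured; the sample symbol is an arbitrary choice.

```python
import codecs

# Arbitrary sample symbol for this sketch: KATAKANA LETTER KA (U+30AB).
text = "\u30ab"

# The same 16-bit value stored in the two possible byte orders.
big_endian = text.encode("utf-16-be")      # bytes 30 AB
little_endian = text.encode("utf-16-le")   # bytes AB 30

# Python's generic "utf-16" codec prepends a BOM and then uses the
# byte order of the machine it runs on.
with_bom = text.encode("utf-16")
has_bom = with_bom[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)

# Every variety decodes back to the same symbol.
decoded = {
    "utf-16-be": big_endian.decode("utf-16-be"),
    "utf-16-le": little_endian.decode("utf-16-le"),
    "utf-16": with_bom.decode("utf-16"),
}

print(big_endian.hex(), little_endian.hex(), has_bom)
```

UTF-32 has the same kinds of varieties, with four bytes per symbol instead of two.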