Character precedence in Unicode and Java Collator
- Last Updated: October 22, 2025
The Unicode standard assigns a numeric code, commonly written as four
hexadecimal digits, to every character, including many that can't be typed on
standard keyboards. Java (and hence Progress Corticon software) uses a class
named Collator to sort these characters in specific
sequences based on the I18n locale of the user.
While sorting by locale allows for regional variations of language-specific characters like accents, the combination of these two systems can also make determining character precedence very complicated. The Unicode codes and Java Collator sequence for standard keyboard characters in the US-English locale are shown in the table below.
Sequences for other languages and/or locales may differ, and many other Unicode characters are available but are not shown in the table. We recommend http://www.unicode.org/charts for more information on the Unicode system and http://java.sun.com/docs/books/tutorial/i18n/text/locale.html for more information on the Java Collator class.
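The difference between raw Unicode order and locale-aware Collator order can be seen in a minimal Java sketch. The class name `CollatorPrecedenceSketch` and the helper `collate` are hypothetical names chosen for illustration, and the sketch assumes the US-English locale; it is not a description of how Corticon calls the Collator internally.

```java
import java.text.Collator;
import java.util.Locale;

class CollatorPrecedenceSketch {

    // Hypothetical helper: compare two strings with the US-English Collator.
    // Returns a negative, zero, or positive value, like Comparator.compare.
    public static int collate(String left, String right) {
        Collator collator = Collator.getInstance(Locale.US);
        return collator.compare(left, right);
    }

    public static void main(String[] args) {
        // Raw Unicode order: 'B' (0042) comes before 'a' (0061).
        System.out.println("B".compareTo("aardvark") < 0);  // true
        // Collator order: the letter b sorts after the letter a, whatever the case.
        System.out.println(collate("B", "aardvark") > 0);   // true
    }
}
```

Results for punctuation and spaces depend on the collation rules of the Java runtime in use, so they are best checked against the target environment rather than assumed.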
For example, using the precedence values in the table below:
- 'Z'='z' evaluates to false.
- 'C & S' < 'C and S' evaluates to true because character a has a higher precedence than & (26 < 44). These characters are decisive because they are the first different characters encountered as the two strings are compared beginning with characters in position 1.
- 'B' > 'aardvark' evaluates to true because character B has a higher precedence than a (45 > 44).
- 'Marilynn' < 'Marilyn ' evaluates to false because character n has a higher precedence than <space> (57 > 1). The first seven characters of each String are identical, so the final character comparison is decisive.
- NOTE: Encoding special characters in string literals is not documented here, because lists of the characters that must be escaped with a backslash are readily available online.
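The first example, in which Z and z are not equal even though they share a precedence value, is consistent with how a Java Collator treats case at different strengths. The following sketch is an assumption-laden illustration: the source does not state which strength Corticon uses, and `CollatorStrengthSketch` and `compareAt` are hypothetical names.

```java
import java.text.Collator;
import java.util.Locale;

class CollatorStrengthSketch {

    // Hypothetical helper: compare two strings under the US-English Collator
    // at the given strength (for example Collator.PRIMARY or Collator.TERTIARY).
    public static int compareAt(int strength, String left, String right) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        return collator.compare(left, right);
    }

    public static void main(String[] args) {
        // PRIMARY strength looks only at the base letter, so case is ignored.
        System.out.println(compareAt(Collator.PRIMARY, "Z", "z") == 0);   // true
        // TERTIARY strength also takes case into account, so the strings differ.
        System.out.println(compareAt(Collator.TERTIARY, "Z", "z") != 0);  // true
    }
}
```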
| character | name | precedence | Unicode 5.0 code |
|---|---|---|---|
| <space> | typed space | 1 | 0020 |
| - | dash or minus sign | 2 | 002D |
| _ | underline or underscore | 3 | 005F |
| , | comma | 4 | 002C |
| ; | semicolon | 5 | 003B |
| : | colon | 6 | 003A |
| ! | exclamation point | 7 | 0021 |
| ? | question mark | 8 | 003F |
| / | slash | 9 | 002F |
| . | period | 10 | 002E |
| ` | grave accent | 11 | 0060 |
| ^ | circumflex | 12 | 005E |
| ~ | tilde | 13 | 007E |
| ' | apostrophe | 14 | 0027 |
| " | quotation marks | 15 | 0022 |
| ( | left parenthesis | 16 | 0028 |
| ) | right parenthesis | 17 | 0029 |
| [ | left bracket | 18 | 005B |
| ] | right bracket | 19 | 005D |
| { | left brace | 20 | 007B |
| } | right brace | 21 | 007D |
| @ | at symbol | 22 | 0040 |
| $ | dollar sign | 23 | 0024 |
| * | asterisk | 24 | 002A |
| \ | backslash | 25 | 005C |
| & | ampersand | 26 | 0026 |
| # | number sign or hash sign | 27 | 0023 |
| % | percent sign | 28 | 0025 |
| + | plus sign | 29 | 002B |
| < | less than sign | 30 | 003C |
| = | equals sign | 31 | 003D |
| > | greater than sign | 32 | 003E |
| \| | vertical line | 33 | 007C |
| 0..9 | digits 0 through 9 | 34-43 | 0030-0039 |
| a, A | letter a, small and capital | 44 | 0061, 0041 |
| b, B | letter b, small and capital | 45 | 0062, 0042 |
| c, C | letter c, small and capital | 46 | 0063, 0043 |
| d, D | letter d, small and capital | 47 | 0064, 0044 |
| e, E | letter e, small and capital | 48 | 0065, 0045 |
| f, F | letter f, small and capital | 49 | 0066, 0046 |
| g, G | letter g, small and capital | 50 | 0067, 0047 |
| h, H | letter h, small and capital | 51 | 0068, 0048 |
| i, I | letter i, small and capital | 52 | 0069, 0049 |
| j, J | letter j, small and capital | 53 | 006A, 004A |
| k, K | letter k, small and capital | 54 | 006B, 004B |
| l, L | letter l, small and capital | 55 | 006C, 004C |
| m, M | letter m, small and capital | 56 | 006D, 004D |
| n, N | letter n, small and capital | 57 | 006E, 004E |
| o, O | letter o, small and capital | 58 | 006F, 004F |
| p, P | letter p, small and capital | 59 | 0070, 0050 |
| q, Q | letter q, small and capital | 60 | 0071, 0051 |
| r, R | letter r, small and capital | 61 | 0072, 0052 |
| s, S | letter s, small and capital | 62 | 0073, 0053 |
| t, T | letter t, small and capital | 63 | 0074, 0054 |
| u, U | letter u, small and capital | 64 | 0075, 0055 |
| v, V | letter v, small and capital | 65 | 0076, 0056 |
| w, W | letter w, small and capital | 66 | 0077, 0057 |
| x, X | letter x, small and capital | 67 | 0078, 0058 |
| y, Y | letter y, small and capital | 68 | 0079, 0059 |
| z, Z | letter z, small and capital | 69 | 007A, 005A |
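The values in the Unicode code column can be reproduced for any character in the table with a short Java sketch. The class name `UnicodeCodeSketch` and the helper `unicodeCode` are hypothetical names used only for this illustration.

```java
class UnicodeCodeSketch {

    // Hypothetical helper: format a character's Unicode code as four
    // uppercase hexadecimal digits, matching the table's last column.
    public static String unicodeCode(char ch) {
        return String.format("%04X", (int) ch);
    }

    public static void main(String[] args) {
        System.out.println(unicodeCode(' '));  // 0020
        System.out.println(unicodeCode('&'));  // 0026
        System.out.println(unicodeCode('a'));  // 0061
        System.out.println(unicodeCode('A'));  // 0041
    }
}
```

The code only reports where a character sits in the Unicode charts; it says nothing about precedence, which is determined by the Collator for the locale in use.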