The Unicode standard assigns a hexadecimal code (four digits for the characters listed here) to every character, including many that cannot be typed on standard keyboards. Java (and hence Progress Corticon software) uses a class named Collator to sort these characters in specific sequences based on the I18n locale of the user.

While sorting by locale allows for regional variations of language-specific characters such as accents, the combination of these two systems can also make character precedence difficult to determine. The Unicode codes and Java Collator sequence for the characters on a standard keyboard in the US-English locale are shown in the table below.

Sequences for other languages and/or locales may differ, and many other Unicode characters exist that are not shown in the table. We recommend http://www.unicode.org/charts for more information on the Unicode system and http://java.sun.com/docs/books/tutorial/i18n/text/locale.html for more information on the Java Collator class.
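As an illustration of locale-based sorting, the following minimal sketch sorts a list of strings with java.text.Collator. The class name LocaleSortSketch and the helper sortForLocale are hypothetical names used only for this example, and the sketch assumes a plain Java program outside of Corticon.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Corticon.
final class LocaleSortSketch {

    // Returns a new list sorted by the collation rules of the given locale.
    static List<String> sortForLocale(List<String> values, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(collator); // Collator is a Comparator, so it can drive the sort
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Zebra", "apple", "Banana");
        // Letters sort alphabetically regardless of case in the US-English locale.
        System.out.println(sortForLocale(names, Locale.US));
    }
}
```

A plain String sort compares raw Unicode codes, so it would place "Zebra" before "apple"; the Collator places "apple" first because letter precedence, not the code value, decides the order.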

  • 'Z'='z' evaluates to false.
  • 'C & S' < 'C and S' evaluates to true because character & has a lower precedence than a (26 < 44). These characters are decisive because they are the first differing characters encountered as the two strings are compared, beginning with the characters in position 1.
  • 'B' > 'aardvark' evaluates to true because character B has a higher precedence than a (45 > 44).
  • 'Marilynn' < 'Marilyn' evaluates to false because character n has a higher precedence than <space> (57 > 1). The first seven characters of each String are identical, so the final character comparison is decisive.
  • NOTE: Encoding special characters in string literals is not documented here, because lists of the characters that must be escaped with a backslash are easy to find online.
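The comparisons in the list above can be checked against java.text.Collator directly. This is a sketch only: it assumes that the comparison operators behave like Collator.getInstance(Locale.US).compare at Java's default strength, which the text implies but does not state outright. The class PrecedenceChecks and its helper methods are hypothetical names.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Corticon.
final class PrecedenceChecks {

    // Assumption: US-English collation at the default strength.
    private static Collator usCollator() {
        return Collator.getInstance(Locale.US);
    }

    static boolean equalTo(String a, String b) {
        return usCollator().compare(a, b) == 0;
    }

    static boolean lessThan(String a, String b) {
        return usCollator().compare(a, b) < 0;
    }

    static boolean greaterThan(String a, String b) {
        return usCollator().compare(a, b) > 0;
    }

    public static void main(String[] args) {
        System.out.println(equalTo("Z", "z"));              // the list above reports false
        System.out.println(lessThan("C & S", "C and S"));   // the list above reports true
        System.out.println(greaterThan("B", "aardvark"));   // the list above reports true
        System.out.println(lessThan("Marilynn", "Marilyn")); // the list above reports false
    }
}
```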
character name                          precedence  Unicode 5.0 code
<space>   typed space                   1           0020
-         dash or minus sign            2           002D
_         underline or underscore       3           005F
,         comma                         4           002C
;         semicolon                     5           003B
:         colon                         6           003A
!         exclamation point             7           0021
?         question mark                 8           003F
/         slash                         9           002F
.         period                        10          002E
`         grave accent                  11          0060
^         circumflex                    12          005E
~         tilde                         13          007E
'         apostrophe                    14          0027
"         quotation marks               15          0022
(         left parenthesis              16          0028
)         right parenthesis             17          0029
[         left bracket                  18          005B
]         right bracket                 19          005D
{         left brace                    20          007B
}         right brace                   21          007D
@         at symbol                     22          0040
$         dollar sign                   23          0024
*         asterisk                      24          002A
\         backslash                     25          005C
&         ampersand                     26          0026
#         number sign or hash sign      27          0023
%         percent sign                  28          0025
+         plus sign                     29          002B
<         less than sign                30          003C
=         equals sign                   31          003D
>         greater than sign             32          003E
|         vertical line                 33          007C
0..9      numbers 0 through 9           34-43       0030-0039
a, A      letter a, small and capital   44          0061, 0041
b, B      letter b, small and capital   45          0062, 0042
c, C      letter c, small and capital   46          0063, 0043
d, D      letter d, small and capital   47          0064, 0044
e, E      letter e, small and capital   48          0065, 0045
f, F      letter f, small and capital   49          0066, 0046
g, G      letter g, small and capital   50          0067, 0047
h, H      letter h, small and capital   51          0068, 0048
i, I      letter i, small and capital   52          0069, 0049
j, J      letter j, small and capital   53          006A, 004A
k, K      letter k, small and capital   54          006B, 004B
l, L      letter l, small and capital   55          006C, 004C
m, M      letter m, small and capital   56          006D, 004D
n, N      letter n, small and capital   57          006E, 004E
o, O      letter o, small and capital   58          006F, 004F
p, P      letter p, small and capital   59          0070, 0050
q, Q      letter q, small and capital   60          0071, 0051
r, R      letter r, small and capital   61          0072, 0052
s, S      letter s, small and capital   62          0073, 0053
t, T      letter t, small and capital   63          0074, 0054
u, U      letter u, small and capital   64          0075, 0055
v, V      letter v, small and capital   65          0076, 0056
w, W      letter w, small and capital   66          0077, 0057
x, X      letter x, small and capital   67          0078, 0058
y, Y      letter y, small and capital   68          0079, 0059
z, Z      letter z, small and capital   69          007A, 005A
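
The Unicode codes in the table can be reproduced in Java by formatting a character's numeric value as four hexadecimal digits. The sketch below is illustrative only; the class UnicodeCodeSketch and the method unicodeCode are hypothetical names, and the four-digit format assumes characters such as those in the table.

```java
// Hypothetical helper class for illustration; not part of Corticon.
final class UnicodeCodeSketch {

    // Formats the character's Unicode value as four uppercase hex digits.
    static String unicodeCode(char c) {
        return String.format("%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(unicodeCode('&')); // 0026
        System.out.println(unicodeCode('a')); // 0061
        System.out.println(unicodeCode('A')); // 0041
    }
}
```

Note that the code value and the precedence are independent: & (0026) has a lower code than A (0041) and a (0061), but its precedence comes from the Collator sequence, not from the code.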