We inherited this issue from PostgreSQL.
PostgreSQL uses glibc to sort strings. In glibc 2.28 the collation
rules changed substantially (in general, glibc gives no guarantee that
sort order stays the same across updates).
A change in collation breaks indexes, because they were built with the
old ordering. Similarly, a cluster whose nodes use different collations
behaves unpredictably.
A record of what changed in glibc, and when, can be found
at https://github.com/ardentperf/glibc-unicode-sorting
There is also a dedicated PostgreSQL wiki page: https://wiki.postgresql.org/wiki/Locale_data_changes
And a YouTube video: https://www.youtube.com/watch?v=0E6O-V8Jato
In short, the issue can be seen through the use of bash:
( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.UTF-8 sort
gives different results on Ubuntu 18.04 and 22.04.
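To check a given host, the same two strings can be sorted under the
byte-order C locale and under en_US.UTF-8 side by side. This is only a
minimal sketch: the variable names are illustrative, and it assumes GNU
sort and that the en_US.UTF-8 locale has been generated on the machine.
The C result is the same everywhere; the en_US.UTF-8 result depends on
the installed glibc version.

```shell
# Byte-order collation: independent of the glibc locale data.
c_order=$(printf '%s\n' "1-1" "11" | LC_ALL=C sort)

# Locale-aware collation: the order depends on the glibc version.
# LC_ALL is cleared so that it cannot override LC_COLLATE.
loc_order=$(printf '%s\n' "1-1" "11" | LC_ALL= LC_COLLATE=en_US.UTF-8 sort)

printf 'C locale:\n%s\n' "$c_order"
printf 'en_US.UTF-8:\n%s\n' "$loc_order"
```

If the two outputs differ between hosts of the same cluster, indexes on
text columns built on one host cannot be trusted on the other.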
The only way to solve the problem is to keep the symbol order unchanged.
We freeze the symbol order and use the frozen copy instead of glibc.
The solution is available at https://github.com/postgredients/mdb-locales.
In this PR I have added a PostgreSQL patch that replaces all glibc
locale-related calls with calls to an external library. It is enabled
with the new configure parameter --with-mdblocales, which is off by
default.
Using the custom locales at run time requires the libmdblocales1
package and the mdb-locales package, which contains the symbol table.
Building requires the libmdblocales-dev package, which provides the
headers.