Thank you for your generous offer of help! I look forward to taking it up (may take a while as I'm about to move countries and quarantine).
In particular I love that one of the examples in your comment history is in Latin as that language is not currently supported by Postgres FTS. Are Latin and Ancient Greek supported by Manticore? (dare I hope for Anglo Saxon...)
In terms of advanced NLP (stemming, lemmatization, stopwords, wordforms): no. In terms of general tokenization: I've never dealt with Latin or Ancient Greek characters (if those languages have specific characters at all), but even if they aren't supported by default, it's not a problem to add them in the config (https://mnt.cr/charset_table)
There's one big list of mappings for almost every script under the sun, including Greek. (With mappings like 'U+1F08..U+1F0F->U+1F00..U+1F07' turning U+1F08 Ἀ [CAPITAL ALPHA WITH PSILI] into U+1F00 ἀ [SMALL ALPHA WITH PSILI], and the same for seven other accented alphas. I've considered turning them all into unaccented alpha instead, but I don't know enough about Greek orthography to decide that.) https://github.com/Tatoeba/tatoeba2/blob/3170f7326ad2939c691...
It's not beautiful, but it's used in production. People who don't need to support quite as many languages as Tatoeba will probably want a simpler config, but it might still be useful as a reference.
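For anyone who wants to see what a range mapping like 'U+1F08..U+1F0F->U+1F00..U+1F07' amounts to, here is a rough Python sketch. It is not Manticore code and does not use any Manticore API; `parse_range_mapping` and `strip_accents` are hypothetical helper names made up for illustration. The second helper shows one possible way to get the "unaccented alpha" alternative via Unicode NFD decomposition, which is an assumption about how one might do it, not something the Tatoeba config does.

```python
import unicodedata


def parse_range_mapping(rule):
    """Turn 'U+XXXX..U+YYYY->U+AAAA..U+BBBB' into a {source: target} code point dict.

    Hypothetical helper: it mimics the shape of a charset_table range rule,
    but it is not how Manticore itself parses its config.
    """
    src, dst = rule.split("->")
    src_lo, src_hi = (int(cp[2:], 16) for cp in src.split(".."))
    dst_lo, dst_hi = (int(cp[2:], 16) for cp in dst.split(".."))
    if src_hi - src_lo != dst_hi - dst_lo:
        raise ValueError("source and target ranges differ in length")
    return {src_lo + i: dst_lo + i for i in range(src_hi - src_lo + 1)}


def strip_accents(text):
    """Drop combining marks after NFD decomposition (assumed alternative folding)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The rule quoted in the comment: capital alphas with psili etc. -> small ones.
table = parse_range_mapping("U+1F08..U+1F0F->U+1F00..U+1F07")

folded = "\u1f08".translate(table)  # capital alpha with psili -> small alpha with psili
bare = strip_accents(folded)        # small alpha with psili -> plain small alpha

print(f"U+{ord(folded):04X}", f"U+{ord(bare):04X}")
```

The first step keeps the breathing mark and only folds case, which is what the quoted mapping does. The second step is the more aggressive option, where all eight variants collapse to plain alpha; whether that is desirable depends on the Greek orthography question raised above.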