Hacker News | baur's comments

CrateDB might be a good fit for full-text and vector search (it's a SQL database, but it has dedicated clauses for FT and VS).

Curious how you use PG for key/value and queues: do you use regular tables or some specific extensions?

I can imagine KV being a table with a primary key on “key”, and the queue being a table with a generated timestamp column, indexed by that column, with peek/add utilising the index.
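To make that guess concrete, here is a rough sketch of both tables. It uses Python's built-in sqlite3 as a stand-in for Postgres so it runs without a server; the table names, column names, and helper functions are all my own assumptions, not anything from the parent's setup.

```python
import sqlite3

# In-memory SQLite as a stand-in for Postgres (assumption: the same
# table shapes would carry over to PG with minor type changes).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kv (
    key   TEXT PRIMARY KEY,          -- primary key on "key"
    value TEXT NOT NULL
);
CREATE TABLE queue (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    -- generated timestamp; id breaks ties within the same millisecond
    enqueued_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%d %H:%M:%f', 'now')),
    payload     TEXT NOT NULL
);
CREATE INDEX queue_enqueued_at ON queue (enqueued_at, id);
""")


def kv_put(key, value):
    # Upsert: insert, or overwrite the value if the key already exists.
    with conn:
        conn.execute(
            "INSERT INTO kv (key, value) VALUES (?, ?) "
            "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
            (key, value),
        )


def kv_get(key):
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


def queue_add(payload):
    with conn:
        conn.execute("INSERT INTO queue (payload) VALUES (?)", (payload,))


def queue_peek():
    # Oldest item first; the ORDER BY matches the index columns.
    row = conn.execute(
        "SELECT payload FROM queue ORDER BY enqueued_at, id LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def queue_pop():
    # Read the oldest row, then delete it, inside one transaction.
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM queue ORDER BY enqueued_at, id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
        return row[1]
```

With several concurrent consumers on real Postgres, the pop would need some form of row locking; this sketch ignores that entirely.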


Congrats on the launch!

Just wondering, do you have any plans to support CrateDB?

It supports SQL and understands the PG protocol, so perhaps supporting Postgres already gets it close.


A trie can also be used as a hash set/map: roughly, hash(obj).toString is the "word" for the trie, and the leaf stores the object.

This is the https://en.wikipedia.org/wiki/Hash_array_mapped_trie, which is used in Scala's immutable HashMap: https://dotty.epfl.ch/api/scala/collection/immutable/HashMap...
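A toy version of the idea, as a sketch only: the hash is cut into 5-bit chunks, each chunk picks a child at one trie level, and the leaf holds a small bucket of (key, value) pairs. This is deliberately simplified and is not how Scala implements it: a real HAMT compresses each node with a bitmap and is persistent, while this one is mutable and uses plain dicts. The class name and constants are made up for illustration.

```python
BITS = 5                  # hash bits consumed per trie level
MASK = (1 << BITS) - 1
DEPTH = 4                 # assumption: only the low 20 hash bits form the "word"
LEAF = "leaf"             # leaf marker; cannot clash with the integer chunks


class HashTrieMap:
    def __init__(self):
        self.root = {}

    def _path(self, key):
        # The "word" for the trie: successive 5-bit chunks of the hash.
        h = hash(key) & 0xFFFFFFFF
        return [(h >> (BITS * level)) & MASK for level in range(DEPTH)]

    def put(self, key, value):
        node = self.root
        for chunk in self._path(key):
            node = node.setdefault(chunk, {})
        bucket = node.setdefault(LEAF, [])
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))        # new key, or a hash-prefix collision

    def get(self, key, default=None):
        node = self.root
        for chunk in self._path(key):
            node = node.get(chunk)
            if node is None:
                return default
        for existing, value in node.get(LEAF, []):
            if existing == key:
                return value
        return default
```

Keys whose low 20 hash bits match end up in the same leaf bucket and are told apart by equality, the same way a regular hash table handles collisions.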


Nice implementation!

There is an option to get all suffixes without traversing the subtree, but it comes with extra O(N) memory, where N is the combined length of all stored words. Depending on the case, that might be acceptable, since the memory for storing the words themselves is O(N) anyway. https://stackoverflow.com/a/29966616/2104560 (update 1 and update 3)


Thanks! I just realized it doesn't work with strings containing underscores, but simply using __ instead of _ (so double underscores) as the key to end-of-word markings fixes that.

And thanks for the link, that is an interesting optimization!

EDIT: one fun non-practical application (histogramming the letters in a word is simpler and faster) is an anagram finder using prime numbers:

https://observablehq.com/@jobleonard/finding-anagrams-using-...
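For anyone who doesn't want to click through, the trick can be sketched roughly like this: give each letter its own prime and multiply them, so two words are anagrams exactly when their products match (unique factorization). The a=2, b=3, ... mapping and the function names are my own assumptions and may differ from the notebook.

```python
from collections import defaultdict

# First 26 primes, one per lowercase letter (assumed mapping: a=2, b=3, ...).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]


def signature(word):
    # Multiplication is commutative, so letter order does not matter.
    product = 1
    for ch in word.lower():
        if "a" <= ch <= "z":
            product *= PRIMES[ord(ch) - ord("a")]
    return product


def group_anagrams(words):
    groups = defaultdict(list)
    for word in words:
        groups[signature(word)].append(word)
    # Keep only signatures shared by at least two words.
    return [group for group in groups.values() if len(group) > 1]
```

For example, `group_anagrams(["listen", "silent", "enlist", "google"])` puts the first three words in one group and drops "google".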


That's a nice idea. I guess it's better to stay within a primitive type's range (to avoid big-integer arithmetic), so we can "compress" up to 15 items into a single prime_product while keeping it below 2^64. For English words that should be just fine; I don't expect many words with 16 or more letters.

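UPD: Sorry, "up to 15" was the wrong phrasing. I once checked how the "prime factorial" (the product of the first n primes) fits into a primitive, and the product of the first 15 primes fits into a long. So it's possible to handle even more symbols when letters repeat, e.g. something like "a" repeated 63 times, because that is just 2^63, which still fits in an unsigned 64-bit integer.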
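A quick way to double-check the bound, as a sketch with helper names of my own: multiply the first n primes and see how far you get before passing the 64-bit limits. Python integers never overflow, so the limits are compared explicitly.

```python
def first_primes(n):
    # Trial division against the primes found so far; fine for tiny n.
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def primorial(n):
    # Product of the first n primes (the "prime factorial").
    product = 1
    for p in first_primes(n):
        product *= p
    return product


SIGNED_64_MAX = 2**63 - 1
UNSIGNED_64_MAX = 2**64 - 1


def max_distinct_primes(limit):
    # Largest n such that the product of the first n primes is <= limit.
    n = 0
    while primorial(n + 1) <= limit:
        n += 1
    return n
```

Both `max_distinct_primes(SIGNED_64_MAX)` and `max_distinct_primes(UNSIGNED_64_MAX)` give 15, since the 16th prime (53) pushes the product past 2^64. Repeated small primes are much cheaper: with "a" mapped to 2, n copies of "a" contribute 2^n, which stays within an unsigned 64-bit integer for n up to 63.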

