Neat! I hope this goes far; it'd be great to have a faster, lighter-weight Elasticsearch.
Something similar I'm really hoping to see is Tantivy in a Postgres extension, so I can stop playing the game of trying to keep my search engine and database in sync. Seeing pg-extend-rs (https://github.com/bluejekyll/pg-extend-rs) on HN the other week got me thinking about it again. Does anyone know whether this is feasible or if anyone is working on something in this vein?
Out of curiosity: have you looked at using PostgreSQL's full-text search functionality to implement your search engine (e.g. [1])? If so, what do you get from the Postgres + Elasticsearch combination that made you choose it over Postgres full-text search alone?
A major problem with Postgres full-text search that those articles don't delve into much is that unless your documents are in one of the "chosen languages", you are more likely to find support for your language in a search engine (like Elasticsearch) than in PostgreSQL.
You can convert existing dictionaries to a format Postgres understands, but this is an annoying pain point if you happen to be an open source project like a CMS or a communication platform.
I don't get the hype about Elasticsearch at all. Elasticsearch is more suited to searching logs. It doesn't have powerful sort functions, doesn't allow you to use multiple sort parameters, and so on.
Apache Solr is more suited to search. It has lots of document filters and query filters, the index itself is highly configurable, and the ability to sort on multiple parameters is great. LTR (learning to rank) is also too good to miss out on.
A lot of open source projects would benefit if they had dedicated sales teams contacting firms day in, day out to talk about their features.
People who run IT departments in the enterprise world, or even at small firms lacking the resources to keep up, just pick tools and make software decisions based on who reaches out to them.
Invest in a sales team and you penetrate markets that don't spend time monitoring developments in the tech world, which is really the majority of all orgs. Elastic has done that quite well and is reaping the rewards.
Nonsense, Elasticsearch does all these things (and extremely well too). These are core Lucene features. I'm not aware of much that Solr does that Elasticsearch does not. And yes, I've used both.
Solr was there earlier. In many ways, Elasticsearch was a response to stuff Solr did not do, or did not do well. Like clustering for example. Solr has of course added this since then.
Can you explain what you mean by "multiple sort parameters"? It looks to me like you can do that in ES [1]. There is also a well-maintained LTR plugin for ES. Honestly, Solr and ES are more similar than different. There are a few things Solr has that ES doesn't, and the reverse is true too.
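For anyone following along, here is a rough sketch of what sorting on more than one key looks like in an ES search request body, built as a plain Python dict. The `sort` array with per-field `order` is the standard request shape; the field names `title` and `published_at` are made up for illustration, and the helper `build_search_body` is hypothetical rather than part of any client library.

```python
import json


def build_search_body(query_text, sort_keys):
    """Build an Elasticsearch-style search body sorted on several keys.

    sort_keys is a list of (field, order) pairs, applied in the order given.
    The "title" field is a hypothetical example field.
    """
    return {
        "query": {"match": {"title": query_text}},
        "sort": [{field: {"order": order}} for field, order in sort_keys],
    }


# Sort by a (hypothetical) date field first, then break ties by relevance.
body = build_search_body(
    "search engine",
    [("published_at", "desc"), ("_score", "desc")],
)
print(json.dumps(body, indent=2))
```

The resulting dict is what you would send as the JSON body of a `_search` request; ties on the first key fall through to the next one.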
Last time I checked, PostgreSQL doesn't support any kind of TF-IDF ranking scheme because it doesn't track corpus-wide term frequencies in the index. This hurts relevance ranking for some workloads.
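To make the corpus-frequency point concrete, here is a minimal, textbook-style TF-IDF sketch in plain Python. It is not how Postgres or Lucene actually score documents, and the function name `tf_idf_scores` and the toy documents are invented for illustration. The thing to notice is the `df` counter: it has to be computed across the whole corpus, which is exactly the statistic a ranker working on one document at a time doesn't have.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, docs):
    """Score each document against the query with a plain TF-IDF sum."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]

    # Document frequency: how many documents contain each term.
    # This is the corpus-level statistic the ranking depends on.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if df[term] == 0:
                continue  # term appears nowhere in the corpus
            idf = math.log(n_docs / df[term])
            score += tf[term] * idf
        scores.append(score)
    return scores


docs = [
    "postgres search",
    "postgres database",
    "postgres search search engine",
]
scores = tf_idf_scores(["postgres", "search"], docs)
print(scores)
```

In this toy corpus "postgres" appears in every document, so its IDF is zero and it contributes nothing; only "search" separates the documents. Without the corpus-wide counts, a common term and a rare term would weigh the same.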
Just released a system that does the same - Postgres FTS can get you a long way. I've used Elasticsearch before, and it's a lot of complexity to add, and isn't always necessary.
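For reference, the kind of query this usually boils down to looks roughly like the sketch below, kept as a Python string constant. `to_tsvector`, `plainto_tsquery`, `ts_rank` and the `@@` match operator are standard Postgres full-text search pieces; the `documents` table, its `id`, `title` and `body` columns, and the `%(q)s` named-placeholder style (as used by drivers such as psycopg) are assumptions for illustration.

```python
# Hypothetical schema: documents(id, title, body).
# 'english' is the text search configuration; other languages need their own.
FTS_QUERY = """
SELECT id,
       title,
       ts_rank(to_tsvector('english', body),
               plainto_tsquery('english', %(q)s)) AS rank
FROM documents
WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %(q)s)
ORDER BY rank DESC
LIMIT 10;
"""


def fts_params(user_input):
    """Return the parameter dict to pass alongside FTS_QUERY."""
    return {"q": user_input.strip()}


params = fts_params("  lightweight search engine ")
print(FTS_QUERY)
print(params)
```

In a real setup you would typically store the `tsvector` in its own column (or an expression index) with a GIN index on it rather than recomputing `to_tsvector` per row, but the query shape stays the same.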
There's a super neat plugin for Postgres called "ZomboDB" that hooks up Elasticsearch to Postgres, so when you make full-text queries, they get performed on ES.
https://www.zombodb.com/
I had a brief play with it for a project at work and it was super straightforward to get running.