Related: Meilisearch v1.0.0 release two days ago: https://news.ycombinator.com/i...

arein3 · on Feb 10, 2023

Regarding performance, I hope it's not the same as Grafana's Loki.

Grafana Loki advertises lower resource requirements, but it's just a disk storage system. Any query will read everything from disk.

Elasticsearch has big RAM requirements if you create a lot of indexes, of course. You can't have anything quicker than indexes, and you can't have lower resource requirements without having fewer indexes.

drowsspa · on Feb 10, 2023

What do you mean by "any query will read everything from disk"? Is that when you do text search, or even when you look up by labels Prometheus-style?

arein3 · on Feb 10, 2023

Tags in Loki are things like host, application, and environment. When searching by those tags and a time interval, it will read everything from disk. So any query that filters by e.g. SessionId, or by a keyword from the log like Exception, will read all the logs from disk. This can take ages if you have a lot of logs and a big time frame. Compare that with Elasticsearch, which can index anything, like SessionId or the log message, and return the result in an instant, without even reading the disk.

drowsspa · on Feb 11, 2023

> When searching by those tags and a time interval, it will read everything from disk

That's what I'm asking, actually. Isn't Loki's proposition that it only indexes the tags and time interval? Do you mean that even filtering by that there's still a lot of data to go through?

Because it seems like you're saying it always fetches everything from disk.

arein3 · on Feb 11, 2023

> Isn't Loki's proposition that it only indexes the tags and time interval? Do you mean that even filtering by that there's still a lot of data to go through?

Yes

> Because it seems like you're saying it always fetches everything from disk.

If you specify a tag, like environment, it will not read the disk for data from other environments. But the tags like environment/host/timeframe are not enough if you want to query for something like error/exception/sessionid, and you might have to wait minutes/hours for a query which covers a lot of data.
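The trade-off described above can be sketched as a toy model. This is only an illustration under stated assumptions, not how Loki or Elasticsearch are actually implemented: the class names, the `LOGS` sample data, and the `lines_scanned` counter are all hypothetical. One store indexes only the labels and scans every line in the matching chunks; the other also keeps an inverted index over the tokens of each line.

```python
from collections import defaultdict

# Hypothetical sample data: (labels, log line) pairs.
LOGS = [
    ({"environment": "prod", "host": "a"}, "request ok SessionId=111"),
    ({"environment": "prod", "host": "b"}, "Exception in handler SessionId=222"),
    ({"environment": "dev", "host": "a"}, "Exception in test SessionId=333"),
    ({"environment": "prod", "host": "a"}, "request ok SessionId=222"),
]


class LabelIndexedStore:
    """Toy model: only labels are indexed; line contents are found by scanning."""

    def __init__(self):
        self.chunks = defaultdict(list)  # frozenset of label pairs -> lines
        self.lines_scanned = 0           # how many lines queries had to read

    def add(self, labels, line):
        self.chunks[frozenset(labels.items())].append(line)

    def query(self, labels, needle):
        wanted = set(labels.items())
        hits = []
        for key, lines in self.chunks.items():
            if not wanted <= key:
                continue  # chunks with other labels are skipped without reading
            for line in lines:
                self.lines_scanned += 1  # every line in a matching chunk is read
                if needle in line:
                    hits.append(line)
        return hits


class InvertedIndexStore:
    """Toy model: every token of every line is indexed, at the cost of memory."""

    def __init__(self):
        self.lines = []
        self.postings = defaultdict(set)  # index key -> set of line ids

    def add(self, labels, line):
        line_id = len(self.lines)
        self.lines.append(line)
        for token in line.split():
            self.postings[("token", token)].add(line_id)
        for name, value in labels.items():
            self.postings[("label", name, value)].add(line_id)

    def query(self, labels, token):
        # Intersect posting sets; no line is read until the result is built.
        ids = set(self.postings.get(("token", token), set()))
        for name, value in labels.items():
            ids &= self.postings.get(("label", name, value), set())
        return [self.lines[i] for i in sorted(ids)]


label_store = LabelIndexedStore()
inverted_store = InvertedIndexStore()
for labels, line in LOGS:
    label_store.add(labels, line)
    inverted_store.add(labels, line)

label_hits = label_store.query({"environment": "prod"}, "SessionId=222")
inverted_hits = inverted_store.query({"environment": "prod"}, "SessionId=222")
```

Both stores return the same two lines here, but the label-only store has to read every line from the prod chunks to find them, while the inverted index pays for its speed by keeping a posting set for every distinct token.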

mardix · on Feb 10, 2023

Loving it. I'm interested in milli-py.

A cool feature would be automatic backup to S3, or loading from S3.

canadiantim · on Feb 10, 2023

That looks awesome, kudos! I've been looking for a way to do local-first high-quality FTS.

m3affan · on Feb 10, 2023

I wonder how long-lasting the support will be for such libraries.

ollybee · on Feb 10, 2023

Give Xapian a go also.

remram · on Feb 10, 2023

Xapian is a library but is licensed under GPL, so you can't build on it without making your whole app GPL.

You can get around that by having the search happen in a separate process or something, maybe. But this is a huge issue for something that one might want to embed.

kat_rebelo · on Feb 10, 2023

that is a slight misunderstanding of how open source licensing works.

the GPL bleed only happens if you distribute your application, meaning to sell or give away binary packages for customers to install. if your product is a hosted api that you do not distribute, you do not invoke that clause.

also, a lot of open source projects handle this by licensing the core engine under a copyleft license (GPL, AGPL), while the language connectors and bindings are licensed under the less restrictive Apache license. unless you are offering a SaaS version of the product itself, it is more likely you are actually interacting with the connectors anyway. MongoDB is a classic example of this model.

remram · on Feb 11, 2023

Yes, if you don't distribute it, the license doesn't matter. That is more a flaw than the intention, but that is correct.

The point is that I can use SQLite, Tantivy, RocksDB, ... in my app, no problem. I can make it open core, I can make it AGPL, BSD, MIT, no problem. Because those things are meant to be embedded. But I almost definitely can't use Xapian.

Let's be honest, if I want a search solution for use in my SaaS, I will grab Elasticsearch or an equivalent, I have no need for a library. It seems to me that the only use case where Xapian could really shine is crippled by their license. That is a shame.

AlexAltea · on Feb 10, 2023

It can be a problem if you intended to make an embeddable search engine within applications meant to be executed by your end-users (as is the case with milli-py above).

antman · on Feb 10, 2023

github 404 fyi

AlexAltea · on Feb 10, 2023

Fixed, apologies! I had finished the PoC last night but the repo was still marked as private.