You mentioned above that BigQuery reduces cost. I am surprised by that assertion, tbh. Can you point out ways in which Logflare uses it that make it so (for ex, is it tiered storage with a BQ front-end)?
How does Logflare's approach contrast with other entrants like axiom.co/99 who are leveraging blob stores (Cloudflare R2) for storage and serverless for querying for lower costs?
Multiple pluggable storage/query backends (like Clickhouse) are all well and good, but is there a default that Logflare is going to recommend / settle on?
Are there plans to go beyond just APM with Logflare (like metrics and traces, for instance)?
I guess, at some level, this product signals a move away from the Postgres-for-everything stance?
> Can you point out ways in which Logflare uses it that make it so (for ex, is it tiered storage with a BQ front-end)?
After 3 months BigQuery storage ends up being about half the cost of object storage if you use partitioned tables and don't edit the data.
> How does Logflare's approach contrast with other entrants like axiom.co/99 who are leveraging blob stores (Cloudflare R2) for storage and serverless for querying for lower costs?
Haven't really looked at their arch but BigQuery kind of does that for us.
> Multiple pluggable storage/query backends (like Clickhouse) are all well and good, but is there a default that Logflare is going to recommend / settle on?
tbd
> Are there plans to go beyond just APM with Logflare (like metrics and traces, for instance)?
Yes. You can send any JSON payload to Logflare and it will simply handle it. Official OpenTelemetry support is coming, but it should just work if your library can send it over as JSON. And you can send it metrics as well.
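For illustration, here's a minimal sketch of shipping an arbitrary JSON payload to a Logflare source over HTTP (TypeScript, using fetch). The endpoint path, query parameter, header name, and field names below are assumptions based on the hosted API, not guaranteed specifics — check the Logflare docs for your setup:

    // Minimal sketch: ship an arbitrary JSON payload (a log line, a metric,
    // trace-like fields) to a Logflare source over HTTP.
    // The endpoint, query param, and header names below are assumptions --
    // verify them against the Logflare docs for your deployment.
    const LOGFLARE_URL = "https://api.logflare.app/logs"; // assumed ingest endpoint
    const SOURCE_ID = "your-source-uuid";                 // hypothetical source id
    const API_KEY = "your-ingest-api-key";                // hypothetical key

    async function shipEvent(payload: Record<string, unknown>): Promise<void> {
      const res = await fetch(`${LOGFLARE_URL}?source=${SOURCE_ID}`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "X-API-KEY": API_KEY, // assumed header name
        },
        // The whole JSON body is stored, so metrics or trace-ish fields
        // can simply ride along in the payload.
        body: JSON.stringify(payload),
      });
      if (!res.ok) throw new Error(`Logflare ingest failed: ${res.status}`);
    }

    // Example: a log event carrying a metric and a trace id alongside the message.
    await shipEvent({
      event_message: "checkout completed",
      metadata: { duration_ms: 182, region: "us-east-1", trace_id: "abc123" },
    });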
> I guess, at some level, this product signals a move away from the Postgres-for-everything stance?
Postgres will usually last you a very long time, but at some point, with lots of this kind of data, you'll really want to use an OLAP store.
With Supabase Wrappers you'll be able to easily access your analytics store from Postgres.
> After 3 months BigQuery storage ends up being about half the cost of object storage if you use partitioned tables and don't edit the data.
It sounds like it's just standard GCS Nearline pricing plus a BigQuery margin. Raw Nearline is cheaper to store, but it has a per-GB access fee that'll hurt if you re-process old logs.
Interestingly, BigQuery's streaming-read free tier of 300 TB/mo makes BigQuery a fun hack for storing data that's old but still gets read, even if it's e.g. backup blobs.
BigQuery reduces storage costs: even before their recent pricing change, the cost per GB [0] was already on par with, and slightly lower than, S3 storage costs [0], which we can use as an estimated market price for data storage. BigQuery makes money off querying, so we take steps to optimize and minimize query costs with table partitioning, caching, and UI design, both client side and server side. Table partitioning also helps cut per-GB storage costs in half by switching partitions to long-term logical storage after 90 days.
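To make that arithmetic concrete, here's a rough back-of-the-envelope sketch. The per-GB prices are approximate assumptions (list prices change and vary by region), not quotes:

    // Back-of-the-envelope storage cost comparison (USD per GB per month).
    // These prices are approximate assumptions -- check current GCP/AWS pricing.
    const BQ_ACTIVE = 0.02;    // BigQuery active logical storage (assumed)
    const BQ_LONG_TERM = 0.01; // BigQuery long-term storage, partition untouched for 90 days (assumed)
    const S3_STANDARD = 0.023; // S3 Standard, used as a market reference point (assumed)

    const retainedGb = 1_000; // e.g. 1 TB of retained logs

    const monthly = (gb: number, perGb: number) => gb * perGb;

    console.log("BigQuery active:    $" + monthly(retainedGb, BQ_ACTIVE).toFixed(2));    // $20.00
    console.log("BigQuery long-term: $" + monthly(retainedGb, BQ_LONG_TERM).toFixed(2)); // $10.00
    console.log("S3 Standard:        $" + monthly(retainedGb, S3_STANDARD).toFixed(2));  // $23.00

    // With append-only, partitioned log tables, most partitions age into
    // long-term storage, which is where the "about half the cost" figure
    // earlier in the thread comes from.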
Of course, using blob storage might result in a comparable storage cost; however, relying on blob storage would likely increase querying costs (in terms of GET requests) as well as the complexity of querying across multiple stored objects/buckets, as opposed to relying on BQ to handle the querying.
In the long term, we would likely continue using BQ for our platform infrastructure, unless GCP changes their pricing in a way that adversely affects us. For self-hosting, it would of course depend on how much complexity one would like to take on, and outsourcing the storage and query management is the better option in most cases.
We would not rule out such features, but we consider them nice-to-haves and they are very far down the priority list. At the moment we're mostly focused on improving integration with the Supabase products and platform. It is actually possible to log stack traces already, and this is supported out of the box for certain integrations such as the Logflare Logger Backend [2].
Postgres without any extensions is not optimized for columnar storage and would not give an optimal experience for storing and querying data at this scale. It is also not advisable to use the same production datastore for both your application and your observability; it is better to keep them separate to avoid coupling. If one really wants to use the same Postgres server for everything, there are extensions that allow Postgres to work as a columnar store, such as citus [3] and timescaledb [4], and we have not ruled out supporting these extensions as Logflare backends.