(WarpStream co-founder here) We're not talking about no disks as in no storage, ...

solatic · 2024-04-29T15:57:58 1714406278

> This does have a latency trade-off

There are no silver bullets. Traditional S3, with the durability guarantees that S3 provides, has a latency trade-off because the data needs to be copied to additional availability zones before acknowledging the write. Once you collapse everything to a single availability zone (i.e. S3 Express One Zone), you have little reason not to use Kafka, which scales costs within a single AZ without a problem. At $0.16/GB, S3EOZ is about 7x more expensive than normal S3 ($0.023/GB) for fewer copies of the data/lower integrity guarantees, or about 60% more expensive than MSK or Kinesis Data Streams ($0.10/GB). If you write to a quorum of S3EOZ, then you're tripling your S3EOZ storage costs, to 0.16 * 3 = $0.48/GB. And this doesn't include the cost of compute!

Where's the value above just running Kafka within a single AZ, with no latency trade-off?

richieartoul · 2024-04-29T16:01:02 1714406462

(WarpStream co-founder here)

You don't have to keep the data stored in S3 express one zone forever, you can just land it there and then immediately compact it to S3 standard. You still pay the higher fee to write to S3EOZ, but not the higher storage fee.

WarpStream does this, data gets compacted out within seconds usually. Of course this is now... tiered storage. But implemented over two "infinitely scalable" remote storage systems so it gets rid of all the operational and scaling problems you have with a typical tiered storage Kafka setup that uses local volumes as the landing zone.

solatic · 2024-04-29T16:11:06 1714407066

> so it gets rid of all the operational and scaling problems you have with a typical tiered storage Kafka setup

Do these operational and scaling problems include AWS's managed services? MSK, Kinesis Data Streams?

At small scale, why wouldn't someone go with one of those? And at large scale, where's the Total Cost of Ownership comparison to show that it's worth it to ditch Kafka's local disks for a model built on object storage?

richieartoul · 2024-04-29T16:19:32 1714407572

Short answer is that MSK has (almost) all the same problems as OSS Kafka. Kinesis streams is a different beast that would require its own blog post.

RE:numbers: https://www.warpstream.com/blog/warpstream-benchmarks-and-tc...

richieartoul · 2024-04-29T16:05:30 1714406730

I talk about this more here: https://www.warpstream.com/blog/s3-express-is-all-you-need

RE: comparing to a single-zone Kafka cluster. A lot of people really dislike operating Kafka. Some people don't mind it and that's cool too, but its not the majority in my experience.

chenyang · 2024-04-30T11:10:02 1714475402

In addition to the high cost of S3Express, utilizing warpstream to write three replicas to S3Express and later compacting them to S3Standard could result in quadruple network/outbound traffic costs. With two consumer groups involved, this could increase to six times the network/outbound traffic.

Considering a c5.4xlarge instance with 16 cores and 32GB of memory, which offers a baseline bandwidth of only 5Gib, it's limited to a maximum production throughput of 100MiB/s.

Therefore, I have reservations about the cost-effectiveness of your low-latency solution, given these potential expenses.

wokwokwok · 2024-04-29T15:52:28 1714405948

I guess we’ll have to wait for a full write up of this, but it does seem like having multiple categories of object storage is pulls off hood tiered storage!

…rebranded with a different name, again.

Again complex, again no obvious way to query storage directly, again unclear performance characteristics, again no obvious reason to see why the networking costs make saving from it largely meaningless.

You have to admit it’s a bit of a hard sell without any comeback after literally just saying that people were just inventing new names for minor variations on tiered storage…

wanshao · 2024-04-30T11:16:59 1714475819

I agree with your viewpoint. The crux of the matter is not whether to use tiered storage or not, but what trade-offs have been made in the specific storage architecture and what benefits have been gained. Here(https://github.com/AutoMQ/automq?tab=readme-ov-file#-automq-...) is a qualitative comparison chart of streaming systems including kafka/confluent/redpanda/warpstream/automq. This comparison chart does not have specific numerical comparisons, but purely based on their trade-offs at the storage level, I think this will be of some use to you.

ryanworl · 2024-04-29T16:08:11 1714406891

We're still drafting our next post in this series, but the answer is actually very simple: two tiers of object storage do not have the same drawbacks as a combination of object storage and local disk. We wanted to explain that in this post too, but it would've been unreasonably long.

We've designed WarpStream to work extremely well on the slower, harder-to-use one first, and that is how 95+% of our workloads run in production. The tiered storage solutions from other streaming vendors do the opposite, where they were first designed for local SSDs and then bolted on object storage later.

The equivalent would be if we were pitching our support for an even slower, cheaper tier of object storage like AWS S3 Glacier.