
> This does have a latency trade-off

There are no silver bullets. Traditional S3, with the durability guarantees that S3 provides, has a latency trade-off because the data needs to be copied to additional availability zones before acknowledging the write. Once you collapse everything to a single availability zone (i.e. S3 Express One Zone), you have little reason not to use Kafka, which scales costs within a single AZ without a problem. At $0.16/GB, S3EOZ is about 7x more expensive than normal S3 ($0.023/GB) for fewer copies of the data/lower integrity guarantees, or about 60% more expensive than MSK or Kinesis Data Streams ($0.10/GB). If you write to a quorum of S3EOZ, then you're tripling your S3EOZ storage costs, to 0.16 * 3 = $0.48/GB. And this doesn't include the cost of compute!
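As a quick sanity check of the arithmetic above, here is a minimal sketch that uses only the prices quoted in this comment. The variable names are made up for illustration, and the prices are assumed to be per-GB-month storage list prices, which the comment does not state explicitly.

```python
# Prices as quoted in the comment (assumed to be USD per GB-month of storage).
S3_STANDARD = 0.023
S3_EXPRESS_ONE_ZONE = 0.16
MSK_OR_KINESIS = 0.10

# How many times more expensive S3EOZ is than normal S3.
express_vs_standard = S3_EXPRESS_ONE_ZONE / S3_STANDARD

# Fractional premium of S3EOZ over MSK / Kinesis Data Streams.
express_premium_over_msk = S3_EXPRESS_ONE_ZONE / MSK_OR_KINESIS - 1

# Writing every byte to a quorum of three S3EOZ copies triples the storage price.
quorum_price = 3 * S3_EXPRESS_ONE_ZONE

print(f"S3EOZ vs S3 standard: {express_vs_standard:.1f}x")
print(f"S3EOZ premium over MSK/Kinesis: {express_premium_over_msk:.0%}")
print(f"Quorum of 3 S3EOZ copies: ${quorum_price:.2f}/GB")
```

None of this includes compute, which the comment also points out.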

Where's the value above just running Kafka within a single AZ, with no latency trade-off?




(WarpStream co-founder here)

You don't have to keep the data stored in S3 Express One Zone forever; you can just land it there and then immediately compact it to S3 standard. You still pay the higher fee to write to S3EOZ, but not the higher storage fee.

WarpStream does this; data usually gets compacted out within seconds. Of course this is now... tiered storage. But it's implemented over two "infinitely scalable" remote storage systems, so it gets rid of all the operational and scaling problems you have with a typical tiered storage Kafka setup that uses local volumes as the landing zone.
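To illustrate why landing data in S3EOZ for only a few seconds barely moves the storage bill, here is a rough sketch of a time-weighted storage price. It is not WarpStream's actual cost model: the function name is hypothetical, the two prices are the ones quoted upthread (assumed to be per GB-month), a 30-day billing month is assumed, and the higher S3EOZ write fees mentioned above are deliberately left out.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day billing month

# Storage prices quoted upthread (assumed USD per GB-month).
S3_STANDARD = 0.023
S3_EXPRESS_ONE_ZONE = 0.16


def blended_storage_price(landing_seconds: float) -> float:
    """Time-weighted storage price per GB-month when data sits in S3EOZ for
    `landing_seconds` and in S3 standard for the rest of the month.
    Request/write fees are NOT included."""
    fraction_in_express = min(landing_seconds / SECONDS_PER_MONTH, 1.0)
    return (fraction_in_express * S3_EXPRESS_ONE_ZONE
            + (1.0 - fraction_in_express) * S3_STANDARD)


# Compare a few landing durations: seconds, an hour, and the whole month.
for seconds in (10, 3600, SECONDS_PER_MONTH):
    print(f"{seconds:>8} s in S3EOZ -> ${blended_storage_price(seconds):.6f}/GB-month")
```

Under these assumptions, data that is compacted out within seconds is stored at essentially the S3 standard price; the remaining premium is in the write fees.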


> so it gets rid of all the operational and scaling problems you have with a typical tiered storage Kafka setup

Do these operational and scaling problems include AWS's managed services? MSK, Kinesis Data Streams?

At small scale, why wouldn't someone go with one of those? And at large scale, where's the Total Cost of Ownership comparison to show that it's worth it to ditch Kafka's local disks for a model built on object storage?


Short answer is that MSK has (almost) all the same problems as OSS Kafka. Kinesis streams is a different beast that would require its own blog post.

RE: numbers: https://www.warpstream.com/blog/warpstream-benchmarks-and-tc...


I talk about this more here: https://www.warpstream.com/blog/s3-express-is-all-you-need

RE: comparing to a single-zone Kafka cluster. A lot of people really dislike operating Kafka. Some people don't mind it and that's cool too, but they're not the majority in my experience.



