There are no silver bullets. Traditional S3, with the durability guarantees that S3 provides, has a latency trade-off because the data needs to be copied to additional availability zones before acknowledging the write. Once you collapse everything to a single availability zone (i.e. S3 Express One Zone), you have little reason not to use Kafka, which scales costs within a single AZ without a problem. At $0.16/GB, S3EOZ is about 7x more expensive than normal S3 ($0.023/GB) for fewer copies of the data/lower integrity guarantees, or about 60% more expensive than MSK or Kinesis Data Streams ($0.10/GB). If you write to a quorum of S3EOZ, then you're tripling your S3EOZ storage costs, to 0.16 * 3 = $0.48/GB. And this doesn't include the cost of compute!
Where's the value above just running Kafka within a single AZ, with no latency trade-off?
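For anyone who wants to check the arithmetic above, here is a quick sketch in Python. The prices are the per-GB figures quoted in the comment, not values pulled from a current AWS price list, and the variable names are purely illustrative. Compute and request fees are left out, as they are in the comment.

```python
# Per-GB storage prices in USD, as quoted in the comment above.
# These are the thread's figures, not an authoritative price list.
S3_STANDARD = 0.023
S3_EXPRESS_ONE_ZONE = 0.16
MSK_OR_KINESIS = 0.10

# How many times more expensive S3EOZ storage is than S3 Standard.
express_vs_standard = S3_EXPRESS_ONE_ZONE / S3_STANDARD

# Percentage premium of S3EOZ over MSK / Kinesis Data Streams.
express_vs_msk_pct = (S3_EXPRESS_ONE_ZONE / MSK_OR_KINESIS - 1) * 100

# Writing to a quorum of three S3EOZ locations triples the storage cost.
QUORUM_SIZE = 3
quorum_cost = S3_EXPRESS_ONE_ZONE * QUORUM_SIZE

print(f"S3EOZ vs S3 Standard: {express_vs_standard:.2f}x")
print(f"S3EOZ vs MSK/Kinesis: {express_vs_msk_pct:.0f}% more")
print(f"Quorum of {QUORUM_SIZE} on S3EOZ: ${quorum_cost:.2f}/GB")
```

The ratios work out to roughly 7x, 60%, and $0.48/GB, matching the numbers in the comment.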
You don't have to keep the data stored in S3 Express One Zone forever; you can just land it there and then immediately compact it to S3 Standard. You still pay the higher fee to write to S3EOZ, but not the higher storage fee.
WarpStream does this; data usually gets compacted out within seconds. Of course this is now... tiered storage. But it's implemented over two "infinitely scalable" remote storage systems, so it gets rid of all the operational and scaling problems you have with a typical tiered storage Kafka setup that uses local volumes as the landing zone.
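To put a rough number on the "land it, then compact it" point, here is a back-of-the-envelope sketch in Python. It assumes the prices quoted upthread are per GB-month and that storage is billed pro rata by time. Both are assumptions, since the thread only says "$/GB". The function name, the 10-second dwell time, and the 7-day retention are hypothetical, and the higher S3EOZ write fee is not modeled at all.

```python
# ASSUMPTION: the thread's prices are per GB-month, billed pro rata by time.
SECONDS_PER_MONTH = 30 * 24 * 3600


def blended_storage_cost_per_gb(
    dwell_seconds: float,
    retention_seconds: float,
    landing_price: float = 0.16,    # S3EOZ figure quoted upthread
    standard_price: float = 0.023,  # S3 Standard figure quoted upthread
) -> float:
    """Storage cost (USD) for one GB kept for retention_seconds in total,
    of which the first dwell_seconds are spent in the landing tier.

    Write/request fees are deliberately not included.
    """
    dwell = min(dwell_seconds, retention_seconds)
    landing = landing_price * dwell / SECONDS_PER_MONTH
    standard = standard_price * (retention_seconds - dwell) / SECONDS_PER_MONTH
    return landing + standard


# Hypothetical workload: 7-day retention, compacted out after 10 seconds.
week = 7 * 24 * 3600
tiered = blended_storage_cost_per_gb(dwell_seconds=10, retention_seconds=week)
all_express = blended_storage_cost_per_gb(dwell_seconds=week, retention_seconds=week)

print(f"tiered: ${tiered:.6f}/GB, all S3EOZ: ${all_express:.6f}/GB")
```

Under those assumptions the landing tier contributes almost nothing to the storage bill, so the cost that matters is the write fee, which this sketch leaves out.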
> so it gets rid of all the operational and scaling problems you have with a typical tiered storage Kafka setup
Do these operational and scaling problems include AWS's managed services? MSK, Kinesis Data Streams?
At small scale, why wouldn't someone go with one of those? And at large scale, where's the Total Cost of Ownership comparison to show that it's worth it to ditch Kafka's local disks for a model built on object storage?
RE: comparing to a single-zone Kafka cluster. A lot of people really dislike operating Kafka. Some people don't mind it and that's cool too, but they're not the majority in my experience.