Hacker News

I'm Ryan Worl, co-founder and CTO of WarpStream. We're super excited to announce the Developer Preview of our Kafka-protocol-compatible streaming system built directly on top of S3, with no stateful disks/nodes to run, no data rebalancing, no ZooKeeper, and 5-10x lower cost because there are no cross-AZ bandwidth charges.

If you have any questions about WarpStream, my co-founder (richieartoul) and I will be here to answer them.




Congrats! "The SQLite of Kafka" is an item from my side projects pile I'm happy to delete.

One reason I never built it is that it felt paradoxical that users might want a scaled-down Kafka rather than using SQLite directly if the scale didn't matter. But you may find that people enjoy the semantics of the Kafka protocol, or are already using Kafka and have learned they don't have the scale they thought they did to warrant the complexity. Best of luck!


> it felt paradoxical that users might want a scaled down Kafka rather than using SQLite directly if the scale didn't matter.

I don't need to push very many messages (not enough to justify running Kafka), but each of the messages that I do push is both 1. very important and must be cross-AZ durable, and 2. very urgent and must not be blocked by e.g. contended writes in a regular RDBMS.

Currently, the winner of this use-case for IaaS customers is "whatever cloud-native message-queue service your IaaS offers." (And those customers would also be the extent of WarpStream's Total Addressable Market here, given that WarpStream's architecture fundamentally relies on having a highly-replicated managed object store available.)

I'm therefore curious: in what ways does WarpStream win vs. Amazon SQS / Google Cloud Pub/Sub / Azure Queue Storage?


I can't speak to GCP or Azure, but the semantics of a log offer replayability, whereas SQS does not.
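The distinction can be sketched with a toy in-memory model (illustrative Python only, not the WarpStream or SQS APIs): a log keeps every record at a fixed offset, so a consumer can re-read from any point, while a queue-style service removes a message once it has been consumed.

```python
from collections import deque


class Log:
    """Append-only log: records stay put; consumers track an offset."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def read_from(self, offset):
        return self.records[offset:]  # replay is just re-reading


class Queue:
    """Queue-style semantics: receiving and deleting a message removes it."""

    def __init__(self):
        self.messages = deque()

    def send(self, message):
        self.messages.append(message)

    def receive(self):
        return self.messages.popleft() if self.messages else None


log = Log()
for m in ("a", "b", "c"):
    log.append(m)
assert log.read_from(0) == ["a", "b", "c"]
assert log.read_from(0) == ["a", "b", "c"]  # replay: same records again

q = Queue()
for m in ("a", "b", "c"):
    q.send(m)
assert q.receive() == "a"  # gone once consumed; no going back
```

Replay is what lets a log-based consumer recover from bugs by rewinding, which a delete-on-consume queue can't offer.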


Does it support S3-compatible services, notably Cloudflare R2? I've heard that each S3-compatible provider might need special handling, due to slightly different API behavior, consistency models, etc.

If it supports Cloudflare R2, it would be great for multi-cloud too.


The blog post mentions that partitions are too low-level an abstraction to program against. Does that mean WarpStream doesn't use partitions?

Do you provide any ordering guarantees like Kafka does at the partition level?


(WarpStream founder) No, WarpStream has partitions internally and provides the same ordering guarantees Kafka does at the partition level. We're just saying that we think this is not a great programming model for most streaming applications, and we think there is an opportunity to do something better (but we haven't done that yet).
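The partition-level ordering contract being discussed can be illustrated with a toy partitioner (a stand-in hash, not Kafka's actual murmur2 partitioner): records with the same key land in the same partition, so they keep their relative order, while there is no ordering guarantee across partitions.

```python
NUM_PARTITIONS = 3


def partition_for(key: str) -> int:
    # Toy stand-in for Kafka's key -> partition mapping
    # (the real client uses murmur2(key) % num_partitions).
    return sum(key.encode()) % NUM_PARTITIONS


# Each partition is an ordered list; appends preserve arrival order.
partitions = {p: [] for p in range(NUM_PARTITIONS)}

events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-1", "logout"),
]
for key, value in events:
    partitions[partition_for(key)].append((key, value))

# All of user-1's events share one partition, so their order survives.
p = partition_for("user-1")
user1 = [v for k, v in partitions[p] if k == "user-1"]
assert user1 == ["login", "click", "logout"]
```

This is also why partitions are a leaky abstraction for application code: ordering depends entirely on how you choose keys and how many partitions exist.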


Do you still provide low-level control over partition subscriptions and offset management? Any plan to support Kafka transactions?

That's all required to build exactly-once systems on top of Kafka (like the stateful stream processing engine I work on) even if it's not the easiest interface for normal application-level development.
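The core pattern such systems rely on can be sketched in a few lines (illustrative, in-memory; real engines do this with Kafka transactions or by storing offsets in an external transactional store): commit the consumer offset together with the processed output, so a redelivered batch after a crash is recognized and skipped rather than processed twice.

```python
records = ["r0", "r1", "r2", "r3"]

# "State store" holding output and last-committed offset as one unit.
# The exactly-once invariant is that these two only change together;
# here a plain dict stands in for a transactional write.
state = {"output": [], "offset": 0}


def process_batch(state, records, batch_size=2):
    start = state["offset"]
    batch = records[start:start + batch_size]
    if not batch:
        return  # redelivery past the committed offset: nothing to do
    # Commit output + new offset together (sketch of an atomic commit).
    state["output"] = state["output"] + [r.upper() for r in batch]
    state["offset"] = start + len(batch)


process_batch(state, records)
process_batch(state, records)  # second batch
process_batch(state, records)  # replayed call is a no-op: offset at end
assert state["output"] == ["R0", "R1", "R2", "R3"]
assert state["offset"] == 4
```

Without low-level offset control, the consumer can't tie "where I am in the log" to "what I've already emitted," which is why auto-commit alone can't give exactly-once.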


[WarpStream CTO here]

WarpStream is Kafka protocol compatible, so we do support topic-partitions and consumer groups. We do not expose support for transactions or idempotent producing today, but the internals of the system support that and we will probably work on the idempotent producer sometime in the next month, with transactions coming shortly after, depending on demand from the Developer Preview users.


1. Don't producers now have much higher latency, since they have to wait for writes to S3?

2. If the "5-10x cheaper" is mostly due to cross-AZ savings, isn't that offered by AWS's MSK offering too?


(WarpStream founder)

1. Yeah, we mention at the end of the post that the P99 produce latency is ~400ms.

2. MSK still charges you for networking to produce into the cluster and consume out of it if follower fetch is not properly configured. Also, you still have to more or less manage a Kafka cluster (hot spotting, partition rebalancing, etc.). In practice we think WarpStream will be much cheaper to use than MSK for almost all use-cases, and significantly easier to manage.


How does the cost compare if follower fetch is properly configured?


1. What payload size and flush interval is that latency measured against?


1. By payload size do you mean record size? They're ~1KiB.

2. Flush interval was 100ms; the agent defaults to 250ms though, I believe.


How do you replace ZooKeeper?


Kafka replaced ZooKeeper with Kafka itself a few years ago: https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A...


And first announced to be "production ready" in October 2022.


(WarpStream founder) WarpStream has a completely different architecture than Kafka: https://docs.warpstream.com/warpstream/background-informatio...

That said, it does require a lot of metadata to orchestrate all the different concurrent operations over S3. We handle this with a custom metadata store that we run in our cloud control plane.


What was used to generate the diagrams in the post?


Do you have a reference documentation for S3 data layout?


(WarpStream founder here) Not currently. One of the things we're looking to do next is make it so any topic can be "automatically" turned into a standard format in S3, something like Parquet/Iceberg/Delta Lake, so it's easier to consume for applications that don't particularly care about the Kafka protocol.


That would be a killer feature. Once we switched to MSK back at a previous gig, our biggest care-and-feeding task was adding new event types to our Kafka-to-Redshift ETL thing. Well, that and dealing with scaling our HTTP wrapper...


That would be awesome!


Kafka itself no longer requires zookeeper.





