Apache Flink is also a good alternative and works very well. We have used it in production for a while to generate live reports. I wrote a simple example [1]; have a look at the docs [2] if you want more detail. I'm definitely going to try Kafka's version too; its approach to stream processing [3] looks interesting as well.
Here's a summary:
- KSQL has a completely interactive SQL interface, so you don't have to switch between DSL code and SQL.
- KSQL supports local, distributed and embedded modes. It is tightly integrated with Kafka's Streams API and Kafka itself and doesn't reinvent the wheel, so it is simple to use and deploy.
- KSQL doesn't have external dependencies for orchestration, deployment, etc.
- KSQL has native support for Kafka's exactly-once processing semantics and supports stream-table joins.
While Flink has in fact no direct SQL entry point right now (and many users simply wrap the API entry points themselves to form a SQL entry point), the other statements are actually not quite right.
- Flink as a whole (and SQL just sits on top of the DataStream API) works in local, distributed and embedded modes as well.
- Flink does not have any external dependencies, not even Kafka/ZooKeeper; it is self-contained. One can even just receive a data stream via a socket if that works for the use case.
- Flink itself has always had exactly-once semantics, and it also works exactly-once with Kafka.
@neha - where do you think Kafka is going to evolve in the world of data processing?
I'm very bullish on Kafka. Today we use Spark for batch data computation and have already switched some of our streaming stuff to Kafka.
Do you see yourselves entering the batch processing space anytime soon? Google has officially said that Flink is "compelling" because of its compatibility with the Beam model.
If I can step on thin ice... is it easier for Flink to commandeer Kafka or for Kafka to win over batch processing?
[1] https://medium.com/@mustafaakin/flink-streaming-sql-example-...
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/...
[3] http://docs.confluent.io/current/streams/index.html