The GitHub repo README gives a better sense of the capabilities of the pgstream system: https://github.com/xataio/pgstream?tab=readme-ov-file#archit...

For deployments at a more serious scale, it seems they support buffering WAL events into Kafka, similar to Debezium (the current leader for change data capture), to decouple the throughput of the replication slot reader on the Postgres side from the event handlers you deliver the events to.
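
To make the decoupling concrete, here’s a minimal Go sketch of the producer half of that pattern, my own illustration rather than pgstream’s actual code: the slot reader feeds a buffered channel, and a writer drains it into Kafka keyed by table so per-table ordering is preserved (the event shape and the pgstream.wal topic name are made up; the client is segmentio/kafka-go):

    // Illustration only: buffering WAL events into Kafka so the replication
    // slot reader isn't throttled by slow downstream handlers. The walEvent
    // shape and topic name are assumptions, not pgstream's actual types.
    package main

    import (
        "context"
        "encoding/json"
        "log"

        "github.com/segmentio/kafka-go"
    )

    type walEvent struct {
        LSN   string          `json:"lsn"`
        Table string          `json:"table"`
        Data  json.RawMessage `json:"data"`
    }

    func main() {
        w := &kafka.Writer{
            Addr:     kafka.TCP("localhost:9092"),
            Topic:    "pgstream.wal", // hypothetical topic
            Balancer: &kafka.Hash{},  // key by table -> per-table ordering
        }
        defer w.Close()

        events := make(chan walEvent, 1024) // filled by the slot reader (not shown)

        for ev := range events {
            payload, err := json.Marshal(ev)
            if err != nil {
                log.Fatalf("marshal: %v", err)
            }
            if err := w.WriteMessages(context.Background(), kafka.Message{
                Key:   []byte(ev.Table),
                Value: payload,
            }); err != nil {
                log.Fatalf("write to kafka: %v", err)
            }
        }
    }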

Pgstream seems more batteries-included than using Debezium to consume the PG log: Kafka is optional with pgstream, and things like webhook delivery and OpenSearch indexing are packaged in, rather than being a “choose your own adventure” game in the Kafka Streams middleware ecosystem jungle. If its constraints work for your use case, why not prefer it over Debezium, since it seems easier? I’d rather write Go than a Java/JVM language if I need to plug into the pipeline.

However, at any sort of serious scale you’ll need Kafka in there, and then I’m less sure you’d make use of the other plug-and-play stuff like the OpenSearch indexer or webhooks at all. Certainly as volume grows, webhooks start to feel like a bad fit for CDC events at the level of a single row change; at least in my brief Debezium CDC experience, my consumer pulls batches of 1000+ changes at once from Kafka.
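
For what it’s worth, the batch-pull shape looks roughly like this in Go (a sketch with segmentio/kafka-go; the topic, group and applyBatch are placeholders I made up):

    // Sketch: pulling CDC events from Kafka in batches instead of per-row
    // webhook deliveries. Topic and consumer group names are hypothetical.
    package main

    import (
        "context"
        "log"
        "time"

        "github.com/segmentio/kafka-go"
    )

    func main() {
        r := kafka.NewReader(kafka.ReaderConfig{
            Brokers: []string{"localhost:9092"},
            Topic:   "pgstream.wal",    // hypothetical
            GroupID: "downstream-sink", // hypothetical
        })
        defer r.Close()

        const batchSize = 1000
        for {
            // Collect up to batchSize messages, or whatever arrives within a second.
            batch := make([]kafka.Message, 0, batchSize)
            ctx, cancel := context.WithTimeout(context.Background(), time.Second)
            for len(batch) < batchSize {
                msg, err := r.FetchMessage(ctx)
                if err != nil {
                    break // deadline hit or reader closed
                }
                batch = append(batch, msg)
            }
            cancel()
            if len(batch) == 0 {
                continue
            }
            applyBatch(batch) // one bulk write to the downstream sink (not shown)
            if err := r.CommitMessages(context.Background(), batch...); err != nil {
                log.Fatalf("commit offsets: %v", err)
            }
        }
    }

    // applyBatch stands in for whatever bulk-write API the sink offers.
    func applyBatch(msgs []kafka.Message) {}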

The other thing I don’t see is transaction metadata. Maybe I shouldn’t worry much about it (not many people seem to be concerned), but I’d like my downstream consumer to have delayed consistency with my Postgres upstream, which means I need to consume record changes with the same transactional grouping as in Postgres; otherwise I’ll probably never be consistent in practice: https://www.scattered-thoughts.net/writing/internal-consiste...
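
What I’d want, roughly, is for the consumer to replay each upstream transaction as one downstream transaction. A hand-wavy Go sketch of the apply side (the change struct with pre-rendered SQL is a placeholder I made up, and it assumes the events already arrive grouped per upstream transaction):

    // Sketch: applying one upstream transaction's changes atomically downstream,
    // so the replica is always at some consistent transaction boundary.
    // The change type and its pre-rendered SQL field are made-up placeholders.
    package apply

    import "database/sql"

    type change struct {
        Kind   string // insert | update | delete
        Schema string
        Table  string
        SQL    string // pre-rendered statement for this change
        Args   []any
    }

    // applyTransaction writes all changes from one upstream transaction in a
    // single downstream transaction; either the whole group lands, or none of it.
    func applyTransaction(db *sql.DB, changes []change) error {
        tx, err := db.Begin()
        if err != nil {
            return err
        }
        for _, c := range changes {
            if _, err := tx.Exec(c.SQL, c.Args...); err != nil {
                tx.Rollback()
                return err
            }
        }
        return tx.Commit()
    }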




That's a great summary!

You're right: at scale, Kafka allows you to parallelise the workload more efficiently, as well as supporting multiple consumers for your replication slot. However, you should still be able to use the plug-and-play output processors with Kafka just as easily (the internal implementation is abstracted, so it works the same way with or without Kafka). We currently use the search indexer with Kafka at Xata, for example. As long as the webhook server supports rate limiting, it should be possible to handle a large volume of event notifications.
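
To make the rate-limiting point concrete, here’s a rough sketch of what a webhook receiver could look like with load shedding (not pgstream code; the route, limits and golang.org/x/time/rate choice are just for illustration):

    // Sketch: a webhook receiver that rate limits incoming CDC notifications,
    // returning 429 so the sender can back off and retry. Not pgstream code;
    // route and limits are arbitrary.
    package main

    import (
        "io"
        "log"
        "net/http"

        "golang.org/x/time/rate"
    )

    func main() {
        // Allow ~500 notifications/second with bursts of up to 1000.
        limiter := rate.NewLimiter(rate.Limit(500), 1000)

        http.HandleFunc("/webhooks/cdc", func(w http.ResponseWriter, r *http.Request) {
            if !limiter.Allow() {
                w.Header().Set("Retry-After", "1")
                http.Error(w, "rate limited", http.StatusTooManyRequests)
                return
            }
            body, err := io.ReadAll(r.Body)
            if err != nil {
                http.Error(w, "bad request", http.StatusBadRequest)
                return
            }
            handleEvent(body) // enqueue for processing (not shown)
            w.WriteHeader(http.StatusOK)
        })

        log.Fatal(http.ListenAndServe(":8080", nil))
    }

    func handleEvent(payload []byte) {}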

Regarding the transaction metadata: the current implementation uses the wal2json Postgres output plugin (https://github.com/eulerto/wal2json), and we don't include the transactional metadata. However, this is something that would be easy to enable and integrate into the pgstream pipeline if needed, to ensure transactional consistency on the consumer side.
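
For reference, if memory serves on the wal2json v1 format, enabling include-xids makes each transaction arrive as a single JSON document with the xid and its full list of changes, so a consumer could recover the grouping with something like this rough, untested Go sketch:

    // Rough sketch: decoding a wal2json (format v1) transaction document.
    // With include-xids enabled, each document carries the xid and the full
    // list of changes for that transaction, so the consumer can apply them
    // as one unit. Untested; field names follow the wal2json README.
    package main

    import (
        "encoding/json"
        "fmt"
        "log"
    )

    type wal2jsonTx struct {
        Xid    int64            `json:"xid"`
        Change []wal2jsonChange `json:"change"`
    }

    type wal2jsonChange struct {
        Kind         string   `json:"kind"` // insert | update | delete
        Schema       string   `json:"schema"`
        Table        string   `json:"table"`
        ColumnNames  []string `json:"columnnames"`
        ColumnValues []any    `json:"columnvalues"`
    }

    func main() {
        raw := []byte(`{"xid":12345,"change":[{"kind":"insert","schema":"public","table":"users","columnnames":["id","name"],"columnvalues":[1,"alice"]}]}`)

        var tx wal2jsonTx
        if err := json.Unmarshal(raw, &tx); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("xid=%d changes=%d\n", tx.Xid, len(tx.Change))
        for _, c := range tx.Change {
            fmt.Printf("  %s %s.%s %v\n", c.Kind, c.Schema, c.Table, c.ColumnValues)
        }
    }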

Thanks for your feedback!



