After reading quite a few books and blog posts on event-driven architectures and comparing the suggested patterns with what I've seen myself in action, I keep wondering:
Is there any company out there that has fully embraced this type of architecture when it comes to microservice communication, handling breaking schema changes or failures in an elegant way, and keeping engineers and other data consumers happy enough?
Every event-driven architectural pattern I've read about can quite easily fall apart and I have yet to find satisfying answers on what to do when things go south. As a trivial example, everybody talks about dead-letter queues but nobody really explains how to handle messages that end up in one.
Is there any non-sales community of professionals discussing this topic?
Any help would be much appreciated.
Windows 95. Old-style GUI programming meant sitting in a loop, waiting for the next event, then handling it. You type a letter, there's a case switch, and the next character is rendered on the screen. Being able to copy a file and type at the same time was a big deal. You'd experience the equivalent of a dead-letter queue when you moved a window while the OS was busy handling a device: the repaint events were dropped, and the window would sort of smear across the screen.
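To make that loop concrete, here is a minimal sketch in Python rather than the Win32 API. The names `post` and `event_loop`, the `(kind, payload)` tuple shape, and the `"quit"` event are all made up for illustration. A bounded queue stands in for the OS message queue, so a full queue means an event gets dropped.

```python
import queue


def post(events: queue.Queue, event: tuple) -> bool:
    """Try to enqueue an event; return False if the queue is full and it is dropped."""
    try:
        events.put_nowait(event)
        return True
    except queue.Full:
        return False


def event_loop(events: queue.Queue, handlers: dict) -> list:
    """Handle events one at a time until a 'quit' event arrives."""
    log = []
    while True:
        kind, payload = events.get()  # blocks until the next event is available
        if kind == "quit":
            break
        handler = handlers.get(kind)  # the "case switch" on event type
        if handler is None:
            log.append(f"unhandled:{kind}")
            continue
        log.append(handler(payload))
    return log
```

Nothing else runs while a handler is busy, which is why a slow handler plus a full queue shows up to the user as lost repaints.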
Concurrent programming is hard. State isolation from microservices helps a lot, but eventually you'll need to share state. People try stuff like `add 1 to x`, but that has bugs, so they say, `if x == 7 add 1 to x`, but that has bugs too, so they say, `my vector clock looks like foo. if your vector clock matches, add 1 to x, and give me back your updated vector clock`. But now you've effectively imposed a total order on updates and given up a lot of performance.
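A rough sketch of that last step, assuming an in-memory store and a single integer version standing in for the vector clock (a real vector clock tracks one counter per writer). `VersionedCounter`, `compare_and_add`, and `increment_with_retry` are hypothetical names, not any library's API.

```python
import threading


class VersionedCounter:
    """In-memory stand-in for shared state; hypothetical, for illustration only."""

    def __init__(self, value: int = 0):
        self._lock = threading.Lock()
        self.value = value
        self.version = 0  # simplified stand-in for a vector clock

    def read(self) -> tuple:
        with self._lock:
            return self.value, self.version

    def compare_and_add(self, expected_version: int, delta: int) -> bool:
        """Apply delta only if nobody has written since expected_version."""
        with self._lock:
            if self.version != expected_version:
                return False  # stale view: caller must re-read and retry
            self.value += delta
            self.version += 1
            return True


def increment_with_retry(counter: VersionedCounter, max_attempts: int = 100) -> bool:
    """Read the current version, then try a conditional add; retry on conflict."""
    for _ in range(max_attempts):
        _, version = counter.read()
        if counter.compare_and_add(version, 1):
            return True
    return False
```

Every successful write invalidates every other writer's view, so under contention the writers end up taking turns. That is the performance cost mentioned above.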
I'm blind to the actual problem you're facing. My default recommendation is to have a monorepo, and break out helpers for expensive tasks, no different than spinning up a thread or process on a big host. Have a plan for building pipelines a->b->c->d. Also have a plan for fan-out: a->b & a->c & a->d.
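The two shapes can be sketched in-process like this. It is only an illustration: in a real system each stage would be its own service behind a queue, and the stage bodies here (`b`, `c`, `d` adding a flag to a dict) are invented placeholders.

```python
def pipeline(*stages):
    """a->b->c->d: each stage receives the previous stage's output."""
    def run(event):
        for stage in stages:
            event = stage(event)
        return event
    return run


def fan_out(*consumers):
    """a->b & a->c & a->d: every consumer gets its own copy of the same event."""
    def run(event):
        return [consumer(dict(event)) for consumer in consumers]
    return run


# Hypothetical stages; each one just tags the event it saw.
def b(event):
    return {**event, "validated": True}


def c(event):
    return {**event, "enriched": True}


def d(event):
    return {**event, "stored": True}


a_to_d = pipeline(b, c, d)   # a failure in b blocks c and d
a_to_all = fan_out(b, c, d)  # b, c, and d succeed or fail independently
```

The difference matters for recovery: in the pipeline a bad message stalls everything downstream, while in the fan-out each consumer needs its own retry and dead-letter handling.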
It has been widely observed that there are no silver bullets, but there are regular bullets. Careful and thoughtful use can be a huge win. If you're in that exponential growth phase, it's duct tape and baling wire all the way; get used to everything being broken all the time. If you're not, take your time and plan out a few steps ahead. Operationalize a few microservices. Get comfortable with the coordination and monitoring. Learn to recover gracefully, and hopefully find a way to dodge that problem next time around.
Sorry this is hand-wavy. I don't think you're missing anything. It's just hard. If you're stuck because it won't fit on 1X anymore, you've got to find a way to spread out the load.