Don't get me wrong, sometimes it's worth it (I particularly like Spark's facilities for distributed statistical modelling), but I really don't get (and have never gotten) why you would want to inflict that pain upon yourself if you don't have to.
I’ve been developing for more years than some of you have lived, and the best thing I’ve heard in years is that Google interviews now require developers to understand the overhead of requests.
In addition, they should require an understanding of the design complexity of asynchronous queues: needing (and suffering from) the management overhead of dead letter queues, scaling by sharding queues when that makes more sense versus decentralizing, and being forced into non-transactionality, which shouldn't be accepted unless it's absolutely needed.
But not just Google: everyone. Thanks, Mr. Fowler, for bringing this into the open.
Indeed! The "Latency Numbers Every Programmer Should Know" table (based on Peter Norvig's numbers) builds helpful intuition from a performance perspective, but of course there's a much larger cost in terms of complexity as well.
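To make the request-overhead point concrete, here's a rough, stdlib-only Python microbenchmark comparing an in-process function call with a loopback HTTP round trip. The absolute numbers are machine-dependent, and loopback skips real network hops entirely, so treat the gap as a lower bound:

    import threading
    import time
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def local_call(x):
        return x + 1

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Length", "2")
            self.end_headers()
            self.wfile.write(b"ok")

        def log_message(self, *args):  # keep the benchmark output clean
            pass

    # Spin up a throwaway HTTP server on an ephemeral loopback port.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"

    N = 1000
    t0 = time.perf_counter()
    for i in range(N):
        local_call(i)
    t1 = time.perf_counter()
    for _ in range(N):
        urllib.request.urlopen(url).read()
    t2 = time.perf_counter()

    print(f"in-process call: {(t1 - t0) / N * 1e9:9.0f} ns")
    print(f"loopback HTTP:   {(t2 - t1) / N * 1e9:9.0f} ns")
    server.shutdown()

Even over loopback, the HTTP call typically lands three to four orders of magnitude slower than the function call; a real cross-host hop adds far more on top.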
I mean, you can always deploy your microservices on the same host; it would just be a service mesh.
Adding a network is not a limitation. And frankly, I don't understand why you say things like "understanding the network". Reliability is taken care of, routing is taken care of, and the remaining problems of unboundedness and causal ordering are taken care of by various frameworks and protocols.
For DLQ management, you can simply use a persistent dead letter queue. I mean, it's a good thing to have a DLQ, because failures will always happen. As for which order to process the queue in, etc., these are trivial questions.
You say things as if you have been doing software development for ages, but you're missing out on some very simple things.
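For what it's worth, here's a minimal, self-contained sketch of the retry-then-DLQ pattern using in-memory queues; in production both queues would live in a durable broker (SQS, RabbitMQ, etc.), but the control flow is the same. The names and the failure condition are invented for illustration:

    import queue

    MAX_ATTEMPTS = 3
    main_q = queue.Queue()         # holds (attempts, payload) tuples
    dead_letter_q = queue.Queue()  # durable in real life; in-memory here

    def handle(payload):
        if "bad" in payload:       # stand-in for a real processing failure
            raise ValueError(f"cannot process {payload!r}")

    def drain():
        while not main_q.empty():
            attempts, payload = main_q.get()
            try:
                handle(payload)
            except Exception:
                attempts += 1
                if attempts >= MAX_ATTEMPTS:
                    dead_letter_q.put((attempts, payload))  # park for inspection
                else:
                    main_q.put((attempts, payload))         # retry later
            finally:
                main_q.task_done()

    for p in ["ok-1", "bad-2", "ok-3"]:
        main_q.put((0, p))
    drain()
    print("dead-lettered:", list(dead_letter_q.queue))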
Sounds like you're saying "Don't do distributed work if possible" (considering tradeoffs, of course; your contention, I guess, is that people don't even consider this option).
And secondly, if you do end up with a distributed system, remember how many independently failing components there are, because that directly translates to complexity.
On both these counts I agree. Microservices are no silver bullet. Network partitions and failures happen almost every day where I work. But most people are not dealing with problems at that level, partly because of cloud providers.
The same kinds of problems show up on a single machine too: you'd need some sort of write-ahead log, checkpointing, and maybe tuning of the kernel for faster boot-up, heap size, and GC rate.
All of these problems do happen, but most people don't need to think about them.
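To make the write-ahead-log point concrete, here's a minimal single-machine sketch: append and fsync the intent first, apply it to in-memory state second, and replay the log on startup. The file format and names are made up for the example:

    import json
    import os

    LOG = "wal.log"  # invented file name

    def apply(state, op):
        state[op["key"]] = op["value"]

    def write(state, key, value):
        record = json.dumps({"key": key, "value": value})
        with open(LOG, "a") as f:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())   # durable before we acknowledge the write
        apply(state, json.loads(record))

    def recover():
        state = {}
        if os.path.exists(LOG):
            with open(LOG) as f:
                for line in f:
                    apply(state, json.loads(line))  # replay after a crash
        return state

    state = recover()
    write(state, "user:1", "alice")
    print(state)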
I'm not reading this as "Don't do distributed work". It's "distributed systems have nontrivial hidden costs". Sure, monoliths are often synonymous with single points of failure. In theory, distributed systems are built to mitigate this. But unfortunately, in reality, distributed systems often introduce many additional single points of failure, because building resilient systems takes extra effort, effort that oftentimes is a secondary priority to "just ship it".
Indeed. So with a monolith we usually already have 3-4 (or more) somewhat reliable systems, and one non-reliable system, which is your monolithic app. Why add other non-reliable systems if you don't really need them?
Making a system reliable is really, really hard and takes many resources, which few companies pursue.
I realized this one day when I was drawing some nice sequence diagrams and presenting them to a senior, and he said, "But who's ensuring the sequence?" You'd never ask that question in a single-threaded system.
Having said that, these things are unavoidable. The expectations placed on a system are too great not to have distributed systems in the picture.
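A toy illustration of the "who's ensuring the sequence?" question: in a single-threaded program, step 2 simply can't start before step 1 returns, but once calls go over a network, completion order is up to the network unless you enforce it yourself. The service names and random sleeps below are stand-ins for real calls:

    import asyncio
    import random

    async def call_service(name):
        await asyncio.sleep(random.random())  # simulated network latency
        return name

    async def main():
        tasks = [asyncio.create_task(call_service(n)) for n in ("step-1", "step-2")]
        for done in asyncio.as_completed(tasks):
            print("finished:", await done)    # completion order is nondeterministic

    asyncio.run(main())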
Monoliths are so hard to deploy. It's even more problematic when you have code optimized for both synchronous CPU-intensive work and async IO in the same service. Figuring out the optimal fleet size is also harder.
I'd love to hear some ways to address this without ending up with microservice bloat.
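One common mitigation, sketched here under the assumption of an async Python stack (not necessarily what your service looks like): keep the event loop for IO and push CPU-bound work into a process pool, so the two workloads can be sized somewhat independently without splitting the service:

    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    def crunch(n):
        return sum(i * i for i in range(n))  # stand-in for CPU-heavy work

    async def main():
        loop = asyncio.get_running_loop()
        with ProcessPoolExecutor(max_workers=4) as pool:
            # The event loop stays free for IO while the workers burn CPU.
            result = await loop.run_in_executor(pool, crunch, 10_000_000)
            print(result)

    if __name__ == "__main__":  # required for process pools on spawn platforms
        asyncio.run(main())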