It is much harder to enforce the discipline of those practices across modules bo...

disgruntledphd2 · on Feb 19, 2021

But the network is so slooooooowwwwwwwww.

Don't get me wrong, sometimes it's worth it (I particularly like Spark's facilities for distributed statistical modelling), but I really don't get (and have never gotten) why you would want to inflict that pain upon yourself if you don't have to.

5600k · on Feb 19, 2021

This. One million times this.

I’ve been developing for more years than dime if you have lived, and the best thing I’ve heard in years was that Google interviews were requiring developers to understand the overhead of requests.

In addition, they should require understanding of design complexity of asynchronous queues, needing and suffering from management overhead of dead letter, scaling by sharding queues if it makes more sense vs decentralizing and having to have non-transactionality unless it’s absolutely needed.

But not just Google- everyone. Thanks, Mr. Fowler for bringing this into the open.

Yeroc · on Feb 19, 2021

Indeed! The "Latency Numbers Every Programmer Should Know" page from Peter Norvig builds a helpful intuition from a performance perspective but of course there's a lot larger cost in terms of complexity as well.

bulleyashah · on Feb 19, 2021

I mean, you can always deploy your microsevices on the same host, it would just be a service mesh.

Adding network is not a limitation. And frankly, I don't understand why you say things like understanding network. Like reliability is taken care of, routing is taken care of. The remaining problems of unboundedness and causal ordering are taken care of (by various frameworks and protocols).

For dlq management, you can simply use a persistent dead letter queue. I mean it's a good thing to have dlq because failures will always happen. About which order to procese queue etc. These are trivial questions.

You say things as if you have been doing software development for ages, but you're missing out on some very simple things.

danielheath · on Feb 19, 2021

That’s because those things are only simple when they’re working.

Each of them introduces a new, rare failure mode; because there are now many rare failure modes, you have frequent-yet-inexplicable errors.

Dead letter queues back up and run out of space; network splits still happen;

bulleyashah · on Feb 19, 2021

Sounds like you're saying "Don't do distributed work" if possible (considering tradeoffs of course, but I guess people just don't even consider this option is your contention).

And secondly, if you do end up with q distributed systems, remember how many independently failing components there are because thag directly translates to complexity.

On both these counts I agree. Microservices is no silver bullet. Network partitions and failure happen almost every day where I work. But most people are not dealing with that level of problems, partly because of cloud providers.

Same kind of problems will be found on a single machine also. Like you'd need some sort of write ahead log, checkpointing, maybe optimize your kernel for faster boot up, heap size and gc rate.

All of these problems do happen, but most people don't need to think about it.

LargeWu · on Feb 19, 2021

I'm not reading this as "Don't do distributed work". It's "distributed systems have nontrivial hidden costs". Sure, monoliths are often synonymous with single points of failure. In theory, distributed systems are built to mitigate this. But unfortunately, in reality, distributed systems often introduce many additional single points of failure, because building resilient systems takes extra effort, effort that oftentimes is a secondary priority to "just ship it".

spion · on Feb 19, 2021

If you are using a database, a server and clients, you already have a distributed system.

You'll also likely use multiple databases (caching in e. g. Redis) and a job queue for longer tasks.

You'll also probably already have multiple instances talking to the databases, as well as multiple workers processing jobs.

Pretending that the monolith is a single thing is sneakily misleading. It's already a distributed system

fendy3002 · on Feb 20, 2021

Indeed. So with monolith usually we already have 3-4 (or more) somewhat reliable systems, and one non-reliable system which is your monolithic app. Why add other non reliable systems if you don't really need it?

Making a system to be reliable is really really hard and take many resources, which seldom companies pursuit.

spion · on Feb 20, 2021

just because all the code is in one place doesn't mean its one system.

TickleSteve · on Feb 19, 2021

Compare the overheads and failure modes of a request to that of a method call, thats the comparison.

Requests can fail in a host of ways that a call simply cannot, the complexity is massively greater than a method call.

bulleyashah · on Feb 19, 2021

+1

I realized this one day when I was drawing some nice sequence diagrams and presenting it to a senior and he said "But who's ensuring the sequence?". You'll never ask this question in a single threaded system.

Having said that, these things are unavoidable. The expectations from a system are too great to not have distributed systems in picture.

Monoliths are so hard to deploy. It's even more problematic when you have code optimized for both sync cpu intensive stuff and async io in the same service. Figuring out the optimal fleet size is also harder.

I'd love to hear some ways to address this issue and also not to have microservice bloat.

lantastic · on Feb 19, 2021

I heartily agree. Don't treat yourself like a 3rd party.

markburns · on Feb 19, 2021

What you tend to see, is the difficulty of crossing these network boundaries (which happens anyway) _and_ all the coupling and complexity of a distributed ball of mud.

Getting engineers who don't intuitively understand or maybe even care how to avoid coupling in monoliths to work on a distributed application can result in all the same class of problems plus chains of network calls and all the extra overhead you should be avoiding.

It seems like you tell people to respect the boundaries, and if that fails you can make the wall difficult to climb. The group of people that respect the boundaries whether virtual or not, will continue to respect the boundaries. The others will spend huge amounts of effort and energy getting really good at finding gaps in the wall and/or really good at climbing.

bosswipe · on Feb 19, 2021

If you're using microservice to enforce stronger interface boundaries what you're really relying on are separate git repos with separate access control to make it difficult to refactor across codebases. A much simpler way to achieve that same benefit is to create libraries developed in separate repos.

wg0 · on Feb 19, 2021

Wow. That's an interesting take on the problem of scaling a monolith without introducing the network between the system interactions.

marcosdumay · on Feb 19, 2021

Hum... People have been doing it since the 60's, when it became usual to mail tapes around.

If you take a look at dependency management at open source software, you'll see a mostly unified procedure, that scales to an "entirety of mankind" sized team without working too badly for single developers, so it can handle your team size too.

That problem that the "microservices bring better architectures" people are trying to solve isn't open by any measure. It was patently solved, decades ago, with stuff that work much better than microservices, in a way that is known to a large chunk of the developers and openly published all over the internet for anybody that wants to read about.

Microservices still have their use. It's just that "it makes people write better code" isn't true.

enigmo · on Feb 19, 2021

it's also possible to have more than one microservice in a single repo. defining good interfaces is a problem unsolved by repo size and count.

wiremine · on Feb 19, 2021

> So in theory yes you could have a well modularized monolith.

I've often wondered if this is a pattern sitting underneath our noses. I.e., Starting with a monolith with strong boundaries, and giving architects/developers a way to more gracefully break apart the monolith. Today it feels very manual, but it doesn't need to be.

What if we had frameworks that more gracefully scaled from monoliths to distributed systems? If we baked something like GRPC into the system from the beginning, we could more gracefully break the monolith apart. And the "seams" would be more apparent inside the monolith because the GRPC-style calls would be explicit.

(Please don't get too hung up on GRPC, I'm thinking it could be any number of methods; it's more about the pattern than the tooling).

The advantages to this style would be:

* Seeing the explicit boundaries, or potential boundaries, sooner.

* Faster refactoring: it's MUCH easier to refactor a monolith than refactor a distributed architecture.

* Simulating network overhead. For production, the intra-boundary calls would just feel like function calls, but in a develop or testing environment, could you simulate network conditions: lag, failures, etc.

I'm wondering if anything like this exists today?

jblwps · on Feb 19, 2021

Aren't you just describing traditional RPC calls? Many tools for this: DBus on Linux, Microsoft RCP on Windows, and more that I'm not aware of.

If you've only got a basic IPC system (say, Unix domain sockets), then you could stream a standard seriaization format across them (MessagePack, Protobuf, etc.).

To your idea of gracefully moving to network-distributed system: If nothing else, couldn't you just actually start with gRPC and connect to localhost?

Is there something I'm missing?

pjmlp · on Feb 20, 2021

You are missing the network.

When you start with gRPC and connect to localhost, usually the worst that can happen with a RPC call is that the process crashes, and your RPC call eventually times out.

But other than that everything else seems to work as a local function call.

Now when you move the server into another computer, maybe it didn't crash, it was just a network hiccup and now you are getting a message back that the calling process is no longer waiting, or you do two asynchronous calls, but due to the network latency and packet distribution, they get processed out of order.

Or eventually one server is not enough for the load, and you decide to add another one, so you get some kind of load mechanism in place, but also need to take care for unprocessed messages that one of the nodes took responsibility over, and so forth.

There is a reason why there are so many CS books and papers on distributed systems.

Using them as mitigation for teams that don't understand how to write modular code, only escalates the problem, you move from spaghetti calls in process, to spaghetti RPC calls and having to handle network failures in the process.

ChiefOBrien · on Feb 19, 2021

If you develop with COM in mind, then it doesn't matter whether the object is in-process or somewhere on the network.

pjmlp · on Feb 20, 2021

That is the selling theme of DCOM and MTS, CORBA and many other IPC mechanisms, until there is a network failure of some sort that the runtime alone cannot mitigate.

hootbootscoot · on Feb 19, 2021

Well, we often use plain old HTTP for this purpose, hence the plethora of 3rd party API's one can make an HTTP call to...

(I side with the monolith, FWIW...I love Carl Hewitt's work and all, it just brings in a whole set of stuff a single actor doesn't need... I loved the comment on binding and RPC above, also the one in which an RPC call's failure modes were compared to the (smaller profile) method call's)

koolba · on Feb 19, 2021

Sure but the converses of these statements are advantages for monoliths.

Most languages provide a way to separate the declaration of an interface from its implementation. And a common language makes it much easier to ensure that changes that break that interface are caught at compile time.

collyw · on Feb 19, 2021

I have worked on some horrendously coupled micro-service architectures.

deterministic · on Feb 22, 2021

Not at all. That’s what libraries are for. Most applications already use lots of libraries to get anything done. Just use the same idea for your own organisations code.