Give me back my monolith (2019) (craigkerstiens.com)
331 points by winslett on May 10, 2022 | 296 comments



Microservices was always a solution to the organisational problem of getting more developers working on a system at once. A lot of people working on one code base over the top of each other often causes friction and ownership problems. The solution was independently developed services, with the trade-off being increased complexity across a range of concerns.

There is an essay in The Mythical Man-Month about Conway's law, and microservices are very much a way to use Conway's law to achieve scale of development. You likely don't need microservices until you hit a scale where the monolith is a real pain point. You can probably cope with a monolith up to around 100 developers, maybe more with some reasonable design effort and a lot of separation, but somewhere past the ~25-developer range it becomes cheaper and easier to switch to multiple deployed units of software and accept the increased system complexity, the interface design work, and the inherent versioning issues that come with it.

It is easier to start with a monolith, find the right design and then split on the boundaries than it is to make the correct microservices to begin with.


> Microservices was always a solution to the organisational problem of getting more developers working on a system at once.

It may be a solution, but there were already solutions for that. As far as I remember, the original argument for microservices was to allow different teams to use whatever languages/tools they choose, as long as they provided an API. As opposed to having IT, payroll, and sales all having to wait for one development team to make everything. Then with Docker and containers it became that you could isolate deployments and scale out just the parts that need it. Which is great if you need to minimize outages, have some parts of your system that get major spikes in traffic, and you decide it's worth the extra complexity.


This is where micro-services come in for me. Except, for all intents and purposes, I'm a one-man development shop.

APIs.

I've deployed about 10 or so micro-services for use cases where a simple API is required for my applications. Maybe it's literally querying a row in a table and returning the result in JSON format. Or maybe it's generating a JWT to access Apple Maps.

I just craft each one up in NodeJS, deploy it to Lambda, and those little services haven't changed in years; they just keep cranking away.
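Each one is barely more code than this (a sketch; the names and payloads are made up, and the "table" is an in-memory map standing in for the real query):

```javascript
// Minimal sketch of one of these tiny services: a Lambda-style handler
// that looks up a row and returns it as JSON.
const rows = new Map([["42", { id: "42", name: "example" }]]);

// In a real deployment this would be `exports.handler = async (event) => ...`
function handler(event) {
  const row = rows.get(event.pathParameters.id); // stand-in for the DB query
  if (!row) {
    return { statusCode: 404, body: JSON.stringify({ error: "not found" }) };
  }
  return { statusCode: 200, body: JSON.stringify(row) };
}
```

Wire it behind API Gateway and it happily does its one job forever.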

We can have both monoliths and micro-services where appropriate.


At the risk of committing a "No True Scotsman" fallacy, what you're talking about is not really what people are talking about when they are discussing the tradeoffs of a microservices architecture.

While "microservice" obviously literally means a small service, when people talk about microservice architecture they're usually talking about building out applications that look like a single service to the end-user but in reality are a whole bunch of small services that talk to each other.

Just creating a small service that exposes a small piece of functionality is clearly the right thing to do in many cases, but that's not what the discussion about microservices architectures is really about.


I have a little website that I maintain personally and it consists of distinct services for auth, analytics, comments, and a fileserver for static HTML (also a bunch of cronjobs for backing up databases and so on). It runs on Kubernetes on a couple of Raspberry Pis in my office. These services are sort of distinct, but several collaborate with the auth service.

Some would say this is overkill, and they're probably right (although I would argue it's not overkill to the degree that most on this forum would initially believe), but I haven't had any real issues with the microservice architecture. I do like that each service has its own secrets: the comments service doesn't have access to the auth service's private key or database credentials, so if it gets pwned the blast radius is more limited. Similarly, I will be able to create network policies to lock these services down so they can only talk to their collaborators (defense in depth). Maybe there are analogues in the Java or .NET VMs for these isolation techniques, but I don't think there's any general analogue.

That said, there is some amount of infrastructure overhead for each service (each service likely needs its own CD pipeline, DNS, Let's Encrypt certs, database, etc.), but these are all managed via infrastructure as code, so I can basically just stamp out these services with little effort.

I have plans for NFS, Mastodon, and media services, all of which are third party and thus wouldn't be desirable to integrate into a monolith, but I'll be able to take advantage of my infrastructure-as-code automation to stamp out these services (which is mostly to say that a "monolith" doesn't really absolve you from needing to manage many services anyway).

In general, I've not had the sorts of problems that people complain about with microservices, neither in my personal project nor in my work projects at various prior employers. I'm not sure what the difference is; maybe I've been graced with good architecture? That said, I have worked on monoliths that were pretty bad experiences, and I think a big part of that was that it was too easy for teams to thoughtlessly extend the interfaces between components, which degrades the architecture over time (conversely, I posit that microservices make it more difficult to change interfaces, and thus there needs to be a more compelling reason). This is just speculation based on my experiences; I don't intend for it to seem authoritative.


Yes. I like to call these “regular-size services”. Of course, it’s just SOA.


> Microservices was always a solution to the organisational problem of getting more developers working on a system at once.

I don't really agree. It's probably the most agreed upon benefit, but there are others. Fundamentally, microservices allow you to isolate state. I can take two services and split them up - and now I've split their state spaces. I can put a queue service between them and now I've sliced out the state of their communication.

This slicing up of state has a ton of benefits if done right.

1. Isolation of state is basically the key to having scalable concurrency. Watch the WhatsApp talk on Erlang; they say "Isolation, Isolation, Isolation".

2. Isolation of services is great for security. You can split up permissions across your services, limit their access in a more granular way, etc.

Those are pretty nice wins. They're achievable through discipline in a monolith (heavy use of module boundaries, heavy use of immutability - something most mainstream languages don't encourage) but a network boundary really forces these things and makes unintentional stateful coupling a lot more painful.
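As a toy illustration of what I mean by slicing out the state of their communication (everything here is invented for the example): two "services" that share nothing and interact only through copied messages on a queue, so neither can reach into the other's memory.

```javascript
// Toy sketch: two "services" with no shared state. Messages are
// deep-copied on enqueue, so a service can only send data to another,
// never mutate it.
const queue = [];
const send = (msg) => queue.push(JSON.parse(JSON.stringify(msg))); // copy on send
const receive = () => queue.shift();

// Service A owns a counter nobody else can touch.
const serviceA = {
  count: 0,
  handle(msg) {
    this.count += msg.amount;
    send({ type: "counted", total: this.count });
  },
};

// Service B only ever sees what A chose to put on the queue.
const serviceB = {
  lastTotal: null,
  handle(msg) {
    this.lastTotal = msg.total;
  },
};

serviceA.handle({ type: "count", amount: 3 });
serviceB.handle(receive());
// serviceB now knows the total is 3, but holds no reference to serviceA's state
```

Swap the array for SQS or a TCP socket and the state-space argument is the same.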

> It is easier to start with a monolith, find the right design and then split on the boundaries than it is to make the correct microservices to begin with.

I disagree. Starting with a bad set of microservices can be fixed by merging. Merging two codebases is trivial compared to splitting. Again, if I have isolation between the services, even if the slicing up was done badly, even if they're coupled, I can just remove the layer between them.

Splitting has to start from a place of coupling and then try to decouple - this is especially hard with languages that encourage encapsulated mutable state (most of them).


> I disagree. Starting with a bad set of microservices can be fixed by merging. Merging two codebases is trivial compared to splitting. Again, if I have isolation between the services, even if the slicing up was done badly, even if they're coupled, I can just remove the layer between them.

There is still coupling in microservices, it has just shifted to messaging, networking, and queuing. If you get any of those parts wrong, you have a worse mess to untangle with less mature debugging/logging tooling than a monolith enjoys, all the while likely dealing with eventual consistency (depending on the design). I'm not saying don't start with a microservice, but it likely wouldn't be the very first tool I would reach for when starting out if a monolith would do the job effectively. Most things will never be hyperscale and won't benefit from the increased concurrency. You can go a very long way with a "majestic monolith" and a bit of care.


> There is still coupling in microservices, it has just shifted to messaging, networking, and queuing.

Sure, in the sense that your service is "coupled" to a queue and if you don't abstract that away it's hard to change that queue implementation. But in the sense of two services you wrote being coupled, they aren't, in terms of shared state. That gets pulled out. There is no way for one service to mutate the memory of another - it has to send a message to it.

That can be TCP or it can be over some queue or stream or whatever.

> If you get any of those parts wrong, you have a worse mess to untangle with less mature debugging/logging tooling than a monolith enjoys

This is the case with any concurrent system. The fact that so many languages lack concurrency primitives is probably why people don't run into this more often. If you use concurrency primitives in your language, you already have this.

> all the while likely dealing with eventual consistency (depending on the design)

There's nothing eventually consistent about this system. It fundamentally has causal consistency (since messages from a service must come after messages to that service that triggered them), and it's perfectly capable of leveraging transactions.

> I'm not saying don't start with a microservice, but it likely wouldn't be the very first tool I would reach for when starting out if a monolith would do the job effectively.

To each their own. I much prefer it. It's far simpler to maintain "good" design since the network boundary creates a hard line in the sand that you physically can not violate.


They are coupled by the queue itself (you accounted for your queue going down and for out-of-order/delayed messages, right?), the network (i.e. what happens if some microservices go offline?), and most importantly the event message abstraction. Nothing is free, and the event message abstraction/format is the new shared state in microservices. It's easy to get the event messaging abstraction wrong in greenfield projects, since you likely don't understand the domain as well as you would like, and if it goes wrong it can be very painful to fix after the fact. Again, I'm not slamming microservices, but we should go in with eyes wide open about the well-known benefits vs. the tradeoffs. I refer you to the high-quality (and partially free!) course [1] taught by Udi Dahan from Particular, which reviews many of the tradeoffs in distributed system design.

> The fact that so many languages lack concurrency primitives is probably why people don't run into this more often. If you use concurrency primitives in your language, you already have this.

The difference is that with a monolith, the entire application state is in one place, but with microservices its state is distributed. This makes logging and debugging more difficult along several dimensions. Finally, there are decades worth of tooling development at your disposal to debug and monitor your monolith (even concurrency issues). The tooling around debugging and troubleshooting microservices pales in comparison.

[1] https://learn.particular.net/courses/distributed-systems-des...


> That gets pulled out. There is no way for one service to mutate the memory of another - it has to send a message to it.

If it’s well-designed, and that’s a big if. Most implementations I’ve worked with constantly mutate each other’s state.


No, it's physically impossible. You have causal consistency across services, not within services.

https://youtu.be/lKXe3HUG2l4?t=1438


Codebase A writes data to datastore. Codebase B mutates it. Codebase A loads it back in, assuming it’s still the same.

Boom. You’ve mutated codebase A’s memory.


I would hope it's obvious that you haven't mutated A's memory, but I'll just suggest you watch the talk.


How is that any different from calling a function on a class? That's technically not class A modifying class B's memory either. B modifies its own memory in response to a message (function parameters) from A. The message going over a network doesn't make that fundamentally different.


Function parameters aren't messages. They're shared state. I'd suggest watching the talk and reading about message passing systems in general.
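The difference, as a toy example (nothing here is from a real codebase): a function parameter hands the callee a reference into your memory, while a message hands it a copy.

```javascript
// Function parameters can be shared mutable state: the callee gets a
// reference into the caller's memory and can change it.
const sharedCall = (obj) => { obj.balance = 0; };

// A message is a copy: the "callee" only ever sees its own copy,
// as if the object had crossed a serialization/network boundary.
const messageCall = (obj) => {
  const msg = JSON.parse(JSON.stringify(obj)); // serialize, like a network hop
  msg.balance = 0; // mutates only the copy
};

const a = { balance: 100 };
sharedCall(a); // a.balance is now 0: the caller's state was mutated

const b = { balance: 100 };
messageCall(b); // b.balance is still 100
```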


“Watch this famous video” is not a great response. Many of us watched it years ago and seem to have interpreted it rather differently.


Then you interpreted it incorrectly. I'm not inclined to teach you via HN about a subject that's well documented by resources I've already linked.


I suppose, if you mutate them. But we have a linter and a CI system that enforce rules preventing that.


There are many solutions, certainly. A network is one option, which I personally prefer, but as I said elsewhere it's a "choose the right tool for the job" kind of situation.


For all intents and purposes, you’ve mutated the memory. Sure, you haven’t mutated it by reaching directly into RAM, but the effect is still the same.


I disagree. If you are able to merge them, you already spent the work to split them in the first place, so it's more work to start with microservices. It goes back to agile: the easy solution is to have a monolith and figure out later how it can be well split.


> Fundamentally, microservices allow you to isolate state. I can take two services and split them up - and now I've split their state spaces. I can put a queue service between them and now I've sliced out the state of their communication.

Dr. Alan Kay would like a word. This is literally the premise behind OO:

> "OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things."—Dr. Alan Kay

Outside of "extreme late-binding," which is a fascinating topic in its own right, isolating state is exactly the point of OOP. If we need microservices to accomplish isolation of state, that suggests we got OOP wrong, very wrong.


I'm extremely aware of Alan Kay's statement, as well as the foundations of the actor model. The reality is that today that is not what OOP has become, and Alan Kay would agree.

> that suggests we got OOP wrong, very wrong.

Alan Kay very clearly states that people "got it wrong" and that OOP was supposed to be about messaging. I.e., he intended for it to be one thing, but it isn't that thing.

> I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea. The big idea is "messaging"

> The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. [..]

http://wiki.c2.com/?AlanKayOnMessaging

https://computinged.wordpress.com/2010/09/11/moti-asks-objec...


> and that OOP was supposed to be about messaging

I agree that these are Kay's thoughts and also agree with his take on what it should be. But I think the reality is more complicated than it simply being something that evolved away from his grand dream. It's more that there was a soup of ideas floating around during that time that came together as OOP and the combination that became dominant was something else. For instance Simula was already using inheritance prior to Kay's message passing proposal.


I can agree with all of that; I wasn't trying to imply otherwise. I was just explaining that insisting OOP means what Kay wanted it to mean ignores history, consensus, and Kay himself.


I’m very aware that we evolved things Kay may not have intended, and that doesn’t make it wrong.

He may have coined the term, but he doesn’t own it, nor should we feel beholden to his vision dating back to 1972.

What I said was if the primary reason for micro services is hiding of state, then we got OO wrong, because OO, even the much maligned J2EE style of OO, can do that for us if we want hiding of state and message passing.

Another possibility is that microservices do much more for us than hiding of state and limiting communication to message-passing.

At my 9-5, we use Elixir to write our services and have a few Actor-based Scala services too, so my feeling is that we actually are doing OO fine, and that there’s something else that makes microservices compelling at scale.


What you said was that Kay would disagree, and that we must have gotten OO wrong otherwise. Kay wouldn't disagree and Kay would say we got OO wrong.


What I said was:

> If we need microservices to accomplish isolation of state, that suggests we got OOP wrong, very wrong.

That’s the last sentence, which summarizes my point.

As for Dr. Kay, his exact words were that, to him at the time, OO meant certain concepts and nothing more.

I have never interpreted that to mean that languages or systems that do more than hiding state and message passing are wrong, just that if we say something like “OO requires inheritance,” he would disagree with our definition of OO.

After all… Smalltalk itself has a lot more than hiding of state and message passing. Would anyone claim that Dr. Alan Kay would say Dr. Alan Kay was doing OO wrong?

I think there are good reasons to design microservice architectures, but if the argument is “Let’s break up our monolith so we can hide state,” I’d say that we can go ahead and just use our existing OO tools to achieve that.


> that suggests we got OOP wrong, very wrong.

It does, but that's because there are different flavors of OOP. Alan Kay's original take on OO was closer to the actor model than what grew into the mainstream spin on OOP with inheritance and the rest.

If you take 10 steps back and squint, microservices & the actor model start to look pretty similar.


So more like Erlang/OTP rather than Spring Boot Microservices?


> If we need microservices to accomplish isolation of state, that suggests we got OOP wrong, very wrong.

I don't think there's any question about that.

The only language that really seems to get this right is Erlang (and Elixir).

And I suppose Smalltalk, but I don't think even Smalltalk takes it as far as the Erlang VM.


Arguably, we did. In large codebases worked on by multiple teams, it's not unusual to see teams drilling holes in OO walls because they "just want to get their work done" and view the abstractions as barriers. Take that together with the unfortunate fact that the majority of engineers suck at decomposing things into objects, and the result is people preferring to move stuff out of process to keep the code simple and make the encapsulation more effective. I don't really think that's a good thing, but that's been my observation.


Erlang/Elixir feels like it strikes a middle ground, where every process behaves like one of Kay's objects, including the emphasis on message-passing rather than methods that behave like procedure calls.


Then I'll have to say that most get OOP very wrong; for example, any time an application crashes for any reason.


The Erlang programming language is, AFAIK, the only programming language used in production that does what Alan Kay is talking about. So it is probably the only object-oriented programming language used in production today.


What did he mean by "extreme late-binding of all things"?



That seems to be equating it with message passing, but messaging is already the first item mentioned by Alan (messaging, local retention ...) so I thought it might mean something else.


Network boundaries cause far more problems than they solve, and you've just shifted the complexity to now securing the network, usually with even more additional services, proxies, services meshes, firewalls, etc.


> Network boundaries cause far more problems than they solve,

They cause exactly 0 extra problems. A call from function A to function B can fail due to B having a bug. A call from service A to service B can fail due to B having a bug or a network failure. Either way, failure is possible and has to be handled - the network only makes that more obvious.

Further, a call between functions can cause mutated shared state - not the case across a boundary, they physically do not share mutable state.

> and you've just shifted the complexity to now securing the network

Not really. Fundamentally you have split your service capabilities up - now you can apply least privilege as you desire.


A failure is obvious all by itself. Network boundaries just turn it into a much bigger failure. And network failures are far more common and harder to test, handle and recover from.

> "Further, a call between functions can cause mutated shared state - not the case across a boundary, they physically do not share mutable state."

This is false, as state is not tied to your process, nor does adding isolation require a network hop.

> "now you can apply least privilege as you desire."

How exactly? It's not magic, you still have to apply them, and now it requires more strategies and effort to accomplish.


I'm ignoring the first two points since I'm tired of explaining these things to people - you can read the papers / watch the talks I've linked.

> How exactly? It's not magic, you still have to apply them, and now it requires more strategies and effort to accomplish.

Yes, we have tons of tooling for process isolation. Splitting a service into two services means you can isolate two processes instead of one, which means you break up the capabilities unique to each.

I used the word "apply" so I don't know why you're saying "you still have to apply them"... it's literally what I just said.


Securing the network is more work than just securing the code, because now there's a network in the way.

For all the repetition you have on this thread, can you summarize it with the actual benefit that you have gained in a serious production use?


> Securing the network is more work than just securing the code, because now there's a network in the way.

I very much disagree.

> For all the repetition you have on this thread, can you summarize it with the actual benefit that you have gained in a serious production use?

I already have summarized it. All I've done since is correct people being incorrect with regards to my summary.

If you want a specific example, here's a blog post I wrote a long time ago (the dates are incorrect since we moved websites): https://www.graplsecurity.com/post/architecting-for-performa...


The only potential benefit there was for security, specifically because of your security-based product and the async nature of its processing. And even that was just relying on the ephemeral nature of lambdas instead of other security constructs or simply resetting instances of a monolith to accomplish exactly the same thing.

Nothing in that article explained a clear need or benefit of microservices.


> because of your security-based product

Nothing in the article has to do with the product or the fact that it's security related, other than to provide a motivating use case.

> The only potential benefit there was for security

And performance.

> even that was just relying on the ephemeral nature of lambdas

I think you've failed to understand the article, which may be my fault; I haven't read it in a long time. The key is isolation. Ephemerality gives you a sort of temporal isolation. Splitting your messaging from your data storage gives you capability-based isolation. And so on.

It also means we can scale to the limits of S3/SQS - each service is itself stateless, the majority of state is managed in SQS, which could be quite loose about its consistency since every service is idempotent - arguably a form of temporal isolation.
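To be concrete about the idempotency part (a sketch; the details are invented): a consumer that records processed message ids, so a redelivered message, like an SQS at-least-once duplicate, is a harmless no-op.

```javascript
// Sketch of an idempotent consumer: processing the same message twice
// has the same effect as processing it once, because we record which
// message ids we've already handled. (Real systems persist this set.)
const processed = new Set();
let itemsStored = 0;

function handleMessage(msg) {
  if (processed.has(msg.id)) return; // duplicate delivery: do nothing
  processed.add(msg.id);
  itemsStored += 1; // stand-in for the real side effect
}

handleMessage({ id: "m-1" });
handleMessage({ id: "m-1" }); // redelivered: ignored
handleMessage({ id: "m-2" });
// itemsStored === 2
```

That property is what lets the queue be loose about delivery guarantees.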

What I've described in this article is effectively the actor model. I feel like I don't have to really justify the benefits of the actor model with regards to scale?


actor model != microservices.

What part of microservices (split functionality with completely separate runtime artifact deployed to separate servers) is needed for actors? You can have actors in a monolith.


No part of microservices is needed for actors and I didn't imply that at all. I said that microservices can be easily modeled as actors.


With a monolith you can put everything inside a database transaction and have an entire request's worth of logic succeed or fail together. That's a lot easier to manage than having parts of the logic spread over multiple systems succeed while other parts fail.
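A toy sketch of what that buys you (the "database" here is just an in-memory object standing in for real BEGIN/COMMIT/ROLLBACK): every step of the request commits together or is rolled back together.

```javascript
// Toy transaction: snapshot the state, run the whole request's logic,
// and restore the snapshot if any step throws. A real database gives
// you this out of the box.
const db = { accounts: { alice: 100, bob: 50 } };

function inTransaction(fn) {
  const snapshot = JSON.parse(JSON.stringify(db.accounts));
  try {
    fn();
  } catch (e) {
    db.accounts = snapshot; // roll back every step at once
    throw e;
  }
}

// A request that touches two rows: both writes land or neither does.
function transfer(from, to, amount) {
  inTransaction(() => {
    db.accounts[from] -= amount;
    if (db.accounts[from] < 0) throw new Error("insufficient funds");
    db.accounts[to] += amount;
  });
}

transfer("alice", "bob", 30); // commits: alice 70, bob 80
try { transfer("alice", "bob", 500); } catch (e) {} // rolls back: unchanged
```

Spread those two writes across two services and nothing hands you that guarantee for free.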


So use transactions? I don't understand what part of microservices prevents that. In fact, transactions are pretty fundamental to reliable systems.

https://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf

From the abstract:

>It is pointed out that faults in production software are often soft (transient) and that a transaction mechanism combined with persistent process-pairs provides fault-tolerant execution -- the key to software fault-tolerance.


So: distributed transactions with two-phase commit, XA, and all that. Guess what, microservices hipsters tried that, and it turned out too slow and cumbersome, so they invented "sagas", which are still immensely more complex than a single transaction in a single database.


You seem confused. If you want a transaction use a transaction. If you don't want a transaction don't use a transaction. If you need a transaction across services, that sounds like you've run across a microservice antipattern. Monolith/Microservice changes nothing here - you can have the same issue in a monolith where two different functions are managing transactions and now you want a single transaction.

Literally irrelevant.


Well you can't use database transactions across multiple connections, so presumably this would involve you implementing your own transaction and rollback system. That's a lot more complexity than using a system that just works out of the box.


I don't really understand. If I have a database, and a service is talking to it, it can open a transaction. If I then want to talk to other services, and rollback that transaction based on what happens with those, I can do that.

Microservices changes nothing about this. If you want to remove transactions by splitting up your logic such that it operates in terms of sequences or something, you can do that, but that's just a choice like any other.


If the other services you talk to mutate state then rolling back those changes is non-trivial.


Well... don't do that? This is where microservices come in. In plain SOA, nothing really tells you when it's a good idea to split things up. Microservices is a methodology to help you avoid this exact situation.

You'd have the same problem in a monolith if you have two different modules working on the same db.


Assume you have two tables A and B on the same DB. They are sort of seen as unrelated. Suddenly a feature request requires that A and B are mutated together consistently.

If it is in one service you just use a common DB transaction and get it done.

If it is in one microservice for A and one microservice for B then you have to somehow implement this transaction yourself. This is possible but more work.


OK, imagine that you have two different modules that manage transactions to a database. Now suddenly you need consistent mutations across those modules.

Do you see my point? Microservices do nothing here - you have run into an antipattern that is universal, and one that microservice architecture explicitly calls out.


I do not see your point. Sometimes consistent mutations between modules are wanted, and monoliths let you do it. Perhaps you discover your module boundaries were wrong; you create a supermodule to encapsulate both and coordinate the joint transaction, then split up later a different way, and so on.

Module boundaries are refactorable.

Importantly, what if the alternatives you end up with to achieve the same things with microservices end up causing a ball of mud of services, reimplementing transaction logic that belongs in your DB in your homegrown network protocols?

You seem to say that some things are not possible with microservices and therefore this leads to cleaner code. My retort is that the kind of things one sometimes comes up with as workarounds to make things still work for microservices are so complex that the cure is worse than the disease you wanted to cure in the first place.


> Module boundaries are refactorable.

Why are microservices not refactorable? This is the same exact issue in both cases. You designed something for a use case, the use case changed, now your old design isn't working. So maybe you merge those two services, or merge those two modules, or whatever else you want to do.

> reimplementing transaction logic that belongs in your DB in your homegrown network protocols

Don't do that? I mean, again, this issue of "I wrote something the wrong way and now I have to fix that" is not any better or worse in microservices.

> My retort is that the kind of things one sometimes comes up with as workarounds to make things still work for microservices

That doesn't sound like microservices. In fact, even the idea of having a database shared across services doesn't sound like microservices - it's an explicit antipattern. So it sounds like a bad SOA design. The point of microservices is to take SOA and add patterns and guidance to avoid the issues you're talking about.


Microservices get owned by different teams, teams get cemented, politics get in the way of refactoring. Game over.

Sure, if you are a single team working on 10 microservices you can probably refactor with abandon without spending 70% of your working days in meetings talking about migrations and trying to sync strategies...

You may have experienced that microservices are as easy to refactor as monoliths; in my experience it is orders of magnitude harder...

I think there is a bit of a "No true scotsman" fallacy at play here. You see something you do not like then it is "not microservices done properly".

How about you state all the things you don't like about monoliths, then I say "that is not monoliths done properly", "don't do that" for each one?

I think both monoliths and microservices can lead to good code or balls of mud depending on the organization and developers involved.

The real question isn't whether "microservices done right" is better. The question is: does a decision to do microservices reduce the chances of a ball of mud, when that decision is then implemented by imperfect developers in an imperfect organization?

PS I always meant above that each microservice had its own DB; we agree on that, and I never dreamt otherwise.

What I was getting at is that when you go distributed, sometimes quite complex patterns must be applied to compensate.

You may say the architecture is then "better", but by what metric? It is certainly more work up front, so you start out in the negative, and to come out ahead the system and organization need to reach a scale where you recoup the hours you invested.

In many scenarios, the cost in developer-months needed up front is just as important as other factors in evaluating the best architecture. E.g. a scrappy startup simply should not do it, IMO. Corporations... perhaps, but I have seen it go badly. (I guess it was just not "done right" then? See above.)


PS I think microservices excel in making people FEEL productive (doing work that is not directly benefiting the company).

I have personal experience with the same product built twice, once as a monolith by a small team that worked really well and once as lots of services.

The feature set and development speed are about the same, but the many-services version requires 10x as many people.

However by splitting into many services everyone feels productive doing auxiliary and incidental work. Only those of us who worked on the first system are able to see that the total output of the company is the same but 10x as expensive.


> Microservices get owned by different teams, teams get cemented, politics get in the way of refactoring. Game over.

I don't understand how microservices make this worse in any way. Modules get owned by different teams all the time.

> You may have experienced that microservices are as easy to refactor as monoliths; in my experience it is orders of magnitude harder...

Yes, I have said before that I believe merging is fundamentally simpler than splitting. If we're just talking about merging a module vs a service, I don't believe either is harder than the other - I mean... nothing about microservices prevents you from using modules, and indeed I would highly recommend it.

> I think there is a bit of a "No true Scotsman" fallacy at play here. You see something you do not like, then it is "not microservices done properly".

For sure, and that's a failing of microservices. People think microservices means "SOA", or "write a lot of services". If you want to criticize SOA or whatever, sure, the argument of "don't do that" goes away.

> How about you state all the things you don't like about monoliths, then I say "that is not monoliths done properly", "don't do that" for each one?

I probably could state a bunch of things that are pretty fundamental, but I don't think it's important. I don't know that I've actually said anywhere that microservices are better than monoliths; what I've instead said are the benefits of microservices that I see, which others have taken to mean that I somehow think monoliths or modules are bad.

> You may say the architecture is then "better"

I honestly don't think I've said that anywhere, or even made a judgment anywhere.

I think I can summarize, again, what I've said.

1. Network boundaries provide a physical layer that enforces isolation of state and the use of message passing

2. Isolation of state makes scaling a system easier

3. Isolation of capabilities makes securing a system easier

4. SOA inherently leverages the network boundary

5. Microservice Architecture is similar to SOA but with a bunch of patterns, guidance, and concepts that you leverage in your design

What I've received in response is a hodgepodge of:

1. "Modules can isolate state" - only true in some languages, and even then there's no physical barrier enforcing it; you're relying on developers to maintain that isolation.

2. "But what if you do anti-patterns that microservices tell you not to" - ok, that's why microservice architecture has books and documentation about what not to do. If you do those things, I'm not going to blame you, it's a failing of all methodologies when users have a hard time understanding them.

But so far the anti-patterns mentioned aren't really compelling or specific to microservices. You wrote code to satisfy a domain, the domain changed, now you need to change that code so that it satisfies the new domain. That happens all the time, merging services isn't any harder than merging modules.

3. General misunderstandings about state, security, etc.

> What I was getting at is that when you go distributed,

I'm not really convinced that "distributed" is the right word here. People talk about distributed systems being complex, and I think they're confused - what's complex is consensus, but splitting one service from another service shouldn't impact consensus, and the fact that they're now located on two different assets does not necessarily make things more complex.

Those services may be more complex, if your application was quite trivial - a totally stateless system with no external connections, for example. I see no reason to rewrite 'grep' as a microservice, and I would never recommend that.

Those services may now be more error-prone because you have things like DNS, TCP, etc. involved. If you don't want to make that tradeoff, that's OK, you could be right in that case. Again, no need to make all software be microservices.

(Going to respond to your other message here)

> PS I think microservices excel in making people FEEL productive (doing work that is not directly benefiting the company).

Maybe, I don't really know. It isn't my experience, but that's just me. Most developers seem to be pretty bad at their jobs so I imagine that all sorts of issues can be experienced. Certainly the idea of rewriting a monolith as a microservice seems like a red flag unless there were very specific needs.


absolutely - if 'microservices' had transactions, it would actually provide some real leverage for fault handling over a monolith


You drop in “…or a network failure” like it’s a rare occurrence that’s easy to handle.


Frequency is irrelevant to the complexity. As I said, you have to handle the idea of cross-boundary failures, such as with modules that have bugs.

If you aren't taking steps to do so, you're not writing robust code.

Anyway, yes, persistent network failures are rare for many people.


At some point, you can't gracefully handle bugs in other people's code. If a function you call causes a SEGFAULT, in the vast majority of software, you're not expected to handle that. That's an invariant error, and you probably want some way to detect that it happened so you can fix it, but it's not reasonable to ask every caller of every function to handle that (in the same way we don't consider "the earth blew up" to be a reasonable thing to protect against, even though it is technically possible). There's simply not enough time and money to protect against every possible edge case in most software (NASA projects aside).

The argument here is that network issues are exceedingly common in microservice environments and so aren't actually an edge failure case, so you actually have to worry about them way more than you would worry about a function in a different module causing a SEGFAULT.


The point is not to handle individual bugs, it is to handle all failures. This is the difference between a "defensive programming" approach and the "let it crash"/ "zen of erlang" approach. Actors are designed such that they have failure isolation, which means they can react to errors in other actors without worrying about their own state. They then have two options based on one of two bug classes - transient and persistent.

Persistent errors are propagated to the supervisor. Transient errors are either retried or propagated.

It doesn't matter if it's a network error, a disk error, a timeout, a crash, a cosmic radiation bit flip - your approach is always one of those two. So adding more failure cases doesn't "matter" in terms of your error handling, although you may want to adopt helpful patterns in the nuances of "retry".
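A minimal sketch of that two-option policy (illustrative Python, not Erlang/OTP; the exception names and `supervise` helper are invented for the example):

```python
import time

class TransientError(Exception):
    """An error that may succeed on retry (timeout, dropped connection)."""

class PersistentError(Exception):
    """An error retrying will not fix (bad input, version mismatch)."""

def supervise(task, retries=3, delay=0.0):
    """Run a task with the two options described above: retry transient
    failures a bounded number of times; propagate persistent failures
    (and exhausted retries) to our own supervisor."""
    for attempt in range(retries):
        try:
            return task()
        except TransientError:
            if attempt == retries - 1:
                raise  # escalate: it looks persistent after all
            time.sleep(delay)
        except PersistentError:
            raise  # never retried; the supervisor decides what to do
```

Note the caller never inspects *what* failed, only *which class* of failure it is -- which is why adding network errors to the mix doesn't change the shape of the handling.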

The frequency of errors will obviously increase with a network error (arguably very very little), but the pattern is fundamental to resiliency.

If your network is truly so unreliable that you can not pay that cost, don't do it. I don't think most people are developing on networks that fail for long periods of time frequently.


But now you are talking squarely about Erlang actors, not microservices in general. The runtime gives you all the needed guarantees here.


I talk about services and actors interchangeably because there's no interesting differences between them.


Other than automatic handling of network exceptions, safe failures and the shitton of other features Erlang runtimes have?


I'm not sure what you're talking about. What automatic handling of network exceptions? What safe failures? BEAM has lots of great features, no question, but they have very little to do with the implementation of actors - BEAM primarily provides names and linking as useful primitives.


Network calls can appear to fail, but actually succeed.

Local function calls don't have to deal with byzantine failures[1].

[1] https://en.wikipedia.org/wiki/Byzantine_fault


Function calls can, of course, fail with side effects. Idempotency is always a desirable property.
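One common way to make a retried call safe after an ambiguous network failure is an idempotency key. A toy sketch (illustrative Python; `PaymentService` is invented for the example, not any real API):

```python
class PaymentService:
    """Toy service that deduplicates requests by an idempotency key,
    so retrying a call whose outcome was unknown cannot double-apply
    the side effect."""

    def __init__(self):
        self._seen = {}   # idempotency key -> stored result
        self.charges = 0  # the side effect we must not duplicate

    def charge(self, key, amount):
        if key in self._seen:
            # A retry of a request that already succeeded: return the
            # stored result instead of charging again.
            return self._seen[key]
        self.charges += amount  # side effect happens exactly once per key
        result = {"status": "charged", "amount": amount}
        self._seen[key] = result
        return result
```

The client generates the key once per logical operation and reuses it on every retry, so "appeared to fail but actually succeeded" becomes harmless.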


> Frequency is irrelevant to the complexity

So if something happens 50% of the time, you should treat it the same as if it happens one in 100 billion times?


Maybe? I can't compare 50% to 100 billion. If your computer crashed every 100 billion instructions that would be a problem. If it crashed every other instruction, that would be very slightly more (or the same amount) of a problem.

The point is that if you have a function call, you have the opportunity for a bug/ failure. Networks don't change that - you have the opportunity for a bug/ failure. The major difference is that services have stronger failure isolation.


It sounds like you’re designing a very hypothetical bit of software.


As opposed to designing software that is already implemented?


You've moved the state around, but the state is still there -- just hidden. It's hidden in the network communication instead of the function calls.

And the distributed, cross-service mutated state is a hell of a lot harder to trace and debug.


I didn't say it removes state. I said it split the state up and isolated it. That's critically important - you physically can not mutate state across a network, you have to pass messages from one system to the other over a boundary, either via some protocol like TCP or via intermediary systems like message brokers.

Joe Armstrong talks about this better than I'm going to: https://youtu.be/lKXe3HUG2l4?t=1438

That timestamp is rough, I just found a related section of the talk.

> And the network-defined state is a hell of a lot harder to trace and debug.

There's no such thing as network-defined state. I assume you're saying that it's harder to debug bugs that span systems, which is true, but not interesting since that's fundamental to concurrent systems and not to microservices.


I think you have a very narrow idea about what "mutating state" really means. You seem to talk about DMA access only. But you can manipulate the state of an application by writing to a shared data store, by calling an API, and countless other ways. It is really more of a concept for us humans to define where an application begins and ends.

Let's take an example. If we have two services that wants to keep the full name of a logged in user for some reason, that piece of state can be said to be shared between the applications. Should one service want to change that piece of data (perhaps we had it wrong and the user wanted to set it right), the service must now mutate the shared state. It does not matter whether it is done by evicting a shared cache or if we write the updated data to the service directly, we still speak of a shared state that is updated.

Now we can stipulate that the more of these things we have, the more coupled two pieces of software are, which generally makes reasoning about the system harder. It is not as black and white as one type of coupling being acceptable and the other not, but some types are easier to reason about than others. Joe really thought hard about these things and it really shows in the software he wrote.


We all share state in that we all exist within the same universe. But the universe has laws of causality, and Joe advocated that software should always maintain causal consistency.

A database is not needed for your example. You could replace it with an actor holding onto its own memory. But all mutations to that actor, which the other actors hold references to via their mailbox, are causally consistent and observable.

That is the premise of the talk I linked elsewhere.


> Fundamentally, microservices allow you to isolate state.

I think it's not really state isolation: your state is now spread across multiple separate services and a queue, which is objectively more complicated. To me, it's more the extreme version of things like dunder methods in Python or opaque structs in C: it prevents a specific type of programmer behavior. But honestly, it feels easier to solve this in code review.

Like, I agree it's bad to reach behind the public API of something, but microservices aren't immune to this. I've never worked on a microservice architecture that didn't have weird APIs just to support specific use cases, or had a bunch of WONTFIX bugs because other services depended on the buggy behavior. That's not fundamentally different than "this super important program calls .__use_me_and_get_fired__": you have an external program dictating the behavior and architecture of your own.

And you get multiple other layers of complexity here: networks, distributed transactions, separate dependency graphs, securing inter-server communications, authentication/authorization.

I don't think you're entirely wrong--there's a lot of history looking at state as a series of immutable updates (Git, Redux), and I think it is harder to "cheat" in this way using microservices. I just think it's far from a clear win.


> Fundamentally, microservices allow you to isolate state.

You can have logical dependencies between those "isolated" states anyway so I don't see that as a benefit really compared to say Java OOP private fields.


Re immutability -- I would say a well-written backend in any language would (probably?) throw away its entire state between requests. It's possible to introduce state, sure, but why and how does that happen? For very many backends the only natural thing to do is to code them stateless, keep all the state in the database, and have each new request start in a fresh world.

I see two common sources of state in any backend (monolithic or not):

1) Caching, whether resources or flags, whitelists

2) Connection pools.

If there are ever any issues with those, they can be segmented inside a monolith for a fraction of the cost of going to microservices (either using the same boundaries as if you had split into microservices, or other boundaries, like one set of caches/connection pools per endpoint handler...)
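The per-endpoint segmentation idea can be sketched in a few lines (illustrative Python; `SegmentedPools` and the string stand-ins for connections are made up, not any real library):

```python
import queue

class SegmentedPools:
    """Give each endpoint handler its own bounded pool, so one hot
    endpoint exhausting its connections can't starve the others --
    the isolation a service split would buy, inside one process."""

    def __init__(self, size_per_segment):
        self._size = size_per_segment
        self._pools = {}  # endpoint -> its private pool

    def _pool_for(self, endpoint):
        if endpoint not in self._pools:
            q = queue.Queue(maxsize=self._size)
            for i in range(self._size):
                q.put(f"conn-{endpoint}-{i}")  # stand-in for a real connection
            self._pools[endpoint] = q
        return self._pools[endpoint]

    def acquire(self, endpoint, timeout=0.1):
        try:
            return self._pool_for(endpoint).get(timeout=timeout)
        except queue.Empty:
            raise RuntimeError(f"pool exhausted for {endpoint}")

    def release(self, endpoint, conn):
        self._pool_for(endpoint).put(conn)
```

Draining `/search`'s pool leaves `/login`'s pool untouched, which is exactly the failure containment being argued for.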

So I agree with the OP that the social aspect and development process are the only rationale for microservices.

Otherwise, just scale the monolith horizontally to the same number of instances and you have strictly more ways to partition state; microservices give you only one way to partition state, which may not even be the best one.


While isolation is important for managing state, the other side effect of isolation is allowing separate scaling of resources.

If you can scale up your number of workers for a particular emergency then things get easier to handle.


It also makes bin packing services much simpler for that same reason.


Organizing around software components is such an idiotic idea.

Good luck changing your design or making end-to-end features when you have teams protecting "their" part of the software.

You've just made sure that any significant change to the system requires coordinating many different teams, making it at least one order more difficult than before, and a lot more bureaucratic.


In my anecdotal experience, microservices are the introduction of fiefdoms into development.

And, I get it. Selfishly I would love to come onto a project and only have to worry about net-new development. It is hard to learn, and then positively contribute within another engineered system... of any complexity. When it's "my code" I tend to be much more productive, and it's easier to find my way through the forest.

Anymore, everyone is on a power grab for "my tools" and "my way" in regards to everything they're working on, with near-zero attempt to understand the base needs of the business/feature. Microservices fill this need well, as devs can work with their preferred toolset and just set up I/O boundaries for everyone else to use their service. Also, this is good for resume-driven development, as you can often claim creation/ownership over some "piece", which looks way better on paper vs. "contributed to a big 'ol project."

I'm not against microservices, I'm just being honest with how I've seen them socially play out - all of this is anecdotal to my experiences.


> For my anecdotal experience, microservices is the introduction of fiefdoms in development.

Fiefdoms, whether implicit or explicit, are pervasive in organizations beyond trivial size; IME, software architecture choices may impact the shape of the fiefdoms, but not their existence.


Analogy is in programming itself - divide up a program into modules that each have a very clearly defined interface. Then people can work in parallel on it. I know everybody agrees with this, and it is stupidly obvious. But time after time, I see source file after source file importing (or #include'ing) every other module in the project. A pull request to fix a problem affects 10 files instead of one.

It seems to me that people are simply terrible at modularizing code. (Including me.) I've seen this problem far more often than not, in every organization I've worked with, for my entire career.

Just the other day I worked on a function that did some Windows API calls for file I/O, allocated a bunch of memory, and processed the file contents. I split it up so the code that needed the Windows API did just that - it didn't allocate any memory, and the processing was done with a callback.

That meant the Windows API accessing code didn't need to import anything from the rest of the program, and the rest didn't need to import the Windows API.
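The same split can be sketched in any language; here is an illustrative Python version (plain `open` standing in for the Windows API calls, function names invented for the example):

```python
def read_chunks(path, callback, chunk_size=4096):
    """I/O layer: touches the filesystem, allocates nothing for the
    caller, and knows nothing about what the bytes mean. All
    interpretation is delegated through the callback."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            callback(chunk)

def count_newlines(path):
    """Processing layer: pure logic over chunks, with no knowledge of
    how the bytes were obtained -- so neither module needs to import
    the other's dependencies."""
    total = 0
    def on_chunk(chunk):
        nonlocal total
        total += chunk.count(b"\n")
    read_chunks(path, on_chunk)
    return total
```

The point is the dependency direction: the I/O function can be tested with a fake callback, and the processing function with fake chunks, without either pulling in the other's imports.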

(There are some success stories. The popularization of generic algorithms and ranges (or iterators) has been a big improvement. And the use of global variables has become frowned upon.)


> people are simply terrible at modularizing code.

Once you realize you needed it, it's too late.

Then once you've been burnt on that fire for a bit, you go all out designing the next few projects with framework and forethought, and wind up with 100 lines of that for a 10-line script.

the moral is "life's a bitch"


That oscillation is necessary to be able to converge on the optimal medium for any given problem.

Otherwise we get stuck in "modularization bad" or "modularization good" mindsets, where both are wrong.


I agree; people are much more careful about splitting responsibilities and designing APIs between processes (REST, GraphQL, XML-RPC) than between code modules. Simply switching from a monolith to microservices makes them more thoughtful about the consequences of design.


>Simply switching from a monolith to microservices makes them more thoughtful about the consequences of design.

Well, in my experience it makes _some_ people more thoughtful about those things.


My experience has been a little better than yours. I've definitely worked on some spaghetti codebases, but I've worked on some really nicely modularised ones too.


Dropbox - monolith

Instagram - monolith

StackOverflow - monolith (at least until SO Enterprise)

There are many others.

There is a reason why we avoided "distributed systems" in the past. It's insanely hard. Microservices solve some problems in theory but create actual problems in practice.

For most companies out there, distributed systems solve problems the companies don't even have. But, boy, do they need 10x more engineers to maintain this all.


And they pose(d) some attraction. A boss, after getting tired of developing the baby that turned into a monolith and brought enough wealth to the table, started to outline a new system with 12 or so microservices. Each one could be designed and planned and written out in a nice 2-to-6-page document, one for each service.

Now guess the size of the code wrangler team. And as this was so super-extravagant, guess the number of developers who were "promoted" to work on those micro-services.

> But, boy, do they need 10x more engineers to maintain this all.

Well, stupidity saves you here. Ignore the one-team-per-microservice rule and swap in the "one-size-fits-all" developer rule: just promote one member of the single team to be responsible for all the microservices. One lunatic who pushes code fast enough that management does not realize what is going on. Once everything is up and running, it just runs. The rest is downtime eventually, but you know what, it's a software problem, what can be done? As long as everyone involved looks busy - and with plenty of microservices it's easy to play whack-a-mole: bring one up again, kill two others. It's just a flaw in the design; they should have been built with failover, but that was missed!


Dropbox migrated to Go microservices


Fair enough. At that scale it's different. My point is that many major companies were productive working with a monolith until it simply was no longer feasible.


Yes, you always want to start with a monolith, then break out microservices as the APIs, scaling and deployment strategies, and team sizes become clear from running and scaling in production for a good while.


> A lot of people working on one code base over the top of each other often causes problems and issues with ownership.

Microservices don't fix this. If I'm working on a feature in a monolith and I need to modify code someone else is also working on for some other feature, that same situation would occur in a microservices-based app. Just because the code is in a different git repo and called via an RPC doesn't change the fact that you need to change it.

Most software is monolithic, and most critical software is millions lines of code that hundreds or thousands of people have worked on - all without using microservices.


There is this amazing technology called "libraries", "modules", and "frameworks" that allows thousands of developers to work on the same monolith. It has been around for quite a while. No microservices required. There is even this amazing technology called "automatic testing" that can test each library/framework independently before delivering to clients. I happen to use both of those amazing technologies at work every day and they work really well.

I also happen to share an office with a colleague who is fighting a 20-year-old microservices application. The worst piece of garbage I have ever seen in my 25+ year career. Every little functionality is running in its own process. There are hundreds of them. The complexity and difficulty of the system is truly mind-blowing. If you think monoliths are hard, just wait until you split the same functionality into a distributed collection of microservices. Welcome to the world of distributed systems. The number of additional failure scenarios just skyrocketed. Have a nice day!


> Microservices was always a solution to the organisational problem of getting more developers working on a system at once.

Hmm? You can do that simply with API boundaries inside a program. The Linux kernel has a huge development team and compiles to a single binary.


Sure you can, and then you have a team that introduced a bug in a common library that caused 100% CPU consumption on all cores, and tests did not catch it, so it somehow got into production.

In the monorepo world, your whole system is broken, it is all hands on deck to try to fix this, everyone is stressed out.

In the microservice world, you have only one microservice which went down, so most teams don't have to worry... in the worst case, they'll say: "Sorry, but we depend on service X and they are down... blame them, nothing we can do". Sure, the team which introduced the bug is stressed, but the average company stress is much lower.

Having a successful monorepo requires an organized, cohesive team with good communication -- or at least a team with highly experienced people with veto power (this is the Linux kernel model). Unfortunately, a lot of real-life businesses do not have it.


> In the microservice world, you have only one microservice which went down, so most teams don't have to worry..

My experience has been that one service going down (or even running slowly) can lead to cascading failures where identifying the root cause is a slow painful process. I know that in a well designed system that doesn't happen, but that's the nature of all bugs, isn't it?


Well yes, you still need monitoring to be able to tell that the failure is external, but the key idea is that each team can do it separately, without having to get everyone's buy-in, or instrumenting every call in the system.

For example, we had a batch processing system with pretty relaxed latency requirements, and at some point we were asked to integrate with (internal) service X. The problem is, service X would go down periodically. The solution was pretty simple: a simple centralized error logging service we already had, some asserts on results, and a timeout on all HTTP calls. This works very well, for us at least. Service X still goes down every once in a while, but we can always detect that and explain to our (internal) customers that it is not our fault the system is down. Our customers were the ones who selected service X in the first place, so they are pretty understanding.
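A rough sketch of that pattern (illustrative Python; `call_with_fallback`, `fetch` and the "service X" framing are hypothetical names, not the commenter's actual code):

```python
import logging

log = logging.getLogger("service_x_client")

def call_with_fallback(fetch, fallback, timeout=5.0):
    """Wrap a call to a flaky internal dependency: bound it with a
    timeout, assert on the result, log the failure for the central
    error log, and return a fallback so the batch keeps moving."""
    try:
        result = fetch(timeout=timeout)
        assert result is not None, "service X returned an empty payload"
        return result
    except Exception as exc:
        # Centralized, attributable evidence that the failure was external.
        log.warning("service X unavailable, using fallback: %s", exc)
        return fallback
```

The key property is that a dependency outage degrades one call site instead of taking the whole batch down, and the log line pins down whose fault it was.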

Is it a desirable situation to be in? Nope; in the ideal case, someone would go to the team behind service X and help them make it reliable, with proactive monitoring, good practices, more staffing, etc... But I work in a big org, and each team has its own budget, management and priorities. So the microservices approach is the best we can do to still get the work done under such conditions.


> In the monorepo world, your whole system is broken, it is all hands on board to try to fix this, everyone is stressed out.

No, not at all. And the kernel is a single repo even.

> In the microservice world, you have only one microservice which went down, so most teams don't have to worry.... blame them, nothing we can do

Well, how is that better than just switching the dependency back to the last known good version and shipping, instead of being dependent on a whole different team just to get dependencies fixed _and_ running?

> Unfortunately, a lot of real-life businesses do not have it.

That may be true; however, the benefits of those properties show up in different kinds of development projects, and this is not a question of monolith vs. microservices IMHO. Nor of monorepo vs. non-source binary distribution.


And microservices do NOT require an organized, cohesive team with good communication?

If anything, there is MORE communication.

And how many teams who do distributed systems really know what will happen if a critical service goes down? The system is down - same effect.


The communication is in terms of well-defined and documented APIs, which a microservice boundary strongly enforces.

> The system is down - same effect.

Yes, if it's a service many other services depend on, but not so much if it's near the leaves of the service dependency tree. In that case the system may still be up with reduced functionality.


You can use microservices to decouple things like deployment schedules, build dependencies, configurations, etc.


You mean "duplicate"?


I think the Linux kernel is a special case. For starters, due to the nature of the kernel itself, a lot of components are very intertwined. You'll be hard pressed to find one area of the kernel core in which you can work without at least having an idea of how other parts work. But for other components, such as drivers, you'll find they're far more independent and different teams do work on them, and update them separately even in out-of-tree module builds, mimicking a bit how you'd do microservices in that limited setting.


So is every piece of large software a 'special case'? Besides the Linux Kernel, there's every other OS ever written, office tools, browsers, video games, all sorts of SaaS apps. The idea that microservices solve some problem in that space is bogus.


It is a special case because there's nothing underneath the kernel. Different pieces of the kernel usually cannot interact through anything that isn't the single binary, there aren't more levels of abstraction.

But office tools, browsers, video games, and SaaS apps can benefit from the architecture. Microservices, at least for me, don't mean "different processes running in containers on AWS", but separate "services" that can be run and deployed independently. One easy example is login components: say any of those tools offers the user the possibility to log in with one or more remote services. Instead of putting that code in the same monolith, you could have a separate binary that gets called by the parent and spits out an authorization token. The separate binary can be tested, developed and updated separately.

Of course, you can't apply microservices to everything and have it make sense, but the ideas of separate tools deployed and updated independently are not bogus. I mean, the Unix philosophy could be understood as a kind of microservices architecture, with different independent tools managed by separate teams.


> the ideas of separate tools deployed and updated independently are not bogus

No, but the idea that you should do that because you have a big team is bogus. There are good reasons to break some applications up into services, and this is the least compelling one.


I think that having an architecture matching your organization is not that much of a bad idea. If the organization consists of multiple independent teams with different managers, deadlines and priorities, what would be easier? An architecture with separate, isolated services that communicate over some established APIs and can be updated, tested and deployed independently, or a monolith where just to release a new version you have to get all the teams to agree on it?


IMHO you should still take it the other way round.

What Conway wrote: You can use the development of the software as one tool to learn about the social interactions in the organization.

Only doing microservices and then saying "Hey look, this matches our organization" offers very little overall.

So what if the different managers, their deadlines and their priorities are the cause of many more problems, not only the (exemplary) wrong decision to do microservices?



But that requires discipline.


Discipline, leadership, organization, and, most of all: coherency between developers.

Most orgs do not have these things - especially not at the level of the Linux team. Also, the motives of OSS are completely different from those of your typical business - I personally think this keeps OSS teams coherent/productive far longer than your typical corp can manage.


Mainly because for a very long time, Linus reviewed every pull request.


Not sure if this story is true: Okta used to have a monolith architecture on AWS. All their 300 engineers used to work on the same service, so they had to run all their tests for every commit. The cost? $60 per commit.


When your system is critical enough, those kind of prices are cheap for what you get.

I can't tell you about Okta, but there used to be a (now apparently gone) article by Nelson Elhage about their test parallelization infrastructure. Many thousands of pretty slow tests ran for every Jenkins build, massively parallelized across machines to get an acceptable response time. At the time he wrote this, there was no modularization to speak of, so every test really had to run for every build. Very expensive, but there's a lot of value in getting the equivalent of integration-level tests for almost every critical system automatically.

As a monorepo grows, the technologies used have to change to keep its disadvantages from spiraling out of control: serious modularization of codebases, something like Bazel to make sure code isn't overbuilt, and a testing strategy that finds the right balance between finding problems and keeping expenses in control. Every company with a monorepo ends up walking that road, but the good part is that by the time you have a problem, you definitely have proof that your organization is in a situation where the problem is worth solving.

This is very different from, say, what I saw at a company that had Fred George as CTO and was going all in on microservices with one customer and about a dozen engineers, all of whom wrote in the same JVM language. At that point, all the work that avoids running too many tests is probably going to be a net loss, and even more so when one considers opportunity costs.


... when your pipeline architecture becomes more of a cost sink than development of your actual product.


Sounds pretty cheap, relative to 300 engineers.


Assuming a dev is committing (ballpark) 4 times per day, that's in the region of $55k/year per developer, or $16.5M/year all in. That's not cheap...
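A quick sanity check of those figures (the $60/commit and 300 developers come from the anecdote above; 4 commits/day and roughly 230 working days/year are the commenter's assumptions):

```python
# Back-of-envelope check of the CI cost estimate above.
cost_per_commit = 60      # dollars, from the Okta anecdote
commits_per_day = 4       # assumed by the commenter
working_days = 230        # rough working days per year
developers = 300

per_dev_yearly = cost_per_commit * commits_per_day * working_days
total_yearly = per_dev_yearly * developers

print(per_dev_yearly)  # 55200    -> "in the region of $55k/year per developer"
print(total_yearly)    # 16560000 -> roughly $16.5M/year all in
```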


I would not assume that 300 developers are all committing 4 times a day. That seems very high.


And there you have it... $60 per commit means slowing down your commit velocity! Which means it's longer before changes get tested or reviewed or deployed.

The cost (money and more importantly time) of running tests fundamentally changes the development process.


More commits are not better.


Unless you are a PHP developer


4 commits per day per developer is exceptionally high, unless these are tiny, tiny changes.

I'm a very productive engineer on my team and I average less than 1 commit per day.


You only need to run CI on production, and you can squash commits. Generally a release is once per week, and at Okta's level you have a release engineer.


300 developers' work being crammed together into a weekly release without having their contributions pass CI first sounds absolutely hellish for whatever poor bastard has to track down who broke the build.


There's tooling like bors that will rollup commits together and run CI on them before merging to master. If it fails then the commits will be retried individually.
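For illustration, the rollup-then-retry behavior described here could be sketched like this (`merge_queue` and `run_ci` are hypothetical names for this sketch, not bors's actual API):

```python
# Sketch of a bors-style merge queue: run CI once on a rollup of pending
# commits; if the batch fails, fall back to testing commits individually.
def merge_queue(commits, run_ci):
    """Return (commits that passed, number of CI runs used)."""
    runs = 1
    if run_ci(commits):          # one CI run covers the whole batch
        return list(commits), runs
    merged = []
    for c in commits:            # batch failed: retry each commit alone
        runs += 1
        if run_ci([c]):
            merged.append(c)
    return merged, runs

# Example: commit "c" is broken, the rest are fine.
ci_passes = lambda batch: "c" not in batch
merged, runs = merge_queue(["a", "b", "c", "d"], ci_passes)
print(merged, runs)  # ['a', 'b', 'd'] 5  (1 batch run + 4 individual retries)
```

When most batches pass, the happy path costs one CI run for many commits, which is where the savings come from.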


I suppose so, but that cuts down on the number of CI runs when merging to master because the vast majority of commits being merged to master don't fail the build.

In a typical bors workflow, that holds, because you do a CI run before attempting to merge, so commits don't break the build except in the rare case that two commits that are individually okay combine in a way that breaks.

If you get rid of the individual run before merging, your combined runs will hardly ever pass, so it won't cut down on the number of runs.


Yes, I suppose I could be taking the anecdote too literally. It's much more likely they are only running all test on push to main / against pull requests / on code review.


That's what, 15 minutes of FTE salary+overhead?

Sounds pretty reasonable.


60x4x8x5x48 = $460,800/year full time salary+overhead?

I'm not sure that's right.


That's $230K a year and 100% overhead.

Sure, $230K a year is on the high side, but Okta is headquartered in SF, so it is plausible.

I cannot find a documented source for overhead, but in private conversations multiple people told me that 100% is pretty reasonable amount, and often is even higher.
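For what it's worth, the arithmetic can be checked directly (assuming the thread's figures: 8-hour days, 5-day weeks, 48 working weeks, 100% overhead):

```python
# Back-checking the "$60 = 15 minutes of FTE salary+overhead" claim.
rate_per_hour = 60 * 4                    # $60 per 15 minutes -> $240/hour
fully_loaded = rate_per_hour * 8 * 5 * 48 # yearly salary + overhead
base_salary = fully_loaded // 2           # assuming 100% overhead

print(fully_loaded)  # 460800
print(base_salary)   # 230400
```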


> so they had to run all their tests for every commit

Did they _have_ to? They could have run tests for affected transitive dependencies, cached build artefacts, randomly sampled integration tests on master. There's plenty they could have done to reduce the bill.


Running the same tests using micro-services would be more expensive.


Microservices are not a way to scale an organization, even if some organizations used them that way. There is a big problem with this way of scaling: the sense of ownership becomes a sense of defending your castle (the team's microservice code) from invasion (external developers trying to add features or fix bugs).

Microservices were also presented as a solution for encapsulation. But if you cannot write good encapsulated code in a monolith, what makes you think microservices are the answer? http://www.codingthearchitecture.com/2014/07/06/distributed_...

Microservices (or just services) are a solution for having more bandwidth for a set of features without needing more of everything else, and for preventing some parts of the code from negatively influencing others.


> It is easier to start with a monolith, find the right design and then split on the boundaries [..]

This reminds me of the discussion on let pedestrians define the walkways[0] [1]

"As time goes on, we get smarter. We learn more about ourselves or our customers — what we or they really want. Therefore, we’re at our dumbest at the beginning, and at our smartest at the end. So when should you make decisions? When you have the most information, when you’re at your smartest: as late as possible."

[0] https://sive.rs/walkways [1] https://www.theguardian.com/cities/2018/oct/05/desire-paths-...


Feel like the conversation went something like this:

- We need this thing done by <insert date>
-- That's impossible
- What if we hire more ppl?
-- They'd just step over each other
- Let's divide things up then
-- ok

* six months pass by *

- Why isn't it done yet?
-- We had to onboard and train all those new hires, which sucked up a lot of dev time, plus the new architecture added complexity and increased communication costs
- How can we make our onboarding process faster?

* sigh * and the cycle repeats itself


"What one programmer can do in one month, two programmers can do in two months." Fred Brooks


A technology reason to have services is that you can have large RAM/SSD caches on services and move that data off of your front end monoliths, shrinking the footprint of the monolith and keeping more data accessible quickly (although small caches for the very hottest data might be good to retain on the first tier). That data probably shouldn't be peeled off though until there's a software team there to maintain the new service, and it should really be living in its own data pool on the backend (you shouldn't be writing a service to front a single database table, you should be writing a service to front an entire database). If the data is large enough and important enough to be moved to a service, and there's performance wins by doing so, and if there's a team that can manage that service, then it probably is a good idea. A single team that is managing a monolith busting it up into a dozen services that they still manage, and all the data comes out of a single database server is probably the wrong way to do it.

There might be some other cases where you have hard business requirements and such for some software to be able to ship without other software and need to build clean lines and services so features can be extracted and probably some other edge cases, but the larger point should be that extracting the service should almost be demanded, both technologically and organizationally. If you have a choice and it isn't clear, then a service is probably the wrong thing. Services also really shouldn't be "micro" until you're an absolutely huge company and you can burn entire teams on small edge cases.


A better solution would be to build modules independently and have a single service use each module. Then things are unified, each team has a client it must deliver to, and issues must be fixed in order for them to be released in the final product. How are microservices better? Often each service is out of sync with each other, one service will upgrade and break another, that’s not possible with a single source and multiple modules.


Do people not version their API requests?


Yep, and it gets worse: you’ll have teams add new breaking features or deprecate features without letting the other team know. Hey, their tests pass, so what do they care? I’ve worked with both strategies, and modules over microservices are easier to integrate, ensure integrity, and reduce the probability that stuff breaks in production. Often microservices develop to a set of requirements and never consider the client that will be using that service. With modules that can’t happen.


> Microservices was always a solution to the organisational problem of getting more developers working on a system at once.

I’m not sure that that’s entirely true. To be sure that’s an aspect, but it’s also conceptually quite reasonable to desire the best tool for a particular job rather than employ a kitchen sink for everything. Separation of concerns is a commonly admired quality in code, and it makes sense sometimes to have this separation be at the repo level.

I don’t jibe with the “only organizational, not technical” characterizations this thread is rife with. It’s now hipster to say the cool shiny thing is lame, and that blinds people to some of its utility.

> It is easier to start with a monolith, find the right design and then split on the boundaries than it is to make the correct microservices to begin with.

This I definitely agree with in many, perhaps most, cases. But of course inertia, especially in organizations, could have future you lamenting such a decision you now have no easy way out of.


> Microservices was always a solution to the organisational problem of getting more developers working on a system at once

In other words, it's a great way to ship your org chart.

I've only ever seen a handful of examples where the microservice boundaries were set regardless of how the actual teams are organized in the reporting structure.


You'll ship your org chart whether you try to or not. There are two questions: a) will you architect your software to make this process (relatively) efficient? and b) to what degree are you willing to change your org chart to improve your architectural boundaries?


I think one can control the degree to which software architecture reflects the org chart. To take an extreme example, Chrome is primarily a single, giant (~200 MB) binary. I'm sure the layout of the source repositories reflects the Chrome org chart. But thankfully, they always had the engineering discipline to produce the architecture that was best for performance (a monolith, albeit a multi-process one) without letting it get out of control.


Where I've found "microservices" to be sensible is where you have multiple different clients to power (i.e., website, iOS, and Android). They all consume the same JSON and render natively.

Otherwise, microservices just turn what was once easy into something very complicated.


> Microservices was always a solution to the organisational problem of getting more developers working on a system at once.

Microservices was always a solution to the problem that SOA got swallowed by “implementing XML heavy standards” and lost attention to its architectural principles.

(SOA was originally, and microservices a return to, a solution to reusing components and isolating faults so you need fewer, not more, developers and support personnel dedicated to any given system or collection of systems.)


>It is easier to start with a monolith, find the right design and then split on the boundaries than it is to make the correct microservices to begin with.

Not pleasant, but true.


But also, according to Conway's Law, you end up with a product with leaky abstractions that reveal the underlying micro-services and org structure.


Linux is a monolith. With more than 25 developers I believe. Seems to be doing OK last time I looked.


> Microservices was always a solution to the organisational problem of getting more developers working on a system at once.

Wasn't Continuous Integration (CI servers) created long ago to solve that?

Where developers can test and release their own changes, incrementally, without stepping on each other toes?

What is next?


CI helps with releasing fast; it doesn't do anything about not stepping on each other's toes. You can imagine a case where two people are working on the same part of the codebase and having merge conflicts everywhere. CI doesn't help with that, though it does make it a little bit less stressful to resolve those conflicts.

Microservices force boundaries, which in turn let you scale the number of teams to the number of service boundaries, and now you're no longer stepping on each other.

With the disclaimer that I'm an embedded programmer so I don't have a dog in this fight, my reading of the literature suggests it's best seen as an organizational tool rather than a performance tool.


> In software engineering, continuous integration (CI) is the practice of merging all developers' working copies to a shared mainline several times a day. [1]

That certainly does help with avoiding conflicts. There are only so many conflicts that you can pick up in a few hours of work.

[1] https://en.wikipedia.org/wiki/Continuous_integration


Come to Erlang/Elixir (and OTP), you get the best of both worlds:

  - a mono repository
  - a single codebase for a single system
  - your micro services are supervisors and gen servers (and a few other processes) in a supervision tree
  - you decide which erlang node run which apps
  - your monolith can scale easily thanks to libcluster and horde
  - ...
Also, there is the midpoint between monolith and micro service, and this is called Service Oriented Architecture (SOA), you could have:

  - a DAL (Data Abstraction Layer) service
  - a Business Logic service (talking to the DAL)
  - an API (talking to the Business Logic service)
  - a Frontend (talking to the API)
Your API (or gateway, or whatever you want to call it) can serve as glue for third-party services (like Stripe, or anything unrelated to your business).

Microservices are a solution for an organizational problem, not a tech one. You need multiple teams to work on the same system without blocking each other. This is a solution for huge corporations, not your two-pizza-team startup.


This has always been a beautiful draw of the BEAM ecosystem for me, but during my exploration of the space I found that most people were deploying BEAM applications the same way as any other, say, Rails app, and I couldn't find a "getting started" of sorts to deploy distributed BEAM apps the way you describe. I'd be interested in resources like that, if you have any.


A few months ago I started a series of articles about Elixir, K8s, and libcluster[0], but unfortunately I have not had time to continue it yet.

If you're using K8s, what I would suggest is:

  - create multiple releases of your Mix project[1] with specific OTP applications in each
  - one Docker container per release
  - one K8s Deployment per release
  - one headless K8s Service selecting the pods for all your deployments
  - use libcluster for automated cluster formation
Then, with node affinity/tolerations, you can control on which physical node you'll run it (if you want).

If you're not using K8s, you can use an Ansible playbook to deploy each distinct release on the target host, and you can use libcluster with a static cluster configuration. This will work the same.

[0] - https://medium.com/@david-delassus/elixir-and-kubernetes-a-l...

[1] - https://elixir-lang.org/getting-started/mix-otp/config-and-r...


I've worked at two companies using Elixir/Phoenix, and both of them treated it exactly like a Rails app; there was no discernible BEAM-yness about it other than how the library worked under the hood.

I asked the more senior Elixir devs if we should do something to take advantage of the platform and was met with shrugs.


Hehe. Technically cheating though, nothing in Erlang will ever be really a monolith. But it's a great solution for the space, it does require a bit of unlearning before you can be really productive with it.


"Also, there is the midpoint between monolith and micro service, and this is called Service Oriented Architecture (SOA), you could have:

  - a DAL (Data Abstraction Layer) service
  - a Business Logic service (talking to the DAL)
  - an API (talking to the Business Logic service)
  - a Frontend (talking to the API)"

For me, what you described is just a common monolith.

I've been developing for 8+ years and have seen this model at every company where I've worked. I really thought this was just the "standard" way that everybody used to develop. I've also worked on a couple of CQRS architectures but the "layering" is still very similar to what you described.

I'm curious now: what do other monolith architectures look like, if not like this?


In a monolith they may be modules within a single application/binary instead of different services.

Nobody said a monolith can't be split in modules :)

IMHO, a well-written monolith is like a microservice architecture but without the network layer in between boundaries.


Warning: Compilation times suck for Elixir/Erlang. I'm frustrated almost daily when I make changes to my code and have to wait for 66 .ex files to compile after changing one single line of code in a heex template.


Why do the other files get recompiled? They should be recompiled only if there is a change in them.


I have no idea. That's what I'm seeing when working with my project.

Literally add one character to one of my liveview modules:

    Rebuilding...
    Compiling 67 files (.ex)


It's worth your time to investigate this, it's fixable. Check out mix xref to investigate dependency chains between files in your mix project.

In the project I work on, it turned out a module `use`d by almost every controller included a function (totally unused, btw) which, through a little bit of indirection, caused a circular compile-time dependency between all of them. So any change to any controller caused hundreds of files to rebuild.


Never heard of xref. I'll take a look thanks!


Probably macros.


This was improved in a recent release. Did you try with the latest version?


Ha! Came here to say this but you beat me to it.

https://news.ycombinator.com/item?id=31328580


The monolith vs microservice debate always seemed so misguided to me. It seems exactly like vim vs emacs. Or ruby vs python. The arguments are always the same.

* "In vim I can do this in 3 keystrokes, in emacs it takes 5!" (where "this" is some vim-idiomatic thing you'd never do in emacs).

* "Ruby leads to bad programs, I have inherited a legacy ruby codebase and you wouldn't believe what they did..." / "Python leads to bad programs, I have inherited a legacy python codebase and you wouldn't believe what they did..."

* My team took a python/ruby app and rewrote it ruby/python, and now it's 99% faster and our productivity is way higher!

What I'm hinting at is this: you can write a bad or good microservice or monolith. The rules are different. You'll have different frustrations and tradeoffs. You'll have to play to the architecture's strengths and avoid its weaknesses. You'll NEED institutional standards to keep people from doing the wrong thing for the architecture model and making a mess.


The big issue is microservices underwent a major recent hype cycle and are applied when they are a large net negative. All of your holy war examples mostly boil down to personal preferences. Microservices vs Monolith is a major architectural decision with widespread organizational impact. Ruby vs Python only matters if you already have a team that only knows Python for example.


Do people argue about productivity and performance when comparing Ruby and Python? I haven't seen that myself. Maybe for toy examples. For 10s of developers-scale applications, I sometimes see comparisons of tooling/libraries, eg Rails vs. Django.

But mostly I see comparisons with Ruby/Python versus compiled/more mature languages like Java/C#. It generally doesn't make sense to contemplate a rewrite unless you're moving to something significantly different, and Ruby and Python aren't really too dissimilar.


I think the biggest pain point for me is the almost constantly-changing definition of "microservices" - at work, whenever someone says "microservices" they really mean a services architecture. I can largely get behind that. Because when you say services architecture, to me anyway, that doesn't mean everything has to be a "service". Some stuff can be on the same machine if you do not have a firm point of separation between the services. Maybe they will share some code eventually... ok... put them together, who cares?

Last time I asked our ops guy to define microservices for me, he couldn't - instead he told me to read a 400 page book by some popular microservices preacher (who of course makes a lot of money by consulting for companies looking to use or currently using microservices ;))

If you cannot explain the general concepts to me, maybe you do not know what it is, and I think that is a big part of the problem.


Yeah, I think the most palatable definition of microservices is just SOA with no arbitrary lower bound on how much code a service needs to justify itself.


SOA usually involves a bunch of separate services, but all connecting back to the same shared RDBMS. That's the biggest shift in microservices - no shared data stores.


Yeah with some extra best practices thrown in - most of which I strongly agree with!


It's SOA with specific patterns and disciplines, centered primarily around the concept of the 'bounded context'. But if you want to just start with SOA and learn those patterns as you go, or dig into the book/ blog posts on bounded contexts, go for it.


"Micro" is a very unfortunate part of the name.


The first time the phrase was coined, it meant SOA using REST instead of SOAP.

Today I don't know wtf it means. I guess it depends upon who you ask.


Okay, Craig, you can have your monolith back when you can get your engineers to care in the slightest about keeping the codebase organized instead of turning it into a big ball of mud. I'm talking encapsulation. I'm talking well-defined interfaces that aren't glued to specific implementations. I'm talking some sort of dependency inversion, dependency injection, service locator patterns, any of that. I'm talking real use of the single-responsibility principle, all that software-engineery stuff that everyone ignores. Because I understand what you're getting at 110%, but by and large, all this stuff doesn't happen otherwise.

Until then you're going to have these things forced on you by your local Architect, forced on you by running a bunch of separate processes on separate containers with DNS as the cluster's service-lookup framework.


I'd rather have one codebase that's a big ball of mud than 10 mini balls of mud that combine to make one immensely complicated ball of mud. At least a stack of messy method calls is much easier to debug than some data that travels through 5 different systems before it gets to the user.


How about, you can have microservices back when the engineers/architects care in the slightest about defining domains between services and having a direction or overall idea for how they communicate.

It's easy to get to a point where you have a big ball of mud of 100 microservices where each new feature touches 20 of them, and you just hack in whatever new APIs are needed everywhere to push a new feature out (but probably buggy, due to async state and race conditions that no one was trained enough to figure out up front).

All that is pretty equivalent to what you talk about with monoliths.


It's just about need. Most people don't need microservices. Let me repeat that. Most people don't need microservices. It's only relevant at scale. The scale I'm talking about is organizational scale and then potentially technical scale but it's mostly about solving people problems by enabling independent development of products and features.

Developers are their own worst enemy. They love shiny buzzwords and using unnecessary tools and concepts just to say they did. Conceptually you can't blame architecture patterns, that makes no sense. Blame those who choose to adopt patterns for the wrong reasons.


What would be the difference between microservices and a monolith with separate folders, separated by team? If teamA calls teamB's code, they have to import. If teamA needs a change to teamB's code, they submit a PR?

Microservices are not needed for scaling an org.


Have you ever shipped a monolith across 10 product teams? Everyone is working on different features, different branches, etc. The monolith becomes the bottleneck and you end up architecting processes to cater to its weaknesses where it should actually work to your strengths. Microservices inherently build that in, not just into the development model but the entire workflow from source to running and beyond. People are often disillusioned by the idea of the breakdown of the source code but fail to understand that the technical platform and processes in the pipeline enable independent product teams to move faster.


I use one today with more than 20k engineers.


SLAs per product.


metrics per product – no micro needed


A single RDBMS + stateless middle tier is still the best solution for 99% of applications out there.


Agree.

But even fighting for an RDBMS in 2022 is getting daunting, as so many people will immediately start pushing for document DBs, GraphQL, etc. Seriously - I feel like SWEs have no idea the powerhouse a modest MySQL/Postgres instance can crank out while perfectly maintaining their data (ACID, replication, etc.).

Everyone wants to use tools that were designed for FAANG scale regardless of org size. Anecdotally I've fought this hard because it's almost always at incredible detriment to the business's needs.

I find that modern devs balk at "single RDBMS + stateless middle tier", but only for selfish reasons like resume-driven development. This pattern is burning me out more than anything else in my career.

Again, all this is personal anecdotes.


I've found the largest culprit here is that people don't understand how many orders of magnitude are between them and FAANG scale. It's akin to how people don't understand the difference between a millionaire, a billionaire, and Elon Musk.

So they'll see some numbers that are on the upper end or beyond what they're used to seeing, and to them that's big scale. How do you solve that? Well, you reach for the tools other people have used when they have "big scale". All the literature says you can't just increase your instance size when you have big scale, so why would you try that?

And yes, that's wrong. But it drives a lot of the mentality.


"If you have to ask if you have big data, you don't have big data. When you have big data, you know you have big data."

Seriously, you can launch a server that has almost a terabyte of RAM! Is your business doing enough to make that database even blink? "My sources say no".


I have worked on several not-FAANG-sized-but-largish B2C startups, all of which were forced to spend a very large amount of money, time, and people trying to migrate things off the one central RDBMS when it could no longer be scaled vertically. This is a real problem that you will likely run into well before FAANG size, and it really does hurt, and I think it gets minimized by the "just use one Postgres for everything" trope. You should probably think about it at least a little if your planned business trajectory is into the 1 million+ users realm.

I'm not saying you have to use NoSQL or whatever, but some small future-proofing (e.g. introducing different schemas/permissions for your different data domains inside your monolith) can make future database separation much easier.

If you're doing a B2B app its a different story, much harder to hit the limits of a big RDS instance.

The real answer, as in most software engineering is "it depends"


How long ago did these startups hit the limits of vertical scaling? Those limits are always increasing. Expensify's famous post from 2018 [1] suggests that it should be feasible for one monster machine to handle millions of concurrent users.

[1]: https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...


There's one that I'm still working on today :) For practical and all the other usual reasons, most people are going to rent cloud DBs (like RDS), not run bare metal. Also, theoretical/benchmark performance is a very long way away from the actually achievable performance when you have 100s of developers actively evolving a codebase against the database. Optimal query usage etc. in that scenario is an impossible pipe dream; the best you can do is "not crazy", unless you grind all development to a halt behind gatekeeping DBA types.

There are many other operational challenges to single very large database instances. Upgrades, backups, migrations, etc. all become way riskier and harder when you get into the mega-DB range. The number of total outages I've seen caused by migrations gone awry, query plans gone awry... RDBMSs are incredibly complicated beasts, and at huge sizes they can be very unpredictable in a way that many smaller DBs are not.


> The real answer, as in most software engineering is "it depends"

Of course. It's not as if only FAANG has scale issues.

However the set of people who believe they have scale issues is orders of magnitude larger than the set of people who have legitimate scale issues. People should start by assuming they're not in the second category is my point.


Micro-services are about partitioning based on functionality and features - so I think more often than not, 99% of the microservices use 10% of the storage and 1 microservice uses 90% of the storage, and you're back to solving the problem of horizontally scaling your storage regardless of microservice vs monolith.


That's sometimes true, not always, but a good observation. Even if it is true, having that 90% use case behind its own service often allows a lot more flexibility to scale for that particular storage problem, without having to apply it generically/everywhere. E.g. time-partitioning, cold archiving, sharding, whatever is suitable for just that use case.


There is another aspect to this - if you're working on a startup that will scale, if successful - not to FAANG scale, but let's say many millions of DAUs and thousands of QPS on the servers - you also want to build some kind of future proofing into your architecture.

And then begins the dance of balancing what we currently need (we have maybe 100K DAU and launched recently) and what we predict we might need if things go right. So you don't need to go full on K8S with some crazy sharded database and 50 microservices on day 1, but you also don't want a PHP/MySQL monolith if you're building something that works like, say, Twitter, because you'll have to scrap ALL of that under huge pressure if things start to take off.


> 100K DAU

> but you also don't want a PHP/MySQL monolith if you're building something that works like, say, Twitter

This is a lot of whiplash in regards to discussing size.

I just want to say...

100K DAU with a PHP/MySQL monolith is 100% possible - I'm certain a lot of people here have achieved this without much problem. Things like Nginx, PHP-FPM, reverse-proxy-caching, load balancing to stateless application servers, read replica, offloading static assets to CDN, blah blah... I digress but you can 100% hit 100K DAU with a traditional PHP/MySQL monolith.


Yes, my point is it will hold, but it won't hold 50M users, and if you take off you'll be scrambling to rewrite it from scratch. So a flexible architecture that is simple but can scale (as an architecture) might be a better choice than the simplest possible design you can make.


Could not disagree more. If you are a startup that is struggling to survive, it is an absolute waste of resources to overengineer your app for potential scale. It's the VC way to burn all of the money hiring talented infra engineers to manage this, but it is not pragmatic at all. The exception is if your startup is literally a high volume data processing platform or something similar.

That simple PHP/MySQL or rails monolith will scale much further than you think by throwing more servers at it without having to hire an army of devops engineers. Solving problems you don't currently have when your company is not profitable is a waste of money.


Maybe the monolith shouldn't use PHP and MySQL. But what about, say, a high-performance managed runtime (e.g. JVM or CLR) and SQLite? Expensify's findings [1] suggest that with vertical scaling, SQLite can go very far.

[1]: https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...
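To make the vertical-scaling point concrete, here's a minimal Python sketch of the kind of SQLite configuration usually used for that (WAL mode so readers don't block the writer); the table and filename are illustrative, not from the Expensify post:

```python
import sqlite3

# WAL mode lets many readers proceed alongside a single writer, which is
# one reason SQLite scales further vertically than its reputation suggests.
conn = sqlite3.connect("app.db")
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA synchronous=NORMAL")  # common WAL pairing: fewer fsyncs

conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.commit()
print(conn.execute("SELECT count(*) FROM users").fetchone()[0])
```

One process, one file, no connection pooler to operate.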


Would you agree that caching fits into the notion of 'stateless'?


Yeah, as long as it has very little complexity.


plus async tier


This is one of the big reasons I got into Elixir.

The way that you're encouraged to architect your code makes it really easy to separate a specific piece(s) if you need to. The functional, no-side-effects approach combined with Elixir's ability to easily communicate between nodes means that if I need to separate a particular set of functionality to certain servers...it's just moving things around.

If I take a function call_me(arg1, arg2) and it returns a result without side effects, it's no different for me to say call_me(node, arg1, arg2) because it's still going to give me the same result just from a different server.

This flexibility means that I can comfortably built a monolith and not have to worry about having to untangle it later if I need to. I love it. It gives me long term peace of mind with short term productivity.
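A rough Python analogue of the Elixir idea above (the names and the "remote executor" are made up; BEAM does this transparently, Python does not): because the function is pure, "run it on another node" is just an extra routing argument, and the result is identical either way.

```python
def call_me(arg1, arg2):
    # pure: same inputs, same output, no side effects
    return arg1 + arg2

def fake_remote_exec(node, fn, *args):
    # stand-in for shipping the call to another node;
    # purity guarantees the answer doesn't depend on "where"
    return fn(*args)

def call_on(node, fn, *args):
    if node == "local":
        return fn(*args)
    return fake_remote_exec(node, fn, *args)

# local and "remote" invocations are interchangeable
assert call_on("local", call_me, 1, 2) == call_on("node@web2", call_me, 1, 2)
```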


I wonder if anyone who wants to bring back the monolith has ever worked on a true large monolithic codebase? Not something that's like 5-10k lines of code excluding frameworks, but monsters that are 100k+. Where test suites take an hour to run end-to-end and you've got 50+ devs all issuing pull requests in the same codebase? I feel like webapps usually have a sweet spot in terms of size and logical reach. This whiplash hype cycle of "X" and "anti-X" just exposes the ever lingering problem of letting blog posts on HN determine your architecture decisions.


I've worked on large monoliths (100k LoC is not "large") and have a strong preference for them because I've also worked on large microservice architectures. A monolith with a strong architecture and development process works well. Microservice architectures are often a fig leaf for poor architecture and development process, while introducing new problems and increasing development effort. It is always less expensive in the long run to fix the architecture and process problems than to sweep them under the rug with microservices.

I understand that microservices often become necessary in organizations when they are constitutionally incapable of addressing the root issues directly. But we shouldn't normalize introducing microservices so that we can ignore poor architecture and process, which underlies the majority of cases I see.


At my first job I inherited a 250K LoC monolith which I had to continue to develop and also operate. It wasn't great but I got by. I don't think I could have managed if that 250K LoC had been strewn across a dozen microservices with all the accompanying complexity and extra degrees of freedom in deployment and operation.

SoA is superior when you have large numbers of devs coordinating and working on the same product. For smaller teams monoliths seem to be the way to go.


Sure. Used to be, that's how it was done. Some things sucked about it, others were easier. For example, not much in the way of versioning problems. On the other hand, if somebody broke the release, nobody's new feature made it live. You also had to allow for extra testing time... continuous deployment is not a great strategy in this environment. We had a week of integration testing for release candidates, unit tests, smoke tests, data integrity tests, UI tests, performance tests, etc. We also had a QA team that signed off on releases. It was actually pretty rare that you collided with other developers because teams were focused on their own areas of the codebase.


Usually codebases are organized into modules/classes/etc and most test suites can run specific test cases (or groups of tests, specific domains, etc).

I think if you can't do that in a codebase, then one has bigger organizational problems than the technology itself.

Also assigning many developers to work on the same set of files is a recipe for disaster (if said developers cannot get along or organize themselves) while working on the very same place/file.


I have. LoC was in the millions. Half a dozen languages too, for the different parts of the monolith. Worked on by a team that peaked at 100 or so. Multi-hour build times too. SVN too!

Monolith is still better in my view, despite having experienced that.

The key IMO is to have a clear and singular vision of how all the different parts need to fit together. If you let each little clump of people do their own thing, then you will have chaos.


Ten years ago I worked on a monster that was half a gig on checkout. Assets, everything. It was an e-commerce site. And it was beautiful. Single set of dependencies, easily tested, one CI/CD pipeline.

It was an egregious Spring/Struts1 app, and it was a breeze to develop on. Fixed my first bug on day 2 of the job. With microservices, would have taken me 3 months.


2G here.


> I wonder if anyone who wants to bring back the monolith has ever worked on a true large monolithic codebase?

Probably not, that's really the point :)

Although I think the limit is significantly higher today. Monorepo tooling is orders of magnitude better than 10 years ago.


Can you recommend something good in that area of monorepo tooling?

I read about Bazel, which looked really nice. Any other interesting tools to check out?


At a certain scale monoliths will no doubt break down but I believe the author and many other developers are bothered when monoliths are an optimal solution for their scale and problem space but end up replaced by one that's suboptimal.


I have as well, and honestly I'd classify something that's 100k+ LOC as a "mega-monolith" vs. what devs would traditionally consider a "monolith"

You don't get to that LOC count without huge teams and decades of time. Any project of that timeline/complexity/size should constantly be having functionality split off so that organizational units can be better grokked/understood. To put it a different way, when projects hit this sort of size I have seen them naturally shed functionality that makes sense to be external in a service-based development strategy.

These are the situations where it 100% makes sense for microservice architecture as you've hit the social/technical complexities where breaking things into organizational units makes sense.


Wait, what? 100k+ LoC is more like a "two-pizza" team and a few years, empirically. And this is in C++, a language where LoC don't flow nearly as quickly as in other languages. Even extremely complex code bases (think database engines) are entirely manageable at this scale and beyond without introducing microservices. Monoliths will scale into millions of LoC with little loss of effectiveness if your code base and processes are setup correctly.

You are describing the consequences of poor architecture and process, not anything intrinsic to software projects.


> empirically

Right, and our empirical observations over these topics are different.

I just read your profile and frankly the projects/industries that we are involved in are very different, only confirmed by throwing "C++" out there - the closest I get to that is some hobbyist C development.

I apologize for agreeing with the parent comment's generalization of 100k LOC being a "mega-monolith", along with what I added about team sizes and decades-long timelines. Empirically and anecdotally this is what I personally have observed, specifically in regulated fields, i.e. banking, payments, and education. This is a perfect example of what happens when I don't include the word "anecdotal" in my HN comments...

Developers from different industry niches have very different ideals in regards to what a "monolith" is, or what a "two-pizza" team is capable of over time. I find your "wait what?" comment to be a bit dismissive, only to end your comment directly telling me what I am describing which clearly implies I don't know what I'm talking about.

We likely just have different experiences with project complexities, sizes, teams, etc. Sorry if this comes off as aggressive.


I feel like I am going absolutely nuts reading this thread. 10K LoC is 2-3 months of work for a single person hobby project. 100K LoC is maybe a 2-3 person dev team for 2 years. That's assuming a refactored codebase with no copy-paste repetition. "Append only" codebases with heavy copy-pasting can easily get larger than that in shorter amounts of time.

Huge teams and decades of time gets you to 1M+ range, not the 100K+ range.


In a large corporation with tons of people constantly reinventing the wheel for no reason only for resume bullet points... where you spend two months on test specs before putting down a single LOC because "TDD", where you have to work with regulators, auditors, etc...

I feel weird defending myself but... yeah. This is actually my experience.

I get that teams at FAANGs/startups can typically move faster but for your run-of-the-mill corporate America development teams people do not move that fast.

Personally, I have only moved that fast at startups which is why I typically find myself at them.

Again, this is anecdotal to my personal experience.


That makes sense. It just shows how much variation there is across different areas.


I only wish the monoliths my customers were dealing with were tiny 100k LOC projects. I've seen 5M+ LOC nightmares filled with all the terrors you'd expect, like unused libraries and code that looks like it should be unreachable dead code but actually isn't, which creates security issues.


I relate so much with this article. I've had a previous experience with a backend monolith repo, and these days I have to deal with a backend that has 20+ repos. It is hell. Duplicated code, duplicated logic, duplicated tests, duplicated settings, a hell to introduce newbies to the architecture, async calls to external basic APIs that could've been just simple method calls.

I think that the one major disadvantage of having a big monorepo, is that with those multiple entry points, you might end up with a bunch of unused dependencies. But even that is manageable I think: you can have different package dependencies definitions whilst using the same codebase.

I've always worked with small teams (max up to 5 or 6 developers) and that's another point in favor of monorepos. I understand that big companies might want to have different teams working on different repos, for organisation reasons.


They don't have to be micro - they can just be services.


On the NodeJS / React ecosystem - no one really wants you to do monoliths.

NextJS, for example, doesn’t let you manage service lifecycle methods (unless you write a custom server - which they explicitly warn you against). NextJS wants to be special, that’s dumb and it sucks (well, in the context of using it to drive your SaaS platform, it’s pretty smart).

Remix, for example, wants to control your client/server API calls, so you’re hard pressed to use other tools like GraphQL, and incur the risk for growing into microservices if you need it. Same story as NextJS about being special and probably driving SaaS platform sales.

Amazon just makes you manage an alphabet soup worth of products, which are all pretty expensive - unless you want to just lambda it… meaning custom runtimes and … back to microservices.

Point: microservices aren’t just driven by your org anymore, they’re pushed by vendors.


Hey, Lee from Vercel. I lead the Next.js community. We do encourage folks to use monoliths, if they'd like! That's partially why we believe in Turborepo so much (happy to share more on that if you'd like).

Are there specific lifecycle methods you want to tap into with Next.js? Open to feedback there. The reason the "custom server" is not recommended (it's poorly named, I'd argue it's an "ejected server") is because the main use cases folks were ejecting for have since been merged into Next.js core (like internationalization).


Hey Lee – imagine that I've got a database I need to initialize on startup / teardown on shutdown. Where can I do this? It's inherently an async concept. There are no hooks.

The docs for Custom Servers [0] begin with a bunch of language and warnings strongly encouraging you to forego this route. The consequence? I don't know – it feels like it's not maintained, and maybe won't be there between major versions.

[0] https://nextjs.org/docs/advanced-features/custom-server


What are you looking to do on server start and shutdown?


Isn't an average next app pretty much a monolith? It gets split into lambdas when you deploy it, but the development experience feels monolithic to me.

If I want to use some function, I often just import it and call it directly, instead of calling one serverless function from another over http. That means the function gets bundled into a few lambdas, but so far, I've had no issues with that. The monolith abstraction hasn't really leaked for me.


I'm all for a monorepo – or to a lesser degree, utility functions that are re-usable across components. That's not what I'm talking about.

Running NextJS as a handful of lambdas is fine, but you still likely need to initialize other services in the background (like databases)... otherwise you have a monolithic front-end... as in: "my monolithic front end talks to my monolithic back end, so I really have two micro/macro services, and maybe I should investigate breaking the FE/BE down into smaller components".

sure, you can use pgbouncer, etc. but it becomes sorta cost prohibitive to run a few apps.


Monolith + a database, that's always implied, isn't it?

How you make the database connection work with serverless connection requirements seems like a minor issue. I haven't used it but it seems like next + prisma (+ tailwind lol) is the default stack of tutorial writers these days.


"How you make the database connection work with serverless connection requirements seems like a minor issue."

It's really not. See... databases like PostgreSQL aren't designed for an infinite number of simultaneous connections (doing things like, checking for a logged in user, and auth). This is where something like pgbouncer comes in – that at least takes care of transaction boundaries – because we're a monolith, remember?

Now you're also setting up the state of your application as global variables. Seems like a pretty poor choice, and now you also need to figure out the gotchas with HMR here.

Let's not even get into the fact that a lot of code isn't meant to be (or cannot be) bundled by webpack. So now you're resorting to a custom Webpack config to exclude certain modules. [0]

On AWS, pgbouncer is called "RDS Proxy" and runs you $11.16/mo. [1]

That's not even getting into the cognitive overhead of just "rolling out an app over a weekend" now being a weekend adventure in and of itself – again managing an alphabet soup of services either all within AWS or between AWS and Vercel.

[0] https://nextjs.org/docs/api-reference/next.config.js/custom-...

[1] https://aws.amazon.com/rds/proxy/pricing/?nc=sn&loc=3


NextJS is an elaborate advertisement for Vercel's "serverless" cloud. I really can't feel sorry for anyone at this point that gets suckered into their sham. As a framework, it's also going to force you into a specific pattern.

Skip Next and go straight for Express. NextJS does nothing useful that you can't easily do yourself, and only makes life difficult down the road.

> they’re pushed by vendors

yes, Apollo with GraphQL, Vercel, etc. It's a huge cottage industry. But this is how IT has always worked. Just look at some of the garbage Oracle and IBM have been pushing since the '80s. If your org is falling for a slick sales pitch or the latest Google cargo cult, then that's on them. If you're in IT, then it's your responsibility to educate the execs on what they actually need. If they would rather listen to the sales guy over at Oracle than you (often likely), then it's time to move on.


Hey, Lee from Vercel – I lead the Next.js community and am focused on making sure Next.js works great self-hosted (e.g. Node.js server, Docker app, etc) with support for all features. I've written docs[1] about it as well as made videos[2]. Open to any feedback you have on how we could make this more clear!

[1]: https://nextjs.org/docs/deployment#self-hosting

[2]: https://www.youtube.com/watch?v=Pd2tVxhFnO4


I'm not a huge fan of Next and/or vercel. I'm in particular pissed off by the shady "call home metrics" thing they intentionally make it difficult to disable.

But suggesting that you can use express instead of Next, to me, is a clear indication you have absolutely no idea what Next is.


Yes, corporate-funded OSS has a way to make things work well on their paid product.

But Next.js is also just a well-built framework that won some hearts and minds because it's ...a good framework, not because of some extremely slick salespeople.

And if being a technical contrarian is interesting, you can always use it with one of Vercel's biggest competitors: https://www.netlify.com/with/nextjs/


Remix seems to be a counter-example. The jokes app tutorial [1] is as monolithic as they come.

[1]: https://remix.run/docs/en/v1/tutorials/jokes


Monolithic – yes. However, my concerns with Remix are that it intentionally doesn't align well with the premise of GraphQL – for better or worse. However, that's a very common pattern when you do need microservices.

I.e. Remix doesn't have a good story for off-boarding from a monolith – so it's inherently risky to build on top of.

GraphQL really isn't that hard, and if I'm going to build an API anyway, why not just start with a piece that's going to eliminate the endless refactoring of data-stitching when the needs of my pages (or underlying components) change?


This is dated 2019-03-13 and it says "It feels like we’re starting to pass the peak of the hype cycle of microservices" but wasn't the peak already earlier? Looking at HN posts - https://hn.algolia.com/?q=microservices - it looks like a lot of the high voted posts against microservices were 4-7 years ago.

Highest scoring one (2018): https://news.ycombinator.com/item?id=17499137


From what I've seen over the years HN is on the leading edge of trends. If one really monitors the bleeding edge it doesn't feel that way. But people tend to forget the very very very long tail behind the curve.

We went through a hiring surge a year or so ago and I was astounded how many candidates had never used any cloud, GitHub, containers, micro services, modern tooling or language versions, etc. These are all things I assumed everyone had been using for several years now. Or if not, they'd already tried it and decided to move elsewhere. It was eye opening.

Their on-premises monolith skills will come in handy at the bleeding-edge companies in a few years.


The HN hype thermometer I think is still a good signal, like here at detecting when fads are over the hill.


I remember a talk a few years ago where the CTO of a local startup was gushing about microservices and how productive they made his team. Sounded like a tech evangelist who had drunk his own Kool-Aid.

Then one of the "questions" in the Q&A was a pretty aggressive attack based on how unproductive the skeptic's company had been to date and how they were on the verge of failing.

The speaker asked how many development teams they had and what was the size of their DevOps/Tooling team(s). When the skeptic admitted they only had a few developers, the speaker recommended they IMMEDIATELY pivot to Django/Rails/Node.js/.Net; "whatever you are most comfortable with". And then said "Why are you still standing there? You need to pivot tomorrow morning."

I think of those two questions every time I read or consider microservices. "How many teams. How big is your DevOps/Tooling team."


Why do we organize by shape, and not by function, in our monoliths? We put all our controllers in a package, all our DB accessors in another package, all our "data objects" in another package. Even in OO languages like Java, we have package accessors that never get used, because we lay out our code such that package scope can never be used. We instead lay out the code according to "shape".

When figuring out what to spin-off as a microservice, we do this by functionality. Why not just make it easier in the monolith and organize by functionality instead? Let's act like 4yr olds, not 3yr olds: https://pubmed.ncbi.nlm.nih.gov/12090481/
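A quick sketch of the contrast (directory names are hypothetical): the left layout groups by "shape" and smears each feature across every package; the right groups by function, so a feature's whole vertical slice lives in one place and can later be spun off along that boundary.

```
# by shape ("3yr old"):                 # by function ("4yr old"):
app/controllers/order_controller        app/orders/controller
app/models/order                        app/orders/model
app/repositories/order_repository       app/orders/repository
app/controllers/user_controller         app/users/controller
app/models/user                         app/users/model
app/repositories/user_repository        app/users/repository
```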


The wording you are looking for is called "Vertical Slicing". After using it, I'll never go back to "shapes" anymore.


Thanks, I didn't know there was a name for it. But every single code example out there is an anti-pattern to this as far as I've seen.


One wrinkle I liked was to build a monolith, distributing the same bits everywhere, and to configure the actual services to start on an instance-by-instance basis. This simplifies distribution (e.g., "unpack this single zip file, it's got everything") and you can rapidly change what is running and where with a command-and-control system of your choice.

I can't imagine having different build products and deployment stories for every service type, nor can I imagine institutionalized version skew of more than a couple of weeks.

This probably doesn't scale to large teams, but it let a small team work pretty effectively with thousands of microservices.
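A minimal Python sketch of that "one artifact, configure services per instance" idea (service names, the env var, and the registry are all made up): every instance gets the same code, and a config value decides which services actually start.

```python
import os

# Registry of everything in the single artifact; lambdas stand in for
# real service start-up routines.
SERVICES = {
    "web":    lambda: "serving HTTP",
    "worker": lambda: "consuming queue",
    "cron":   lambda: "running schedules",
}

def start(enabled_csv):
    """Start only the services named in a comma-separated config string."""
    started = []
    for name in enabled_csv.split(","):
        name = name.strip()
        if name in SERVICES:
            started.append((name, SERVICES[name]()))
    return started

# One box might run everything, another only the worker; the command-and-
# control system just flips this value per instance.
print(start(os.environ.get("ENABLED_SERVICES", "web,worker")))
```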


You're advocating monolithic packaging (simpler), with multiple services (more flexible). Sounds great!


> Setup went from intro chem to quantum mechanics

Doesn't really seem fair. On the one hand you might need to do a bit of work so that you can say `docker-compose up` or `nomad up` or whatever, but at the same time there are plenty of issues with running binaries/databases directly on a laptop - version skew, for instance.

> So long for understanding our systems

This is fundamental to all asynchronous systems. You don't have backtraces anymore. If your service has concurrency primitives you probably already have to solve this problem with tracing, microservices just give you another asynchronous primitive.

> If we can’t debug them, maybe we can test them

Bringing up your entire application is what you'd have to do in a monolith as well, I don't understand this criticism. Also, "teaching" your CI to do this is 0 additional work - it's gonna be another "docker-compose up" or whatever, generally speaking.

Our microservice codebase runs on laptops just as it runs in the cloud. It's pretty nice.

With regards to "That is probably a bit too much effort so we’re just going to test each piece in isolation" - again, this is the same thing with your monolith. You'll just do this at the module level.

This is really a "right tool for the job" situation. And that's hard for people to understand, since oftentimes you don't know what you're building upfront.


As a junior engineer (only working 6 months), the K8s side of things has been my biggest barrier for learning. On top of having to learn about software engineering practices, I’ve had to learn helm files, deployments, services etc. It was/is very overwhelming. I know I’m new and naive but it seems needlessly complicated and it’ll be another few rounds of abstraction beyond K8s before people are happy with it.


Welcome to software development. Expect it to be like this forever, and to continue getting worse. Tech seems to have five-year cycles: a problem gets on enough people's nerves over the previous five years that a new abstraction comes out to solve it, which people then adopt and develop a love/hate relationship with over the next five years, winding up either inventing or adopting the next abstraction to deal with the current frustrations.

It's not all bad. In my 8 years in the industry there has been an incredible shift in what a single person and/or small teams are capable of versus the status quo at the time. It comes at what I can only describe as a Schrodinger Cost - i.e a cost that is sometimes worth it and sometimes not.


It's important to separate out microservices and kubernetes. They aren't the same thing, and Kubernetes isn't the only way to run microservices. Tons of companies were running microservices prior to kubernetes ever launching.

The core idea is that services communicating via an API that have self-contained infrastructure is easier in a large organization, easier to scale, and allows you to make the right technical decisions for the problem.


Both styles have their places. I've done migrations of monolith to microservices and I've done microservices to monolith. In the end is about what's best for the project/client, not the purism of "I only do xxx or only yyy". If you're one of those developers then don't call yourself "senior", you're still in the mindset of a junior.


Agreed 100%. Another pain point that prevents developers, who love to think they are rational people (lol), from even starting to have a productive conversation about microservices.


Hey, the longer it takes to understand, iterate, and work on a particular architecture, the more money you're making per task completed as an engineer... so the more ridiculous the architecture with which you're working, the slower your work is, the longer it takes to get done, the longer you stay employed!


In my experience, the project to break apart a monolith often happens without a clear definition of what problems we're trying to solve by breaking apart the monolith. Which ends up creating a raft of new problems, plus the old problems, and a bunch of sticky leftovers that are perpetually "going away soon".

And since you have no clear definition of why, and what outcomes you expect, you also get massive scope creep in the middle of all this. Then you run into all the things no one planned for, because there was no plan: like, how do we serve business functions such as BI from our 30 new microservice databases?


My theory: if you can easily swap local-procedure-calls with remote-procedure-calls, then it's a well-designed system, monolith or not.
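One way to read that litmus test, as a Python sketch (class names and the HTTP endpoint are illustrative): callers depend on an interface, so an in-process implementation and an over-the-network one are interchangeable. This only demonstrates the seam; it deliberately ignores the latency and failure modes RPC introduces.

```python
from abc import ABC, abstractmethod

class InventoryClient(ABC):
    @abstractmethod
    def stock_level(self, sku: str) -> int: ...

class LocalInventory(InventoryClient):
    """Local-procedure-call implementation: an in-process lookup."""
    def __init__(self, table):
        self.table = table
    def stock_level(self, sku):
        return self.table.get(sku, 0)

class RemoteInventory(InventoryClient):
    """Remote implementation: same contract, served over HTTP."""
    def __init__(self, base_url):
        self.base_url = base_url
    def stock_level(self, sku):
        import json, urllib.request
        with urllib.request.urlopen(f"{self.base_url}/stock/{sku}") as r:
            return json.load(r)["level"]

def can_ship(inv: InventoryClient, sku: str) -> bool:
    # Caller is oblivious to whether the call crosses a network.
    return inv.stock_level(sku) > 0

print(can_ship(LocalInventory({"sku-1": 3}), "sku-1"))
```

If swapping `LocalInventory` for `RemoteInventory` requires rewriting callers, the boundaries were never clean in the first place.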


the dream of transparently/automagically swapping local calls for RPC is decades long now

but you can't. Once you go RPC you hit the CAP theorem and need to deal with the reality of running a distributed system. Erlang is the only mainstream-ish production ready language I know of that attempts to solve this. And it is still very much caveat emptor. This is also where you might also bring up nondeterministic latency characteristics.


This is something I would agree with. If you can keep the boundaries in place but work in the same process space you've managed to avoid the mess often associated with Monoliths and the overhead associated with Micro services.


If you're dealing with any complex architecture, there will be portions of your service that will be hit more than others.

I'm currently building a backend which will need real-time capabilities and also standard RESTful HTTP services.

Separating the real time service which will need a significant amount of performance more than the restful services will help me better scale.

Furthermore, the entire backend is written in Python, because that is what I'm currently capable of at the moment, but in the future, migrating the real-time service to Go will be heavily favorable; separating it into its own service allows that rewrite to happen with ease.

Now, there are many cases where building microservices is overkill, but it isn't the one-size-fits-all question the author suggests, and I think we should all be tired of hearing this-or-that type articles.


I've had great success with having a monolithic code base with multiple entry points. Each entry point is sort of a micro service (or just service), but it can access the same db as the other services (if it makes sense), use the same types, and crucially, it can easily be integration tested with the other entry points. With full debug support.

Such a "monolith" need not be the only one in the company. One per high level module or team works well.

I guess my point is, it doesn't have to be either giant monolith or tiny micro services in separate repos. There's everything in between as well.
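A condensed Python sketch of the "one codebase, multiple entry points" shape described above (all names hypothetical): both entry points import the same domain module and share the same storage and types, so an integration test can drive them together in one process, with full debug support.

```python
# --- shared domain module (same types, same storage for both services) ---
DB = {}

def save_order(order_id, total):
    DB[order_id] = total
    return order_id

# --- entry point 1: the order-intake service ---
def api_create_order(payload):
    return {"id": save_order(payload["id"], payload["total"])}

# --- entry point 2: the billing service ---
# No HTTP hop needed: it calls into the same codebase directly.
def billing_total(order_id):
    return DB[order_id]

# In-process "integration test" across both entry points:
oid = api_create_order({"id": "o1", "total": 42})["id"]
print(billing_total(oid))
```

In production each entry point would be deployed as its own process; in development and tests they are just functions in one program.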


Microservices are a great idea, if you don't religiously try to make everything a microservice.

I feel more inefficiencies were caused by forcing a microservice solution to every problem; than by big monoliths.


Conway's Law does not demand Microservices, it describes abstraction barriers. Forcing conflicts to be resolved in VCS as opposed to at runtime is a feature of monoliths, not a bug.


IMO you should never create a new service unless it serves an engineering reason instead of an organizational one. There are a lot of tools to help out monoliths, and a lot of ways to make it easier to shard the monolith as well.

Some services need it, like a backend intake platform of some sort that needs to have radically different performance characteristics than a user facing frontend. But for most services it just does not make a lot of sense to do this.


It also depends on what kind of monolith we're talking about.

We've found that "moduliths" (modular monoliths split into clearly defined bounded contexts with public APIs) work as well as microservices for scaling development: each team is responsible for their own module, there are very few conflicts, there's no spaghetti because we have architectural reviews whenever a module wishes to cross the "module barrier" and call into another module etc. (i.e. introducing a new dependency). You can spin up as many modulith instances as you wish as well.

The problem is that our modulith is written in PHP using a very popular enterprisey framework. PHP is based on the paradigm of spinning up a new process per request (php-fpm can recycle them but still), so every request ends up reinitializing the whole framework every time: its entire dependency injection tree. Every new module increases response times linearly, it doesn't scale. Another issue is that the single DB (common for monoliths) becomes the bottleneck, as all modules/contexts go through it.

Our PHP modulith is very costly in terms of runtime. A similar request into a microservice is usually 20-50 times faster because it's written in Go and manages its own DB. I think if our monolith was written in Go or Java from the very beginning we would have less impressive results after switching to microservices. Stuff rewritten from scratch is also usually faster than tons of old accumulated cruft.

Deployment/compilation is much faster now, the old monolith also used to have a lot of JS/CSS processing, PHP linters during build etc. so a tiny change to a module would trigger full recompilation of all modules running for 30-40 minutes. Each microservice is a separate deployment however, so a change to it only takes 1-2 minutes to deploy/release.

My point is that when people are talking about monoliths vs microservices they are often comparing dinosaurs written 10-15 years ago (PHP, old frameworks with bad design decisions, tons of accumulated spaghetti) to modern, more lightweight languages/tooling (for example, Go, k8s etc)

I think a "modern modulith" has its right to exist and is a viable competitor to microservices, provided they use more lightweight frameworks/tools, use paradigms such as modules and CQRS, and if somehow they allow smart, incremental deployments.


I want to know how many people who are using "microservices" are actually using "microservices" with separate databases for each service, separate teams and so on...?

If you asked my boss, we're using microservices. But really we're just taking common tasks and breaking them out to their own service. Now that's kinda like microservices, and it is very handy ... but it is not the full definition that I know of.


A decoupled SPA+BFFE is usually, and incorrectly, called a "microservices architecture".


These days, I try not to prematurely optimize by setting up micro services from day 1. I find that starting with a monolith, with an eye towards micro services, works well for most projects; as patterns and abstractions emerge, slowly design and provision micro services.


The pros use an RPC when a program that needs the timing of one SKU needs something from a program that needs a different SKU. Otherwise, you don’t do the RPC.

Micro service or monolith? Hmm, it’s like asking for an F1 car or a tank. If it fucking matters, you know which one you need.


Discussed at the time:

Give Me Back My Monolith - https://news.ycombinator.com/item?id=19382765 - March 2019 (411 comments)


The lack of a stacktrace alone should be a hard blocker to microservice migration.

After that, the amount of cpu (i.e. dollars) and wall time wasted on encode-decode.

Anyone who does a microservice migration is not accounting for the above two costs.


Or maybe they are, and they still think it's worth it? Stack traces are not an issue I've faced; good tracing helps a lot here. Encode-decode is a minor issue compared to the problems they are often trying to solve (which may not always be technical, especially at larger orgs).


Right tool for the job should have been the message all along.


Engineers don't want to admit it, but microservices are a form of busywork. What used to be a few lines of native API code now requires: RPC API boilerplate, CRUD code, a build, a deployment, CI, dependency management, etc.

Not to mention the additional complexity – now every "service" needs an LB.

You've converted a simple 1 person job into multiple days for many people.

microservices are a scam.


At a company I work at currently, they have microservices, but they moved the monolith into the side-car. Best of both worlds.


This is about scale. Don't make things a certain way because someone tells you to; collect metrics and use science to figure out what is best for you and your team today. The engineers building and using the things have to be reasonably happy with their tools, or you get teenage behaviour.


To be fair, most micro service setups and tutorials have been overly complicated; however, let’s agree that distributed workloads and architecture are superior in a number of ways.

Generally, when people discuss going back to the “monolith” they just haven’t found the right distributed architecture.


Big one for me is logging, auditing, etc.. is always an afterthought. I am guilty of this as well, but really it should be the first thing you do, and do very well, in a way that is really convenient for everyone, before you think about anything distributed.


If you have a simple CRUD system, a monolith is likely preferable. If your business domain has a lot of complexity, which you can discover through Event Storming, then breaking up the monolith will provide clear development criteria and much simpler maintenance.


No

As someone who has spent the last 3 years of their life maintaining and scaling one, this is madness.


Then boring technologies are not that bad after all. IMHO, in the end it's about facing changes quickly, and if you have the tools and systems to do that, then good!


This is a bullshit article, sorry for the bad words, but it's hard for me to keep using a monolith for any reason. The monolith is the way to hell. It kills developer productivity and team velocity by a wide margin.


If you use the wrong tools, everything looks like a disaster.

Microservices on AWS with lambda+dynamo+aurora+api-gateway work very well. There's built-in transparency. There is logging. You can set everything up with terraform.

Terraform makes the setup trivial. Compared to monoliths that I've seen that involve a lot of brittle manual steps, it's not even a competition. AWS Lambda with xray and other logging tools makes tracking down errors trivial. I have yet to see a monolith with anything comparable.

"Oh but I get a stack trace in my monolith" is false advertisement. How useful is that stack trace when the stack is corrupted? Or when memory is corrupted because one line in another part of the monolith has an error that slowly screwed up some datastructure in another part of the monolith? I'll take Lambdas that are all short, totally isolated, and easy to understand, any day. Debugging and understanding is much harder with a monolith.

And yes. To test, you need to bring up the entire working application. Just like you need to bring up the monolith. Oh? You mean most people who test monoliths don't bring them up, they just test some mocked version of some module in isolation? Well, they're probably testing the testing framework itself more than the monolith. With localstack you can bring up the entire AWS setup locally, automatically, and run component and end-to-end tests. It's far more testable than a monolith. And it's far more obvious when an interaction is not tested.

Monoliths are dead. Stop writing them. And start learning modern tooling.


The way you use functions on Amazon sounds like it could be a monolithic development style to me. Independent functions is still a monolith if they use the same database (IMO). It's micro-services if you start to use different databases and have async state between them, which is orthogonal to what you bring up here.

What you bring up here is a discussion of execution environments for your code -- similar to a discussion of the best OS or programming language -- not whether the code is monolith or micro-service.

50 Amazon Lambda functions that all use the same database / data stores and understand the same data directly, without going through APIs, are definitely a monolith in my book.

PS: If you first allow for the possibility of a completely odd event like a corrupted stack trace (what are you programming in, pure C?) then to be fair I think you have to allow for the possibility of a bug in Lambda leaking state across invocations too.


If anything, it's great for the resume



