Split Your Overwhelmed Teams

jillesvangurp · on Nov 15, 2022

It's good, sound advice. Big teams are inefficient teams. They are harder to manage. They waste more time in meetings. And they lack focus. The easiest way to fix broken teams is to remove people from them. You don't have to fire them. Just put them in other teams or create some new teams. This does wonders for cutting down on the stress levels, heated debates, and other wasted energy that plague overstretched teams with too many captains on the ship.

I would not even wait for problems to emerge. Just split big teams on principle. Minimum size of 3. Maximum size of 7. One eight-person team then becomes two teams of 3, 4, or 5 people. Make sure each team has a tech lead who knows what they are doing.
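To make the arithmetic concrete, here is a minimal sketch that lists the two-way splits allowed by those limits. The function name `two_way_splits` and its defaults are illustrative assumptions that simply encode the 3-to-7 rule above.

```python
def two_way_splits(team_size: int, min_size: int = 3, max_size: int = 7) -> list[tuple[int, int]]:
    """List every way to split a team into two sub-teams within the size limits.

    The 3..7 defaults mirror the rule of thumb in the comment; they are not a
    universal constant.
    """
    splits = []
    # Only iterate up to half the team so (3, 5) and (5, 3) are not both listed.
    for first in range(min_size, team_size // 2 + 1):
        second = team_size - first
        if min_size <= second <= max_size:
            splits.append((first, second))
    return splits


print(two_way_splits(8))  # [(3, 5), (4, 4)]
```

A team of 5 yields no valid split under these limits, which fits the advice: it is already inside the 3-to-7 range.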

Don't over-staff teams with managers/minders/scrum masters/whatever label you slap on junior management. This causes a big problem: managers like to hoard people to inflate their importance. You deflate their team and you deflate their ego and they'll be forever whining to "fix it" with more "resources".

Simple solution: give them more than 1 team to manage and direct them to keep their teams small. Now they count teams instead of people. Or teams and people. And you can measure their success as a function of how well their teams are performing.

Then if those teams get overloaded, you have a conversation about which new teams are going to be needed and who is going to manage them. Any new team should be bootstrapped with mostly existing people: you move people around and promote them. New people start out in existing teams. This accomplishes two important things: people have a perspective of getting promoted sideways and new people learn the ropes in a small well functioning team rather than being dumped in an overstretched team.

tetromino_ · on Nov 15, 2022

> Minimum size of 3.

3 is risky. One person has a baby and goes on leave, or gets long Covid, or is poached by a FAANG. Until a substitute can be found, the team of 3 turns into a rather overwhelmed team of 2, each spending 50% time on call. And if either of them goes on vacation to recharge, the remaining one is left with an unbearable workload.
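The arithmetic behind that, as a rough sketch. It assumes the pager is split evenly among whoever is available, which is an assumption for illustration rather than how any particular rotation works. `oncall_share` is a made-up name.

```python
def oncall_share(team_size: int, unavailable: int = 0) -> float:
    """Fraction of time each available person is on call, assuming an even split."""
    available = team_size - unavailable
    if available <= 0:
        raise ValueError("nobody is left to take the pager")
    return 1 / available


print(f"{oncall_share(3):.0%}")                 # 33%
print(f"{oncall_share(3, unavailable=1):.0%}")  # 50%
print(f"{oncall_share(3, unavailable=2):.0%}")  # 100%
```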

tpankaj · on Nov 15, 2022

My team is 3 right now and 33% time oncall is still pretty bad.

ecshafer · on Nov 15, 2022

The biggest source of relief for oncall for me is that even though my "team" is 8 people, we work on larger systems. We have ~5 teams working on a single code base, so around 30 people. It's not really a microservice, but it's not really a monolith (since we have many other, and larger, codebases). On call is much more bearable when it's once every 3-4 months than once a month.

beberlei · on Nov 15, 2022

Genuine question: is the software not reliable? How often do you get paged?

We have 33-50% on call and no one complains because not a lot is happening.

hn_go_brrrrr · on Nov 18, 2022

Being on call requires being pageable and able to respond. This seriously curtails your ability to live your life. For instance, you can't decide to go hiking over the weekend on a whim, because you couldn't handle an incident, were one to occur.

martin_a · on Nov 15, 2022

That's exactly the situation with my two teams now. Make teams of 4.

dmitriid · on Nov 15, 2022

I've worked in a team of 4 (and later, of 5), and it was amazing: everything could still be communicated quickly, but you didn't feel overwhelmed if a team member was inaccessible.

marcosdumay · on Nov 15, 2022

I imagine those restrictions on team size only apply to pure development tasks.

You certainly want more than one of those teams to be able to support each piece of your software. You also want to be able to move people quickly from a team to a close one, so from the training point of view you also want them to know both codebases.

brightball · on Nov 15, 2022

I saw a good chart that explains the issues as teams grow by showing the lines of communication between team members as the size of the team grows.

If you draw a dot for each member of a team and then a line from each dot to each other dot, you can see the "lines of communication".

With 3 members, there are 3 lines of communication (loc)

4 members, 6 loc

6 members, 15 loc

10 members, 45 loc

14 members, 91 loc

Communication breaks down as more people are involved.

bombcar · on Nov 15, 2022

This is the inverse of Metcalfe's law:

https://en.wikipedia.org/wiki/Metcalfe%27s_law

As you get more and more lines of communication you end up having to break them and create a hierarchy.

alexpetralia · on Nov 15, 2022

Metcalfe's law succinctly explains why communication becomes more fraught as teams and organizations grow. Good managers minimize this quadratically rising cost.

mateo411 · on Nov 15, 2022

LOC = n(n - 1)/2

This is the same as the number of undirected edges in an n-clique (a complete graph on n vertices) with no self edges.
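A quick sketch of that formula in Python, for anyone who wants to check the numbers quoted upthread. The function name `lines_of_communication` is just an illustrative choice.

```python
def lines_of_communication(n: int) -> int:
    """Number of distinct pairs among n team members: n(n - 1)/2."""
    if n < 0:
        raise ValueError("team size cannot be negative")
    # n * (n - 1) is always even, so integer division is exact.
    return n * (n - 1) // 2


# Team sizes mentioned in the thread.
for size in (3, 4, 6, 10, 14):
    print(f"{size} members, {lines_of_communication(size)} loc")
```

Going from 7 to 14 people doubles the headcount but more than quadruples the pairs (21 to 91).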

nine_k · on Nov 15, 2022

+100.

When during a job interview they ask me what was the size of my team, I invariably say something between 2 and 5, and comment that a larger team cannot be efficient. At least, in my practice a larger team was never efficient as a team, and I helped split teams as they grew past this threshold.

froh · on Nov 15, 2022

Have you worked in setups like this? What was the tricky thing to get right?

hbrn · on Nov 15, 2022

> team decided to have a shared on-call rotation. They would cross-train. Each team makes a procedure book that covers any first-on-the-scene tasks for most alerts and issues.

Shared on-call is a section I strongly disagree with. Shared on-call erodes the very same boundaries that provided value in the first place. You could say it's a lesser and necessary evil, but you should at least be open about it.

Procedure book is a naive "sounds good, doesn't work" solution. Why don't you "solve" the on-call problem completely by outsourcing it to a $15/hour worker if you have these amazing playbooks?

This approach naturally leads to bloated playbooks. It's very similar to trying to solve all of the architectural issues by "writing good documentation". Never works.

On-call is typically a company-wide policy, and those are at odds with smaller independent teams.

The right answer is giving those teams more independence: they figure out how to solve their on-call themselves. Maybe playbooks are the answer for them. Or they have people who are still online after work hours and they don't mind being semi-oncall. Or maybe all their issues are not critical and 24/7 on-call is not necessary.

Smaller teams benefit from independence, don't ruin it via company-wide policies.

thom · on Nov 15, 2022

I think Team Topologies, which the article mentions, is a pretty foundational book. Not just that it's full of wisdom, but it's so useful to have the vocabulary to wrangle with these problems and their potential solutions. And as a bonus they even wrote a small followup about remote work. Both are highly recommended reading if you work in an organisation with any number of teams greater than 1.

langsoul-com · on Nov 15, 2022

Can confirm I felt the same when working in IT with far too many responsibilities. Just constantly felt dumb and like I knew nothing.

It's like the s shaped progression, but it constantly reset as I switched to the next thing.

Compared to just doing 1-2 things. Was always amazed at the cool new things I learnt and implemented.

YesThatTom2 · on Nov 15, 2022

Author here. Thanks for the positive feedback so far. AMA

probably_wrong · on Nov 15, 2022

So, did it work? The article doesn't mention whether the new teams are happier and/or more productive.

hallihax · on Nov 15, 2022

Very timely post for me - as this is exactly what I'm trying to achieve at my own place :P I've forwarded it to several people to help bolster my argument, so thanks!

YesThatTom2 · on Nov 15, 2022

Glad you found it useful! I highly recommend the Team Topologies book cited in the article. It's a bit dry at first, but the solution leverages everything you learn in the earlier chapters in a jujitsu-like "use the enemy's power against itself" way. (Hope that makes sense)

solarbrew2 · on Nov 15, 2022

Passing along considerations as in practice I’ve found there is no free lunch with splitting teams:

- Quantify and measure the overwhelmedness and productivity of the teams.

- Time spent on where you’d like to get to within the next 1-3-5 years with the teams and the services they vend out is time well spent.

- Evangelize the re-organizational plan to stakeholders and leadership and hopefully they are onboard and can provide air cover while teams lose velocity during the change.

- There will be bumps along the way but coming back to being able to quantify and re-measure overwhelmedness and productivity of the teams will help define success or not.

Archelaos · on Nov 15, 2022

Isn't the ideal team size just one? I do not mean that the one-person "teams" should not meet from time to time to agree on some common issues. But the ideal situation is that the work should be split in such a way that a single person has complete responsibility over an area and does not need to coordinate with someone else most of the time.

nine_k · on Nov 15, 2022

Sadly, no.

A team of one has zero coordination overhead.

A team of one also cannot have multiple experts gathering to solve a particular problem, where their combined set of expert knowledge matches the problem best. The one person must be a jack of all trades, and necessarily a master of only a few, due to humans' limited lifespan.

(I'd say it's a bit like a multi-threaded program: very often one thread is not enough, but only a few threads can do varied but coordinated tasks. Massive multiprocessing only works when coordination between peer threads is not required, and usually they do the same embarrassingly parallelizable thing anyway.)

tetha · on Nov 15, 2022

Additionally, a team of one has no redundancy. You know, how Bob is responsible for maintaining the business critical databases, and now the databases are on fire and Bob is on a canoe tour through Canada without a phone. Oops.

For business critical things, you generally want 3 guys who can replace each other competently. Three, because one is none if the one guy gets sick, and two is also one if the other is on vacation and thus none.

toast0 · on Nov 15, 2022

> Three, because one is none if the one guy gets sick, and two is also one if the other is on vacation and thus none.

Proof by induction that it's not worth having anybody work on anything. Three is two if one is on jury duty, and two is none as shown.

tetha · on Nov 16, 2022

You jest, but we've had weeks during which - out of 6 people - one was on vacation, two were sick, and then two more had to call in child-sick because their respective day cares had to close due to positive covid tests. And suddenly you're last man standing between the outage gremlins and customer systems.

js8 · on Nov 15, 2022

While one person per team is ideal, two are better because they can discuss ideas and improve through arguments. If their views become too extreme, having a third person is better to moderate the views. Fourth person is needed so that one person wouldn't be a pariah holding minority view, and you want a fifth guy to make sure in 50/50 split, a decision gets made.

Five people need a lead that keeps track of everything, maintains the goal and resolves conflicts. Of course, two leads are always better than one because they can discuss ideas..

mejutoco · on Nov 15, 2022

Hegel's "On Teams" could be the title of a work exploring these ideas :)

WJW · on Nov 15, 2022

One person teams are only best when it comes to coordination overhead. They have obvious deficiencies in terms of "bus factor". If the one person in the team leaves the company, gets ill, dies or takes a vacation, work will have to be transferred to someone else (in which case the coordination overhead returns) or the work has to stop (which is not always feasible).

A lot of the inefficiencies of bigger companies come from the redundant people needed to fill out the bus factor. Having a DBA on hand for emergencies might be overkill for a startup, but when you have revenue of 100 million per day depending on the database always being up it is much cheaper to have a spare person sitting around (even if they're not doing much most of the time).

varispeed · on Nov 15, 2022

One person "teams" but reviewing and approving each others' work and then switching every six months or so, works well.

From what I have seen, a single person can deliver more and quicker than 2-5 people, and having to seek approval from their peers keeps quality in check too.

zaphar · on Nov 15, 2022

In that case you don't have a one person team. You have a distributed team with effective communication and well designed systems that a single individual can execute on with minimal coordination. Which is the ideal, but calling them one person teams is a misnomer.

BurningFrog · on Nov 15, 2022

Only if that person knows everything and has every skill.

Also, they can't ever switch jobs.

benjamg · on Nov 15, 2022

I think you want at least two to cover people missing obvious solutions as well as provide a second pair of eyes when finding issues.

And because with two you lose those benefits whenever one person isn't around, I'd argue three is the minimum size you should aim for.

the_other · on Nov 15, 2022

That sounds utterly awful. So lonely. So much pressure. Those poor teams.

techdmn · on Nov 15, 2022

Fantastic. One problem I've seen a few times, is the desire to consolidate teams to tackle the flavor-of-the-month project. You have an infrastructure team and an applications team, but somebody up the chain has staked their promotion on getting a big infrastructure project done. How many teams was that again?

sarchertech · on Nov 15, 2022

What do you do about on call responsibilities? The smaller the team the more time spent on call per person.

mcherm · on Nov 15, 2022

The article had a whole section about that -- they shared the on-call duties across both teams (but required a high-quality playbook for each team's systems).

treis · on Nov 15, 2022

Which is something that sounds fine but, IMHO, practically speaking it's not really going to work. Most of the time you're going to end up having to page for the systems your team doesn't own for any non-trivial issue.

edabobojr · on Nov 15, 2022

When your team owns an application you have no background knowledge in, and it uses a technology that you haven't touched since you attended a training 18 months ago, I would assume that your on-call response for any non-trivial issue would still be to page someone for help.

YesThatTom2 · on Nov 15, 2022

In this case the solution (cross-coverage) worked for that organization. However I agree that more often than not, it won't work.

I didn't have the word-count to go into this (that could be the topic of an entire book... or at least a chapter of The Practice of Cloud System Administration [https://the-cloud-book.com/]).

That said, my personal rule of thumb is: 6 is the minimum for an oncall roster; 5 if there is another team doing follow-the-sun coverage. YMMV of course.

Tom

hbrn · on Nov 15, 2022

But the on-call footprint is also smaller. Ideally up to a point where most of the problems don't require immediate response, so most teams won't need a 24/7 on-call.

I think the issue is that first people introduce 24/7 on-call, then they split into teams and this policy no longer makes sense, but no one has the balls to roll it back because the optics are bad. But you should.

tommek4077 · on Nov 15, 2022

But we need Fullstack!!!1