> Just as a sports team wins or loses together, so too should the engineering team be treated as the fundamental unit of success.
A sports team has a playbook, does your team? A sports team practices together, does your team? A sports team works as a unit, does your team?
Too many times I have seen engineering teams that are a team only on the org chart. In reality they solve tickets as individuals, with only minimal interaction through pull requests. Otherwise they might as well not even know each other. They are a team not as in basketball or football, but like golf where once you get to the tee, it's you and only you to get the ball in the hole.
> A sports team has a playbook, does your team? A sports team practices together, does your team? A sports team works as a unit, does your team?
This is why I like XP. Their teams really are teams like you say.
Though I think in many dev shops you can be team-like. Someone might like refactoring and cleanup. Someone else is good at rapid prototyping. Another, at architecture. Sometimes a great dev is just the one who can take the unglamorous tickets and get them done at a sustainable pace. Or someone who is good at devops, teaching, or morale building. Sometimes it's just communication.
Everyone has different strengths. No one would ever say "who's the best (American) football player?" because you'd have to ask "who's the best kicker, tight-end, defensive lineman". They are all different roles.
That football has the awareness that skill cannot be measured along a single dimension makes it sad and laughable that people so often reduce programming skill to one.
The military get this (usually) and place a lot of emphasis on training a team as a whole. In a tank, for example, the individual crew members train in their specific functional areas (e.g. commander, gunner, driver, loader) and when they have qualified in these, then go on to conduct team training as an integrated crew, which must be passed before the individuals can progress in their careers. In some armies, the individuals get a qualification that shows that they can work in a team context, and then can be flexed into actual operational crews as required. In others, the team is considered to be trained as a unit, and must be retrained when someone leaves / joins.
The military often go beyond team training in a way that very few other organizations do, and conduct 'collective training' that involves multiple teams. Collective training itself has multiple levels - e.g. at the lowest level, two or more tank crews working together in a tactical task (e.g. when four tanks encounter an enemy, which one should engage it?), gradually adding other functions (e.g. infantry, artillery, etc) so that all of the different tactical 'trades' have formal training in how to work together. At these higher levels, the feedback and qualifications are aimed at the units rather than the individual soldiers.
The military are also conscious of group dynamics, for example the 'forming, storming, norming' that occurs when team membership changes, and the effect of 'churn' on a team as individuals join and leave.
Not only that, but as every sports fan knows, compensation on sports teams varies tremendously, even for similar "roles". People in tech would be shocked (with reason) if that were the case in the IT world.
> but like golf where once you get to the tee, it's you and only you to get the ball in the hole
How many times has a golfer demonstrated an idea that upturned the whole field? Yet this is commonplace in engineering, because it is essentially a creative task.
One thing that wasn't covered here is that productivity is about:
> the effectiveness of productive effort
and
> the state or quality of producing something
Those are quoted from the dictionary definition of productivity, and that definition in my opinion outlines a great insight. Productivity is about the "product" first and foremost.
One thing that's often missing is that teams don't have ways to quantify how good the product itself is. Most teams will instead pivot to measuring their rate of change to the product. That doesn't mean you have become more productive, because software is not like food production: more of it is not always better. Better software is instead more ingenious, more intuitive, more tailored to the problems of its users, more responsive, with fewer malfunctions, etc.
So to me, this whole business of trying to measure "productivity" while disregarding the product is incomplete. It's trying to measure developer efficiency at changing things, without caring whether the changes are for better or worse. This includes things like measuring velocity, lines of code, or tickets closed. But it also includes what this article proposes: number of meetings, time to complete code reviews, developer satisfaction with their tools, etc.
All of these try to see how quickly developers can make changes, without ever measuring whether a change produces a better product, and thus whether it actually made the team more productive at improving the product.
I'd like to hear about ways to measure software product state and quality. If we had those, you'd have an easy way to know how productive a team would be by seeing how quickly they can improve the product.
This is nearly always missed in my experience. A good product is "winning" for your engineering team; if you do not have a good product, you have lost. Would you have made a good product if your team had spent less time in the locker room (meetings)? Maybe, and more time on the field will probably increase your odds of winning, but it won't ensure it.
This always takes me to USE YOUR PRODUCT. If you don't use your product, you'll never know if it is good. My team builds internal tools for our support folks, and we also use these tools every day to answer our own questions about internal problems. Are the tools we make getting better? Just consult our "How long does it take to answer X?" KPI. We have a set of known common problems; if our support folks are spending less time figuring out the answers to these gold standards, our product is improving. If they're spending longer, we've regressed and we need to change something.
I'll grant that making internal tools gives us a lot of gimmes: we're not tasked with taking advantage of users, and we can spend time training users intensively on new releases. But the underlying principle is the same: a chef will never know if a recipe is good without tasting it himself, and you'll never know if your team is successful if you don't know whether the product you've created is good.
> This always takes me to USE YOUR PRODUCT. If you don't use your product, you'll never know if it is good.
This is a bit absolutist. Sure, use your product when it makes sense like in your case but many people here build things for user groups they don't belong to (constantly).
Besides: are you hitting all the same use cases your users are and at the same rate?
This advice always sounds nice, but it's impossible for lots of people to apply. More generally applicable advice is to be acutely aware of your users' real(!) experiences. This can be by using the product, or by making sure the team sees and hears raw user feedback, preferably combined with data that helps prioritize.
There's nothing like hearing the frustration in someone's voice when they are trying to accomplish something and _your product_ is holding them back. (well, except for experiencing it yourself and then we've circled back ;) )
> many people here build things for user groups they don't belong to (constantly).
A root of the problem. If you have no appreciation for the problem or desire to solve it well, you won't. There are very few pieces of software written for users that you can't figure out a way to use. If you can't use it, observe users; if you can't observe, ask users; if you can't ask, instrument their use. To your absolutist point: yes, it is not strictly an absolute; it is an encouragement to push as far along the empathy spectrum as you can.
Yes! There's a strong tendency toward measuring unvalidated proxies because they are easy or convenient to measure, without even constructing a hypothesis about how these proxies relate to more important things, much less testing it.
Measuring productivity has to start with what matters: do your end-users (these are not necessarily your customers) like what you do? And often it is hard to get this information. Just asking them will lead to biased responses, but there are ways to deal with that.
I happen to be in a business (let's call it food) where I can observe the end user using more of their money with our customers rather than their competitors when we do things right. That's a strong signal -- probably stronger than asking them -- so a very fortunate starting point. (Of course, there are still confounding terms like seasonality, economy etc to grapple with, but there are ways to deal with that too.)
Starting with that one measurement that matters, one can begin exploring proxies. Set up a hypothesis: "Velocity would be easier to measure and the number would be available faster. Does it correlate with our one good metric?" And then you run the experiment. It might take weeks or months to get back the end-user-happiness data that corresponds to this week's velocity, so these tests are expensive (but they pay off many times over when you find good proxies).
In the end, you should be able to construct a somewhat sensible model of user happiness, and answer questions like, "if we hire another team member and therefore increase velocity by 3 %, how much happier will our users be? And what is that worth in sales?"
When you can convert everything to the same unit of measurement (dollars are an easily explained option, but log-dollars are a personal favourite of mine) you get intense clarity and alignment around priorities and decisions.
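To make that proxy-validation experiment concrete, here is a minimal sketch in Python. All the numbers are invented, and `pearsonr` is just one reasonable choice of correlation test, not the only way to run this check:

```python
# A toy proxy-validation check: does weekly velocity correlate with the
# (much slower to arrive) end-user happiness measure for the same weeks?
# All numbers below are invented for illustration.
from scipy.stats import pearsonr

velocity = [21, 34, 28, 40, 25, 38, 31, 45, 29, 36]             # points/week
happiness = [3.1, 3.8, 3.4, 4.2, 3.2, 4.0, 3.6, 4.5, 3.3, 3.9]  # survey score

r, p = pearsonr(velocity, happiness)
print(f"correlation r={r:.2f}, p-value={p:.3f}")

# Only adopt velocity as a proxy if the correlation is strong and holds
# up out of sample; otherwise keep paying for the slow, direct metric.
```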
----
All of that said, development speed, as defined in the Accelerate study in particular, is one of those generally good things you pretty much unconditionally want. The reason is given in the study and expanded on further in Reinertsen's Principles of Product Development.
The reason speed is important is that successful product development is controlled by surprises. You will discover something tomorrow that will make you wish you had prioritised differently today, and being able to pivot quickly on surprises is both how you de-fang the biggest risks and how you throw yourself at opportunities before your competition even realises there is one.
> Engineers will become more focused and engaged, managers will become more effective and empathetic, and companies will build faster with higher quality. Engineering will rise to a whole new level.
A bit too much hyperbole for my taste, given the less than groundbreaking ideas.
I mostly agree - this doesn't seem to be adding any real visibility that velocity tracking (a la agile) wouldn't give you already (not that I'm advocating for agile, mind you...).
Consider -
I have two teams, with the same staffing levels and the same general seniority. For this example, let's assume each is a team of 5, with 1 tech lead, 2 seniors, and 2 juniors.
Both teams have approximately the same meeting count, and both work on the same stack with the same dev tools.
Team A consistently releases new features faster than team B. Why?
Because if the answer is "Find the blocker" aren't we right back at
> "your engineering leaders will simply justify failures, telling stories like "The customer didn't give us the right requirements" or "We were surprised by unexpected vacations.""
except with blockers this time?
Maybe Team A is actually just better than Team B.
Maybe Team B is actually working on a feature set that has more inherent complexity.
Maybe Team A releases faster but also has more incidents in prod.
Maybe Team B releases a larger changeset on average.
None of this is getting addressed or answered.
----
None of that is to say that measuring blockers isn't a useful idea, but it's certainly not some silver bullet.
Those blockers should be things that give you falsifiable stories, no?
So if someone says "it's because we're blocked on [slow delivery of designs from another team]" and you measure that specifically, and then improve it, and you notice the team's output hasn't changed, you've learned something.
I've certainly seen those reasons before, but I haven't seen people turn them into specifically measured things, as opposed to "ok, let's see if we can improve it" with often little or ineffective follow-up.
It helps to instrument the journey of your work from Jira all the way through build, deployment, run, and monitoring (observation).
From that you can get measurements on how long each stage takes and the duration of each transition.
From there you can compare teams A and B. The transition times are where the human time cost usually sits.
Just capturing the time from when a Jira ticket or feature is raised, to when it is picked up, to the first commit, to the first test and the final build, already gives you valuable insight (see the sketch below).
The points you raised towards the end can be answered if observability of your CI/CD pipeline is actually in place; at the very least it gives you a place to start a line of inquiry.
Naturally the blockers will be aggregated into some of these values, but as you work through the journey they will start clustering at certain stages, maybe highlighting a significant problem that needs to be addressed.
There's a wealth of data being left on the table that can help inform management decisions.
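As a rough illustration of that instrumentation, here is a minimal Python sketch. The event names and timestamps are hypothetical; in practice they would come from the Jira API, your git host, and your CI/CD system:

```python
# One ticket's journey through the pipeline, as hypothetical timestamps.
from datetime import datetime

events = {
    "raised":       datetime(2021, 6, 1, 9, 0),
    "picked_up":    datetime(2021, 6, 3, 14, 0),
    "first_commit": datetime(2021, 6, 4, 11, 0),
    "first_test":   datetime(2021, 6, 4, 16, 0),
    "deployed":     datetime(2021, 6, 7, 10, 0),
}

stages = list(events)
for prev, cur in zip(stages, stages[1:]):
    hours = (events[cur] - events[prev]).total_seconds() / 3600
    print(f"{prev} -> {cur}: {hours:.1f} h")

# Aggregated across many tickets, these transition times show where the
# waiting (usually the human time cost) clusters for team A versus team B.
```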
Measuring engineering can be a slippery slope towards stack ranking performance -- which ultimately hurts performance and culture.
I think you have to measure with the intent to improve how your team works. If a manager can measure at the team level and open up visibility into the development process, they can hopefully find where things get frustrating (ex. waiting for someone to review a PR).
That said, there seem to be more mature tools out there than OKAY's beta. There's a discord server called dev interrupted that talks about this stuff a lot.
I'm currently working on a solution that tries to quantify software development health, and what I've learned from analyzing thousands of popular open source projects is that you don't want a single individual to stand out. You want work to be fairly distributed, to reduce knowledge risk.
If you look at the busfactor section for the vscode and gitlab repositories
You'll find they both have a large cluster of developers in zone 2. For developers to exist in zone 2, they have to have medium to high impact on the code that they worked on, but not clash with others. If you look at the vuejs-next repository
You can see it's actually a pretty fragile project, since Evan is responsible for pretty much everything.
Based on what I've observed by studying successful open source projects, you actually want to discourage "very high impact" employees, since they introduce knowledge risk.
Edit: The metrics I'm showing are limited to the last 90 days, for TypeScript, JavaScript and CSS code.
> Based on what I've observed by studying successful open source projects, you actually want to discourage "very high impact" employees, since they introduce knowledge risk.
This is a dangerous conclusion, especially for a business. If you're in pure maintenance mode, maybe... but otherwise... You want people who can pitch in anywhere, who can fix things rapidly, who can build new solutions quickly when required, and who know your business inside and out.
You just don't want knowledge siloed there, so you want to make sure other people are also on the path to being expert on the various areas.
Context obviously matters, and I think it is important to understand what I mean by "Very High" and "High" impact employees. High impact employees can still do everything by my definition; it's just that they aren't doing everything. Obviously some businesses/projects do not have the luxury of attracting lots of talented employees, so "Very High" impact employees are inevitable.
If you look at the busfactor stats in the deep dive section
The number of files changed by only one author is 419 (about 25% of all files changed in the 90-day window). So 75% of the files changed in that window have two or more contributors, and I therefore think those working on the vscode project aren't being siloed (based on my quick observations).
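For anyone who wants to approximate this single-author measurement on their own repository, here is a rough sketch using plain `git log`. It is a simplification (keyed on author email, renames ignored), not the parent's actual methodology:

```python
# Rough single-author file count over a 90-day window.
# Run inside a git checkout.
import subprocess
from collections import defaultdict

log = subprocess.run(
    ["git", "log", "--since=90 days ago", "--name-only",
     "--pretty=format:@%ae"],
    capture_output=True, text=True, check=True,
).stdout

authors_by_file = defaultdict(set)
author = None
for line in log.splitlines():
    if line.startswith("@"):          # commit header: author email
        author = line[1:]
    elif line.strip():                # file path touched by that commit
        authors_by_file[line].add(author)

solo = sum(1 for a in authors_by_file.values() if len(a) == 1)
total = len(authors_by_file)
print(f"{solo}/{total} files ({solo / total:.0%}) touched by one author")
```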
> Based on what I've observed by studying successful open source projects, you actually want to discourage "very high impact" employees, since they introduce knowledge risk.
If you're trying to sell this concept to developers, you might want to change "discourage" to "hire more than one"
Ideally you would like to hire more than one, but I think my use of the word "discourage" was not the right one. What I ultimately wanted to say was: you want a balanced team, where no individual development pattern sticks out.
A very high impact employee could easily be the result of having multiple poor employees, improper planning, and so forth.
What's missing from your assessment is how good the project itself is.
Maybe Evan is a liability, but he might also be the reason why VueJS is popular and valued in the first place.
It also seems pretty strange to me to count VSCode and GitLab in there, because those are worked on by companies with teams behind them that get paid and which will have constant churn.
If you took VueJS and had a team at Microsoft take it over from Evan, it too would live on. So I don't know that your explanation for the "risk" here has anything to do with "very high impact"; it seems to have more to do with the difference between a project maintained by a community of backers in people's free time and a project maintained by a company that hires developers to work on it full time.
> So I don't know that your explanation for the "risk" here has anything to do with "very high impact"
When talking about risk, I mean immediate as opposed to long term. Obviously, if you pay people to take over vuejs, they can, provided they are qualified that is. But there is still the ramp up period and the risk of losing undocumented knowledge that Evan has, specifically with his understanding of "what doesn't work".
> It also seems pretty strange to me to count VSCode and GitLab in there, because those are worked on by companies with teams behind them that get paid and which will have constant churn.
The vscode and gitlab projects are included because they provide good data points for very fast moving projects, which can be used to help understand closed source software development. Also, the development pattern behind vuejs certainly exists in the closed source world as well.
I think my criticism is that I'm not really seeing where in your data the risk, quality, or success of the project is quantified and correlated with this "very high impact" conclusion.
It seems you're inferring this "risk" from a qualitative perspective. Like hypothetically, we can imagine VueJS being more at risk of being abandoned because there's only one big maintainer. But VueJS hasn't been abandoned, and is doing well. So does the data support this hypothetical? And I'm also not sure how abandonment relates to productivity. If GitLab the company folds, that project will probably stop being developed and be abandoned, same as if Evan stops working on VueJS. Which one is more likely? No one knows, but companies can abandon an open source product probably just as much as community contributors. So I feel it's more about which one is more likely to be picked up after the current maintainers abandon it.
> It seems you're inferring this "risk" from a qualitative perspective
Yes this is correct, and I do want to make it clear that a project that is spearheaded by a single person like vuejs can still be a great product. And in no way am I trying to quantify "quality". I'm just saying, if you want to build a commercial/open source solution that uses vuejs vs react, angular, etc. this is the current state of its development/investment.
For some people/companies, this isn't an issue, because they feel the quality is worth the risk of having to take ownership of it themselves.
I guess the way you should think about my statement is: "Would you, as a CTO for a company, be okay with having internal projects with development patterns like vuejs, or would you try to create development patterns more like gitlab and vscode?"
The best teams appear to be doing nothing. If, as a general rule, things are always in a fire drill, then your software sucks (there are other reasons for this too). We always measure how many things are fixed and how many new features are added, but rarely how much time is spent babysitting machines, etc.
The best way to motivate devs to build truly robust systems is to let them go home at 2pm when things are finished and production process is running smooth.
However, most organizations would pile more work onto devs if they finished early, so devs compensate by taking more time on their current tasks. Why finish them early when you'll be thrown another one right away?
The best way to motivate any employee is to give them a vested interest in the company (ownership, equity, profit share, etc).
Prod can be running as smoothly as you like - it doesn't matter if your company is having its lunch eaten by competitors adding or improving features faster than you.
You should be incentivizing employees to step back and look at the big picture every now and then. If customers are happy, the company is profitable, and prod is running smoothly? Go take some time off, head out early, etc.
Is revenue down? Company growth slowing, or worse, is the company shrinking? Time to work. Prod being red or green has little to do with it.
That kind of works in theory, or for people up in the decision-making ranks. For engineers down in the trenches it might not be such a motivating factor, because they don't feel their individual contributions matter as much. It might motivate them to stay and slack off without getting fired until the company does an exit. So one might suggest that giving devs equity would motivate them not to their maximum potential, but rather to the very minimum that will not get them fired before the cash day.
Also, I'm sure we could come up with examples of people who got a good amount of equity just because they were among the first employees and then stopped working at all because, well, their profit was already locked in by just waiting for others to do the hard work.
Finish what? Where do you draw the line of “work for today”? I don’t think motivated devs want to go home at 2pm.
Speaking as a dev, the best way to motivate me is to give me some head cracking challenge, trust (no reporting) and autonomy (don’t tell me how to do my work).
> I don’t think motivated devs want to go home at 2pm.
That's the beauty of it. Motivated engineers will use the spare time after 2pm to study and improve things. Or run some errands if they have to, without feeling their butt has to be glued to the chair until 6pm.
I've solved many problems at work in my spare time, or when taking a dump or something. It's pretty difficult to really get into deep thinking at the office, because of 1. noise and 2. looking unproductive when you actually think hard.
Paradoxically, office slackers bashing the keyboard while chatting on Facebook look more productive from a distance than somebody who is actually deeply thinking about system design while spinning in a chair or staring at the ceiling.
> I don’t think motivated devs want to go home at 2pm.
I can understand the attitude when one's in their twenties, has no significant other, kids, or other commitments.
Me? I just get tired. I'm very motivated and like what I do, but at some point, it's just silly to stay behind the screen. And sometimes, that point is even earlier than 2pm. That's ok. We aren't machines to be working like a Swiss watch.
I was most productive when my office let us take a few hours a week to exercise. I wouldn't use it if I had anything critical pending, but otherwise I would duck out an hour early several times a week to go get in a good long run (5-10km).
It felt like they had more respect for my time and well-being. When they cut it (for bullshit reasons [0]), "productivity" did not improve in the office. Instead, projects expanded to fill the time. Those 3 hours or so we used to spend at the gym became 3 hours spent producing the exact same total work. There was zero motivation to use the time "properly" to get ahead on anything. Every project continued to hit the same deadlines it was hitting before.
[0] One reason I say it was BS: we were typically on or ahead of schedule on projects. Teams that were behind weren't making use of this time while they were behind; they weren't permitted to.
They might let us go home at 2pm, but we can't stop thinking about work problems even on the weekends - that's when we get a nice juicy project to think about.
Honestly I don't think most engaged developers who care about their work need the carrot of going home early to develop more robust systems. Most developers who care about their work want to build robust systems, especially if they're responsible for handling outages.
The challenge is whether the organization prioritizes robust systems and devotes resources to making things more observable, reliable, and resilient. The team can want with all their heart to engineer fixes to common failure modes but if the decision makers are always pushing full steam ahead on new features it can be really difficult to improve the app.
Sadly, business people want to fire you when you're seemingly doing nothing (as in coming to work on time, doing your job without stress or conflict, going home on time)... Even if all the business goals are met, they just can't stomach someone not working their ass off 24/7 for them.
That's actually insightful. Wish I'd known this early in my career. Often I'd be doing a great job, with no complaints from peers and no problems with project deliveries, yet be merely 'tolerated' and not liked by any management types, since they had no influence; I just did what needed doing of my own accord. Or maybe I didn't need to know; ignorance was bliss.
Enough of that is funny 'cause it's true, or sad but true. I work at a big co, but not a full-on enterprise corp. I value my own time and work, more or less in alignment with broader initiatives. Learning new things and technical challenges make the journey interesting, and it's fun/easy to get work done with most people. Sometimes I have varying degrees of communication issues (tech vs non-tech, or grok/non-grok conceptual), but for the most part leadership has enough technical sense to make decent choices. So far, so good.
well, yeah. if you don't have air cover from a team or manager and don't appear to be engaged in shared suffering, it's no surprise people are going to ask questions, both from above and equal level. other labor will resent your non-suffering (how come /they/ get to leave on time?) and management will wonder whose reports get such a cushy setup and why.
The thing is, there might be no suffering one is "avoiding" - the business people are often sitting around drinking coffee and beer, cracking jokes, watching developers' backs.
You nailed it. The best engineers I have ever worked with never worked much. The bad engineers were always super busy trying (and failing) to fix the problems they put in the software in the first place.
But how do you get anyone up the food chain to care? I’ve never witnessed that anywhere I’ve worked. If you took things like team surveys, code review load, meeting load, etc., and tried to make a case that changes are needed, you’ll get laughed out of the room.
The only metrics that will matter are “did you get the stuff done that we wanted?” (where “stuff we wanted” is defined vaguely and the meaning shifts to suit product leaders ever changing political landscape), “did you get it done really fast?” (where “fast” is vaguely defined according to product political circumstances and never puts any weight on engineering estimates, staffing needs, or resource limits) and “did you get it done cheaply?” (where “cheap” is defined vaguely based on various internal politics and budget turf wars as well as larger company financials - and when the money is good and no one looks too hard at this, it allows sweeping other issues under the rug, no one cares if you burnt a quarter working on the wrong problems).
Assessing effectiveness, in principle, is purely a political concept that operates from the top down.
This has been true in every company I’ve worked for or knew a colleague or friend who worked there - from tiny “lean” startups to extreme cultures like Bridgewater to every mid-sized or large tech company.
> Each engineering team is unique, so its blockers will be specific. It's not so simple as "more maker time is better." If your engineering team is new or temporarily misaligned on key goals, more meeting time might be the answer. What never changes is the need for measurement and well-considered, deliberate decisions
And we've come full circle back to measuring a team's productivity being an art form. Maybe you have a metric now: hours spent in meetings per week. But nobody knows how much is too much and how little is too little. How do you measure the impact of more meetings on productivity? Or the impact of fewer meetings? If your measure of productivity is "time in meetings", this is a circular dependency.
The article is correct to point out that in case of no process measurement, politics, not results, will start to dominate.
Measuring process inputs will favor Sisyphean work. Appearing to be working, hours punched in, and butts-in-seats will be valued more than results. Many companies are stuck here.
Measuring proxies of outputs such as lines of code, or number of tickets closed, as the article mentions, only leads to people gaming the system.
Measuring the actual value delivered is obviously the best thing to do. However it is often difficult to even define value and the amount of it created. Which is always a problem especially for inhouse projects or in the absence of direct contact with the market.
I like what the article suggests - measuring process flow instead of measuring inputs or outputs. Can't say exactly why I like it but it seems that engineer types are naturally motivated to perform and learn, so giving them enough space is a good way to get working systems as a side effect.
I like it too, but I think it misses something crucial: how often is the team unblocking others? This, naturally, is at odds with minimizing interruptions.
On one extreme there is no shield around engineers and they experience constant whiplash with "code oracling", shifting priorities, obscure trivialities, and other things that can drive people insane and can prevent meaningful work from being completed.
On the other hand, stick perfectly to "we're committed to this sprint and unless it's on fire you wait in line" and much of your business becomes unnecessarily rigid and painful. That, IMHO, is much worse.
Finding a balance is crucial, and to me that's just as interesting a data point.
People want to know a specific behavior of a system. How often does X happen? What kinds of things can trigger Y? Maybe there is documentation, but even if there is, nobody trusts it. If folks could have just looked it up easily, they would have.
The only solution is to start reading the source code. Perhaps you even wrote it, once upon a time. Nonetheless, all you can do is scan the code, build a mental model, try to see if it's testable outside of production, and procure an answer.
In my experience legacy systems, especially when laden with tech debt, require an awful lot of this if you're doing anything more than just keeping the lights on.
Can you explain why you think this will happen to undersized teams more often?
If it's an indication that something is wrong with the team, I think that's still helpful. For instance, if the team does not have the right experience and are often blocked waiting for someone else's answer to a question, it might make sense to work on team composition. It's not about blaming people or teams, but about improving flow.
Often you literally don't have hiring headcount allocated to your team. Sadly the solution is to make the team implode to force the company to fund it properly.
I've just seen it quite a lot. QA, ops, etc. are often viewed as cost centers and thus underfunded. It's common to then challenge the team's "productivity" instead of looking at whether the team is undersized or underfunded.
> It's common to then challenge the team's "productivity" instead of looking at whether the team is undersized or underfunded.
This exactly. The typical management "solution" is to identify and fire "poor performers" (thereby creating fear, mistrust, and an incentive to look for other jobs) while ordering everyone else to "work harder" for longer hours (with the implied threat that you will lose your job if you don't.)
If your team is a constant blocker because you're undersized, you should be given more headcount, or some responsibilities should be shifted around. That then needs to be escalated to upper management in order to remove the blocker.
(Assuming an org that's making a genuine attempt to improve, versus just a cynical blame game. But in the latter case... you're screwed sooner or later anyway.)
The more time goes by, the more I value what I learned from Peopleware (DeMarco, Lister) 20 or so years ago. It seems so clear, precise and uncluttered compared to modern efforts.
Peopleware is a classic. Its genius lies in the fact that it shines a light on the important aspects of productivity and work life that fall between the cracks in our quest to 'measure' productivity and be 'efficient' (Taylorism).
The big issue it has, IMO, (which other DeMarco books like 'Slack' share) is that it does not provide actual case studies that would convince a typical executive / manager working in a high pressure / competitive environment to change their methods.
> If your team is full of competent, driven engineers, removing their blockers is the fastest way to enable forward movement.
That's a HUGE if right there. In my experience, the most problematic teams I've ever worked with were simply full of incompetent and/or unmotivated engineers, and for whatever reason it was impossible to replace them.
I think the challenge with this is that blockers only get removed once and super awesome teams preemptively remove blockers before they block.
So while it's good to remove obstacles, and maybe helpful to measure whether obstacles exist, it doesn't mean much when no obstacles exist and none have been removed in a long time. Maybe the team is super awesome and chugging along thanks to all those removed obstacles. Maybe they are super sucky because they don't see things as obstacles.
Engineering activities must make sense as economic activities.
The productivity of engineering activities can be measured the same way productivity is measured for any economic activity.
See "Internal Market Economics: Practical Resource-Governance Processes Based on Principles We All Believe In",
Book by N. Dean Meyer.
The posted article came so close to the idea of the right measurements; I was intrigued and thinking "yes, say it!", but the author didn't say what I expected. Instead they pointed at quantifying calendars and commit logs. What a bizarre self-contradiction.
So what is this true way to measure productivity you allude to? By the way, engineering can be an economic activity, but it doesn't have to be. It could be an art as well, for example.
The book I referred to answers the "but the engineering team doesn't have revenue" objection.
Engineering can be art, sure, and if you like, we can even discount the idea that art creation can be seen as economic activity. But is there a need to measure productivity then?
The cost of running engineering includes engineering salaries, so cost isn't a good measure to fold into the productivity of engineers. A productive engineer is worth a higher salary, but if you count that higher salary against him, it reduces his measured value; would that in turn reduce the salary you want to pay? That doesn't make sense.
Then from the internal market economics PoV, here's how it looks:
Engineering: costs (salaries + infra) are covered by revenue (services rendered to exec team).
Executive team: costs (execs salaries, expenses to pay engineers and sales for their services, accumulating interest to be returned to investors if any) are NOT covered by revenue (which comes from customers).
I find this decomposition incredibly neat and enabling solving the right problems.
The key question is how do you quantify earnings/savings? In the example:
Future efforts to build widgets will take less time -> How to quantify?
Vulnerability exposure that could have led to disaster -> How to quantify?
What you are saying is nice and clean on paper but simply impossible in practice.
It's an investment problem for those who hold the money in the organization.
Say we have an organisation consisting of an exec team and an engineering team.
The engineering team has a bright idea: a certain undertaking will reduce the time to make one widget. They bring this case to the exec team: "It will reduce the cost of a widget from X1 to X2; the project will take time T and cost C."
If the exec team sees this investment as interesting and feasible, they make the investment. The worth of the project is simply its cost C. The exec team willingly paid, and if they got what they wanted for it, they should be happy with the "productivity". (Similarly to how you don't bemoan the "productivity" of the baker you buy your bread from.)
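A toy version of that investment case, with entirely made-up numbers, might look like this; the point is only the shape of the decision, not the values:

```python
# Toy investment case: project of cost C reduces per-widget cost X1 -> X2.
cost_before = 40.0      # X1: cost to make one widget today
cost_after = 25.0       # X2: cost per widget after the project
project_cost = 60_000   # C
widgets_per_year = 2_000

saving_per_year = (cost_before - cost_after) * widgets_per_year
payback_years = project_cost / saving_per_year
print(f"saves {saving_per_year:,.0f}/year, "
      f"pays back in {payback_years:.1f} years")

# The exec team weighs this payback against their other investment options;
# if they fund it and get what they paid for, 'productivity' took care of
# itself.
```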
Yes. Again, properly estimating X1, X2, T and C is so difficult in most cases that this strategy is not really applicable (notable example: technical debt).
Calibrated estimates with huge uncertainty ranges are still useless, even if calibrated.
"One month of refactoring this component can save us between 500k and 50M in the next five years with 90% probability" - this is not very useful for the decision maker, yet it is quite difficult for a domain expert to narrow the range.
It is possible, and in some cases practical, to spend more resources to obtain more information which narrows the range. That's what Applied Information Economics methodology preaches.
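As a small illustration of working with such a calibrated 90% interval, here is a sketch in the spirit of Hubbard's approach. The lognormal model, the seed, and the 2M cost are assumptions chosen purely for illustration:

```python
# Treat the expert's calibrated 90% interval (500k..50M) as a lognormal
# distribution and simulate, to see what the range implies for a decision.
import numpy as np

low, high = 5e5, 5e7
mu = (np.log(low) + np.log(high)) / 2
sigma = (np.log(high) - np.log(low)) / (2 * 1.645)   # z-score for a 90% CI

rng = np.random.default_rng(0)
savings = rng.lognormal(mu, sigma, 100_000)

cost = 2e6   # hypothetical cost of the refactoring
print(f"mean saving: {savings.mean():,.0f}")
print(f"P(saving < cost): {(savings < cost).mean():.0%}")

# If that loss probability is material, it can justify paying for extra
# measurements that narrow the interval before committing.
```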
Ok, but some tasks are less valuable and take more time to do, but are completely necessary. How do you value something like that? What if the "value" is actually less than the cost of the task (but it's still absolutely necessary)?
And if a task has an internal price on it, but the development time slips, to the point it exceeds the price, do you stop development?
I think this method is interesting, but I have a feeling that all you're doing is shifting the complexity around. The underlying complexity is still there (it's impossible to accurately estimate a development task and impossible to measure developer productivity). I guess that's a good thing, though - now the whole company has to deal with the complexity rather than just the dev team ;)
> What if the "value" is actually less than the cost of the task (but it's still absolutely necessary)?
This is a false premise. But it's surprising how many people seem to hold it, to their peril.
> I have a feeling that all you're doing is shifting the complexity around. The underlying complexity is still there
That's right, but do you agree this approach moves it to a place where it makes more sense, where it informs good decisions and is manageable?
> it's impossible to accurately estimate a development task and impossible to measure developer productivity
It's not impossible, but it's not something we as a society or an industry have a common fine grasp on. On this topic, I like best the books by Doug Hubbard: "How to measure anything" and "The failure of risk management: what is it and how to fix it".
It just requires yet another unusual mindset: probabilistic thinking, in addition to the above-established value-based thinking. You have to use a technique called calibrated probability assessment. We started practicing this at my workplace, and it seems to be working as intended, but we're not well calibrated yet.
> That's right, but do you agree this approach moves it to a place where it makes more sense, where it informs good decisions and is manageable?
I have no idea. I don't know enough about the approach to assess it. I like the idea of it, but I'd be concerned about the method of valuation. There's a lot of hand-waving in corporate accounting and the value of anything in a company is mostly a made-up number (enforced occasionally by accounting standards). I'd be concerned that all it does is externalise (and therefore politicise) the kind of decisions that a good CTO normally makes internally.
> It's not impossible
I should have refined my statement. It's impossible to give a definitive estimate for software development. You can definitely give a probabilistic one (which I think is what you're talking about), but that's usually unacceptable to the rest of the executive team. Educating the rest of the executive team to think in probabilities sounds harder than just telling them "no, you can't have a definitive deadline" ;)
> The productivity of engineering activities can be measured the same way productivity is measured for any economic activity.
The economy is basically a population-based evolutionary process. Its nature is open ended: things that seem important now might have been seen as useless or stupid a decade or two ago, like neural nets in 1985 or personal computing in the 1950s. So we can't tell ahead of time which ideas will become important, but the exploration process and extreme diversity are the main gain. It's all about getting to the stepping stones we don't even know we will need yet. Inspired engineering is eventually rewarded with economic success, just as in biological evolution.
So what's your point? Mind you, we're discussing an article on engineering productivity in business.
Nothing is happening without funding.
A business can't run producing things which are not valuable in the moment; it will go bankrupt.
Government can sponsor such things as it sees fit, best if guided by a clear hypothesis that funding it makes the outcomes probabilistically better than not funding it.
When nobody is paying you but you're tinkering with some stuff, then it's you who is sponsoring it with your time and other resources.
That's exactly my point: following the gradient of money won't get you to the global optimum, because it's a short-sighted signal. You need to invest in crazy ideas to get anywhere. Unknown unknowns don't get discovered in a systematic way, and they are not profitable step by step.
If your name is Christopher Columbus and it's 1492, your project gets rejected for being implausible, what do you do?
The idea of identifying and removing a bottleneck isn't new. I don't know about removing "blockers" at random; it's generally accepted that only the bottleneck is worth improving, and everything else is wasted work, since you'll just accumulate work in progress. People sometimes rebel against this because "my code is a creative process and that's for factory lines", but I've never found myself worse off after improving a bottleneck, though you do have to fight the "but everything counts" crowd.
I like that the article kind of gives up on metrics and just says each team should figure it out. This is kind of similar to the "no metrics" approach he proposes.
If I had to choose, I’d choose no metric over lines of code or number of tickets, etc.
I think the issue is that when orgs get big enough they have lots of teams and having some comparability is useful to find lessons learned to spread among teams, etc. I don’t know of any metric that is truly objective for figuring out high productivity teams or individuals just by using it.
I think there are some “vital signs” that you want projects to have, but don’t want to fixate on the actual value. Like you don’t care if someone’s pulse is 70 or 80 or 90, but you want to make sure pulse is checked.
I think having reviews in git is helpful and not having any is something to look into.
I think having contributions from other teams is a good sign, although absence isn't necessarily bad. The positive sign is others finding the project, asking questions, reviewing, or contributing material; that probably indicates reuse.
I think encouraging information sharing through lunch and learn presentations is good, but is delicate to avoid gaming from people just “making the circuit.”
Having an automated CI/CD is a good sign and if code is making it to prod without one, it requires looking into.
Theoretically a healthy team will have all these signs, but you could have an awesome team with low numbers and a terrible team with high numbers. So these metrics wouldn’t be useful for detecting productivity and comparing across teams, but would be good for just finding big, lurking problems.
I comically work in an org where there are whole teams not using source control, so "only I can measure myself, give me more money" is a very real challenge.
A lot of the questions listed heavily overlap with what I would expect to be covered in team retrospectives.
A team reflecting on progress at regular intervals will naturally bring up processes, meetings, tools, etc, and a manager or team lead can easily add these questions into the mix for reflection and/or discussion as part of this process.
The key distinction from this seems to be hidden in the line "Finally, turn all these questions into metrics" - but the article could definitely do a better job of highlighting the differences between their "solution" and retrospectives (as well as more general good management practices like talking to your team).
There are some interesting ideas in there that seem tied to their product (https://www.okayhq.com), but the article doesn't really progress far enough past the high level ideas to be practical and rather a lot is left as an exercise for the reader!
> Instead of measuring some approximation of engineering output, software teams should measure actual, observable metrics that directly correlate to effectiveness.
I prefer to measure value created and partially tie it to compensation, directly and indirectly.
For a dev team working on a trading platform, we had an index of income created by the product, approved by the team and stakeholders, that affected the total compensation paid to the team.
This is a specific example. The general principle is let tech teams make decisions and compensate them for value created, at least partially.
Otherwise, when you don't want to share your money, "measurement of productivity/effectiveness" comes in. Because when you measure money, people ask, "why am I not getting more when I make you more?" But if you're not measuring money, why do business at all?
Because measuring money is hard and it's not easy to apportion credit.
Person A has an interesting-but-not-exceptional idea and gets Person B to fund it. People C, D, and E code it.
For most interesting-but-not-exceptional ideas you could replace all of these people with others.
If you actually ran this as an experiment, in a Monte Carlo-but-real kind of way, you'd find some collaborations would do well, some would do badly, some would fail completely, and a few might explode (in a good way).
How do you quantify the value of the relative contributions?
I have to disagree with the idea that sales is somehow becoming a scientific process due to CRM. There are openly gut-feeling fields, like how a call seemed to go, how "hot" an opportunity is, etc.
Most of the ideas predate software as well. People had remarkably complex paper based sales systems.
I’m not in sales but I remember my firm having sales pipeline meetings 20 years ago where sales people would talk about stuff that happened. It was all manual and anecdotal. So the sales pipeline was only in someone’s head or maybe a spreadsheet they hacked out once a quarter or whatever.
I think the difference today that the author referenced is that, with Salesforce, stuff like the number of leads, contacts, conversions, and revenue and profit per conversion is all captured and reportable in more or less real time. There are still lots of gut inputs, like how the meeting went or having people estimate the probability of closure. But now it's more systematic.
I don’t know if it’s more successful, but it seems like I get more sales calls and repeat contact attempts.
I don't actually see a productivity measure in all this, though.
If there are more blockers, is the team doing better (because they're covering ground faster and finding new blockers faster)? Or worse (because they're being blocked more)?
If I have to report to the CEO on the Dev team productivity, do I tell them how many blockers we removed?
If you only measure blockers as an index of productivity, then two teams with the same number of blockers should always have the same productivity, regardless of how good the people are. I cannot fathom how wrong this "measurement" can be.
Yes, reducing blockers is important, but that is not the only thing the world cares about.
One thing I really like about this approach is that it creates strong incentives for engineers to write maintainable code. Code that is expedient but brittle or hard to understand will be noted as blocking progress for future changes, while robust code can be promoted as blocker-free.
Here is an idea: measure added value. Realize it's produced at the bottom and consumed at the top of the hierarchy. Remove yourself from the organization as a pure consumer of added value.
IMO a manager's productivity is measured by how well their team performs (or how well they present their team's productivity), and this article provides an approach for doing so.
So the more blockers a manager removes, the better they look to their manager.
It’s turtles all the way up.
I actually think that one's compensation is correlated with how ambiguous the job duties are and how hard it is to measure success, given employment.
Stuff that’s easy to measure gets commoditized. Stuff that’s hard to measure, but still important is hard to staff, creates risk and is easy to mitigate with money.
or...
Just use the metrics outlined in the accelerate book, which have been fairly well studied and have quite a lot of quantifiable evidence to back them up.
E.g.
Deployment frequency
Lead time for changes
MTTR
Change failure rate
A quick google reveals a fair amount of existing material:
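For teams that want to start somewhere, here is a minimal sketch of computing the four metrics from a deployment log. The record format is invented; real inputs would come from your CI/CD and incident-tracking systems:

```python
# Computing the four Accelerate/DORA metrics from a toy deployment log.
from datetime import datetime, timedelta

# (deployed_at, commit_at, failed, restored_at) -- invented records
deploys = [
    (datetime(2021, 6, 1, 10), datetime(2021, 5, 31, 15), False, None),
    (datetime(2021, 6, 2, 11), datetime(2021, 6, 1, 16), True,
     datetime(2021, 6, 2, 13)),
    (datetime(2021, 6, 4, 9),  datetime(2021, 6, 3, 12), False, None),
]

days = (deploys[-1][0] - deploys[0][0]).days or 1
print("deployment frequency:", len(deploys) / days, "per day")

lead = sum((d - c for d, c, _, _ in deploys), timedelta()) / len(deploys)
print("lead time for changes:", lead)

failures = [(d, r) for d, _, failed, r in deploys if failed]
print("change failure rate:", len(failures) / len(deploys))

if failures:
    mttr = sum((r - d for d, r in failures), timedelta()) / len(failures)
    print("MTTR:", mttr)
```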
The process of abstracting people away into units of productivity is a common approach to Software Engineering management.
It has failed every time.
When the latest management fad fails to live up to expectations, new terminology is created to hide the failed fad. The new terminology simply replaces the old, and they can sell new and improved management concepts again and again.
Software Engineering is extremely difficult. That's why successful Software Engineering companies are highly valued as unicorns. These companies have temporarily captured the right type of people, tackling the right type of problems, and delivered the right type of Software Engineering solutions. The process cannot be replicated. That's why it's so valuable and worth trillions of dollars.
There's no other Microsoft. There's no other Apple. There's no other Google.
These are unique companies with unique products that provide value to customers. It took years and decades of Software Engineering man hours to iterate until they delivered the valuable software solutions.
A lot of these Software Engineering management fads are trying to capture something that does not exist.
Productive Software Engineers are exceptional. The sooner Software Engineers understand that paradigm, the better they will be able to value themselves and the more effective they will be.
Productive software engineers aren't "exceptional". If you focus on removing obstacles, you can make most software engineers productive.
The point is learning from those who've succeeded before you, instead of claiming it's just magical chemistry happening. Or worse, starting again from first principles.
No, that won't produce product-market fit, nor does it open up once in a lifetime opportunities, but it allows teams to execute well. The belief that productive software engineers are unicorns, however, and need to be valued above all else? That's what destroys companies.
There are reasons why terms like 10x engineer, Rockstar engineer, Ninja engineers are thrown about in the industry.
Google paid Anthony Levandowski $120 million to lead their autonomous car project, before he went to Uber and into his legal troubles. Uber was paying him more than Google did.
Facebook paid $16 billion for WhatsApp, which only had around 50 engineers at the time. Facebook was buying the IP and the productive software engineering members of the team.
There are countless other examples.
No amount of management process can create productive software engineers. That's why companies recruit productive software engineers from other companies, and why companies buy other companies that have productive software engineers with proven products.
Productive Software Engineers are exceptional. Their work is worth trillions of dollars and generates billions of dollars in revenue every year.
'Evaluations of an economy's productivity performance are made using a measure of real gross domestic product (GDP), which represents the constant dollar income (labour income plus profits) that an economy generates through domestic production, with the volume or constant dollar indices being calculated from the prices'
'GDP = (GDP/Hours) × Hours,   (1)
where Hours is the total number of worker-hours.
Chart 1 depicts changes in each of these components over time. For the entire 1961-to-2012 period, labour productivity advanced at a 1.9% annual average, accounting for slightly more than half of the increase in GDP growth. The rest is attributed to hours, which increased at 1.5% per year on average.
Aggregate GDP measures the returns to both labour and capital. Distributional concerns lead to questions about whether the share going to labour increases over time and, in particular, how productivity growth is related to real income.'
The best engineers don't do much work. They are excellent at delivering the software needed to solve a problem, fast and without problems. The worst engineers are always super busy: fighting fires, starting new fires, rewriting code that doesn't require rewriting, introducing new shiny tech that will break things in new busywork-creating ways, etc.