Hey now, it does have some specific advice! For example, you should perform the "extensive reconfiguration" needed to generate code test coverage reports. Presumably, since this article is about metrics for the C-suite, we can provide this valuable information to them so they can assess "the extent to which areas of code have been adequately tested". What truly useful and not catastrophically boneheaded advice. Thank goodness we have tech-savvy, experienced contractors like McKinsey to whip our shabby tech org into shape.
We are not in the audience; the audience is any organization that needs to manage and understand the output of software engineers but does not have the necessary technical ability to do so.
If I’m understanding them correctly, their pitch is basically “pay us a lot of money to teach you some basic software engineering concepts, and don’t use dumb metrics like LoC produced per day.”
I think paying McKinsey prices to get this is a poor value, but it would probably improve most software engineering management.
> any organization that needs to manage and understand the output of software engineers, but does not have the necessary technical ability to do this
That's really key. A company actually interested in technology wouldn't be in the situation of having an engineering team they have no idea how to manage.
So the target here is the kind of company that probably ended up there by throwing money around, and would like to throw money at this problem as well. McKinsey will be there whenever there's money to be thrown at a problem.
Given that it's McKinsey, and given the current times, they're almost certainly looking to sell their "developer layoff consulting package" to provide nothing but the finest steak dinners, strippers, and ass-covering justifications for executives looking to cargo-cult big tech.
Their highly sciency formula - number of commits divided by SLOC of the average pull request, plus average response time on Slack - will scientifically segregate your company's developers by productivity.
I do hope we see a follow up on Hacker News - how to game the fuck out of McKinsey's top secret developer productivity formula.
> Well, I read it and still don't know how they measure software developer productivity.
"Assessing contributions by individuals to a team’s backlog (starting with data from backlog management tools such as Jira, and normalizing data using a proprietary algorithm to account for nuances)"
Using nonsense metrics like "data from [Jira]" apparently.
Though the other columns in the individual row - including developer satisfaction, retention, and daily interruptions - might be slightly more meaningful, since they evaluate the work environment rather than trying to evaluate individual developers.
I would compare these metrics to draft rankings in professional sports. It is a beauty contest to identify a player/developer that looks the most like the player you admire.
What real teams and coaches do is say: we have a specific way of doing things. Everyone has a role, and a person in that role is expected to do X, Y, and Z, while a different role is expected to do U, V, and W.
This is what I object to most in modern tech. It treats technology stacks and developers as interchangeable, which is far from the truth. It is this factory mindset that, I would argue, is creating garbage software.
I was ready to dismiss all of this, but it's actually pretty interesting.
The big-picture view is given by Exhibit 1, and you can see it's really much more about analyzing all aspects of productivity (company-level, team-level, etc.) to identify bottlenecks - e.g. too much time spent testing instead of writing code, or senior developers stuck in endless cross-team alignment meetings, which means management needs to designate a single person responsible for making final decisions quickly. So this is a document for engineering managers, not engineers.
But in terms of measuring actual individual developer productivity, if you read through all the verbiage it's really mostly just adding up the "planning poker" agile/sprint points that a developer delivers over the course of months. About which I have two main thoughts:
1) As long as planning poker is executed correctly -- i.e. always with unanimous consensus from engineers and without managers/PMs applying pressure -- this is actually probably going to be decently objective, as a moving average. It's not terrible -- it's infinitely better than things like "lines of code". Although teams can also experience gradual "points drift" over the course of months, and points can't be translated between teams, so it can't be used company-wide or even year-to-year. But it will compare productivity between team members.
2) BUT, there's a huge risk involved because not all e.g. 5 point stories are the same, because of variance. Some are very straightforward where you know it's 5 points, while others are much more unknown, where the best guess is 5 points -- and it might wind up being 2 points but it also might wind up being 50 points. And so I could definitely see certain stories turning into "hot potatoes" nobody wants to touch because it's just too much of a risk -- it'll tank their average and they won't get promoted. And then everything just breaks down. Which means you either need post-sprint processes for "correcting" estimates and assigning a reasonable value for points done, or going deeper into "research" stories -- e.g. a 1-point pre-story to better nail down the actual size.
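To make point 2 concrete, here's a toy simulation (not from the article; the effort distribution is entirely made up) of how estimate variance poisons per-developer point averages. Every story is "worth" 5 points, but the real effort occasionally blows up, so two equally skilled developers can post very different velocities just from which stories happened to explode on them:

```python
import random

random.seed(0)

def simulate_developer(n_stories=30):
    """Simulate one developer's 'velocity' over a quarter.

    Every story is estimated at 5 points, but (hypothetically) the
    real effort varies widely: most land near 5, a few blow up.
    """
    total_points = 0
    total_weeks = 0.0
    for _ in range(n_stories):
        estimate = 5
        # Real effort: usually close to the estimate, occasionally 10x it.
        real_effort = random.choice([2, 4, 5, 5, 6, 8, 50])
        total_points += estimate          # credit is the *estimate*
        total_weeks += real_effort / 10   # time spent is the *real* effort
    return total_points / total_weeks     # "points per week"

# Five identically skilled developers, five different velocities,
# purely from luck of the draw.
velocities = [round(simulate_developer(), 1) for _ in range(5)]
print(velocities)
```

If promotions ride on that average, the rational move is exactly the "hot potato" behavior described above: avoid any story whose variance could tank your number.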
I've always held the belief that story points are specific to a team - a result of the work that specific team does and/or the services, products, and customers they support - which is why comparing across teams is a huge no-no!
But story points themselves are just guesses, and are never a true reflection of the work undertaken post-mortem - and they rarely, if ever, get reevaluated with 'correct values', mainly because that's really difficult to do; it would just be another guess anyway.
I think any attempt to measure the accuracy of guesses made in imaginary relative units - specific to a team, deliberately obfuscated, and difficult to reconcile - is futile by definition.
Instead, companies have started to see the light of measuring in standard units of time, tracking tickets from start to finish including their intermediate states - flow metrics.
Which is all fine; however, you still have the original problem: unless you strictly compare very similar tasks, perhaps done by engineers of similar tenure, skill level, and maturity (and maybe other factors), you still can't directly compare tasks across teams.
There still needs to be consideration of the makeup of a team to get an understanding of skill level and experience.
I've personally witnessed teams with graduates in them and people with 20 years of experience completing very similar tasks, and the difference is astonishing.
Step 1: We are going to estimate tasks in imaginary relative units
Step 2: We are now going to use these imaginary units to measure your velocity
Step 3: We are now going to compare your progress in imaginary units to other teams who use different imaginary units, and complain that you're underperforming
Then you get imaginary-unit inflation, and before you know it, everything is an XXXXXXXXXXXXXXXL t-shirt size.
SV companies function as cults built on the same principles as pick-up artistry and other scams, but institutionalized and sanctified with language. Recruiters are professionally engaged in love-bombing tactics, and managers are professionally engaged in negging. It's all a game to get as much psychological leverage over people as possible. That's also why they like to get 'em young.
If you put a productivity metric in place, teams and individuals will optimise to increase the score of the metric. That’s what you want in general, but it is completely incompatible with planning poker. Why?
The same people are responsible for the estimate as for the delivery. The obvious optimal strategy for a team is to continually inflate all estimates to demonstrate improved velocity. At the individual level, it is to always take the most over-inflated story in order to demonstrate maximum individual contribution.
You get perverse outcomes with all metric-based productivity measures (for example, in retail sales, having friends buy large volumes and return them to another store). You can prevent this kind of thing to a degree through various sticks, but with "planning poker" the gaming is the antithesis of the desired goals. Measure in a different way.
Hm, accidentally hit the submit button before the reply was fully done. Anyway, I think story points completed, tickets done, pull requests submitted, or even number of commits works somewhat well for measuring developer productivity, as long as nobody is aware of it. The least productive developers I've worked with did tend to have very few commits compared to more productive peers, but obviously that would quickly change if they found out it was a metric being watched. I see no reason why story points wouldn't cause similar issues: fighting for the "easy points", less collaboration, etc.
As I kind of expected, a lot of general statements I mostly agree with, but very light on specifics and no concrete data at all. “276% better!” does not mean anything without knowing how they define better.
> leaders and developers alike need to move past the outdated notion that leaders “cannot” understand the intricacies of software engineering
Leaders absolutely can understand software engineering, if time and effort to learn is spent. McKinsey themselves say this. So their argument is 1) if you learn about software engineering, you’ll be more capable of evaluating the efforts of software engineers and 2) don’t use idiotic quantifiers of productivity like LoC per day.
It's easy to measure manager productivity. They create nothing and are the least productive people in an organization. They're poor decision makers, too, never learning from experience and relying on gut instinct, as if they're hunting in the woods. McKinsey has created the least value for society of any consultancy, as it has catered to managers, who produce nothing.
Nope. As an ex-McK, the first thing to do is check who wrote this, knowing that the senior figureheads are just there for their names; it was written by a lowly Analyst or EM.
Disagree. I'm ex-McK too. Whether a senior person wrote it is irrelevant. Just ask whether it is good or bad advice. There are lots of smart mid-level people (and some not so smart), and some senior people are really good while some are not. My view is that it is all about the particular person.
I’ve since been responsible for tech teams as leader of larger businesses.
So how do you measure productivity? I see no definition of "Developer Velocity" in this article or the linked ones, nor do I see an explanation of how it's measured, nor do I see any real attempt to prove its validity. They simply assert that they have a meaningful metric, which is begging the question.
>generative AI tools such as CopilotX and ChatGPT have the potential to enable developers to complete tasks up to two times faster
I'm surprised McKinsey is willing to endorse ChatGPT, which is one of their major competitors in the information-free drivel market.
Overly complicated and hard to understand bullshit, but I guess this is what they specialize in.
I really believe the team in general can be held accountable as a whole. The team knows who isn't performing and can ask for them to be removed. If the whole team is not meeting deadlines, it should be disbanded.
Do companies want the trust of their employees or do they want productivity metrics? Because they can't have both. And they should certainly prefer one over the other if they want to succeed.
Did the person who created "Moneyball" understand baseball? Would you expect someone to be able to leverage those concepts without understanding what they mean, or without the insight to interpret what they say about the player?
This is a similar thing. If you have the ability to understand how to interpret the metrics, the metrics can be useful. If you can't interpret the metrics, you will use them poorly and the result will not be a better organization.
Remarkable! You could absolutely game the heck out of this. It's so complicated that no manager could figure it out, and that would reduce it to the TPS form level.
Good stuff for the next bullshit bingo template though, I didn't have "Developer Velocity Index Benchmark" yet.