I did some trivial math. Redis is composed of 100k lines of code; I wrote at least 70k of that over 10 years. I never work more than 5 days per week, and I take 1 month of vacation every year, so assuming I work 22 days a month for 11 months:
70000/(22*11*10) = ~29 LOC / day
Which is not too far from 10. There are days where I write 300-500 LOC, but a lot of work went into rewriting things and fixing bugs, so I rewrote the same lines again and again over the course of years. I still think that should be taken into account, so the Mythical Man Month book is indeed quite accurate.
However this math is a bit off, because over those 10 years I also wrote quite a number of side projects - but still, max ~50 LOC / day.
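For anyone who wants to replay that arithmetic, here it is as a quick Python check (only the numbers from the comment above are used):

    working_days = 22 * 11 * 10        # days/month * months/year * years = 2420
    print(70_000 / working_days)       # -> ~28.9 LOC per working day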
From the sounds of it that also includes a decent amount of greenfield work.
I don't have hard figures to easily consult, but I'd guess I'm at about your average in total. On the days when I'm refactoring/debugging existing stuff, though, it could honestly be like 3 lines a day, or 5, or just planning/sketching something out.
It's like the old mechanic trope. It's not hard to replace a bolt, what's hard is knowing which bolt to replace and where.
This "LOC as a proxy for productivity" metric seems so much harder to measure in a useful way on brownfield work.
On my most productive (in my own estimation) brownfield days in recent memory, the codebase would generally shrink by several hundred LOC.
I also find that a huge factor in my code production rate on brownfield projects might not have much to do with me, because it's factors like, "Is the code well-documented, easy to understand, and backed by tests that make the intended behavior clear? Or do I have to start by burning days or weeks on wrangling with Chesterton's Fence?"
And, on the other side of it, when is documenting and cleaning my own code to guard some future maintainer from that situation vital, and when am I burning a day of my own time to save someone else only an hour in expectation? All I know for sure in that situation is that, if my manager is assiduously counting LOC, ticket close rate, anything like that, then game theory demands that I should never bother to spend an extra hour on making it more maintainable if I expect that the cost of that decision will be borne by one of my teammates. The 10X rockstar developer at a previous team of mine taught me that lesson in a rather brutal manner.
> I should never bother to spend an extra hour on making it more maintainable if I expect that the cost of that decision will be borne by one of my teammates.
If you want to be a team lead, though, or even just have people follow your lead, I find that not only do you want to worry about these costs, you need to talk openly about them and be seen addressing them. Most devs follow the ones they trust, no matter what title they have.
On all the projects where we tried to build people up instead of just getting shit done, we were consistently getting more shit done at the two-year mark, if not sooner. Any idiot can ship a version 1.0.0, but it takes some talent (and luck) to ship version 2.3.0.
From what I’ve seen, Postgres followed a similar model, and if you look at the performance benchmarks over time, it has progressively narrowed the gap with each major release. That kind of momentum is something worth sacrificing for.
I am about to rewrite something (different cloud, blabla) and if I do a good job I think the program’s LOC will go down by about 25%, the unit testing LOC will go up by maybe 50%, and maybe 50% of each will be the same old logic while the other half will be new logic.
I have no idea how I would try to count that if I wanted to measure “productivity.”
LOC is a terrible measure for productivity, but I like to think about it more as a measure of capacity. LOC/day is useful as an upper bound in the same way pages per day is an upper bound for authors. Stephen King is notorious for being one of the most prolific writers, and he can’t produce more than 8 publishable pages a day (top hit on google suggests his average may be close to 6). Knowing that the number of LOC/day is so low on average, can really help keep estimates honest, and remind us how truly difficult what we do actually is.
Isaac Asimov, another prolific author, averaged something like 2800 words per day[1] over the most productive period of his career, which works out to about 9 pages, so that seems like a good estimate.
1: Google is giving a much higher number but they all seem to go back to the same estimate which is more hand-wavy than a printed source I found in college when researching it. Sorry I don't have it to source. The other number from that source was 1800 published words per day if you start from his first published book, which is absurd, since new authors tend to be much less prolific.
Interesting, I'd never thought to look much at other authors' output on this topic. There was a post the other day talking about Jules Verne's corpus, and I now wonder what his output was like.
I've another thought on the topic as well, which is that I have a friend who hired a lot of former newspaper journalists circa 08 when there were mass layoffs in that industry. He said they were able to most consistently churn out content for his book over other writers, but that it was significantly lower quality, and needed a lot more rework and polish. For him the tradeoff was worth it, and he organized his time and resources around this increased quantity. But this adds to the idea that you really can't look at pages/day (or LOC/day) on an individual basis either, since it's the whole pipeline that matters. And if you really want to improve total output, you need to look at variables outside of individual contributors. Things like code reviews and QA can perhaps greatly increase total output.
LOC is a poor way to measure progress, but it's not a bad sanity check on time estimates for a proposed project. Most experienced programmers have some points of reference where they know the approximate LOC count, and can make a rough functionality/complexity analogy to a proposal. If the proposed timeline is shorter than the time the team would need to generate a comparable amount of tested code at 10-100 lines/day (wherever your team falls in that range), then you should probably revisit the estimate.
Apply a few of these kinds of comparisons using different metrics, and you may be able to improve your estimates.
"Measuring programming progress by lines of code is like measuring aircraft building progress by weight."
I'm no expert in aircraft, but I'm guessing that in both cases the relationship between progress and the metric in question is logarithmic: the first bits to be put in place represent the bulk of the (weight|LOC), but only a relatively small percentage of the overall time and effort.
Has progress on that been fairly continuous with the project growing steadily (almost) nonstop? Or was a big portion of that the groundwork just to get it to a point of being usable in the beginning?
With projects I work on, I'll often write a few thousand lines of foundation in a couple of weeks, then I'm adding a line here and there as needed. The first 1000 lines are always easy. The next 10 can take days.
I think this stayed fairly constant. The main change is that in the first years I could do a ton of work one month and zero the next, whereas now it's smoothed out more evenly.
And if you have to bounce around across different codebases, you obviously go a lot slower.
If I'm pounding out the same boilerplate code I've written for every greenfield app, I can go at a phenomenal speed. But if I'm put into code I'm not very familiar with, 98% of my effort is understanding the existing code and 2% of it is making that 1-10 LOC change.
Over time, over multiple iterations on multiple versions on various platforms (maybe excluding tests), this seems about right.
I remember when I was prototyping an OpenGL thing on SunOS or a Renderman on same (maybe Solaris?), I was working ALL of the time and cranking out LOTS of code. Then refactor-refactor, fix technical debt, slowly add features without breaking, more automated tests, more platforms, and then, tada! My effective rate of coding (measured by LoC) was depressingly low.
I guess that I'm happy that it appears that I'm effective and relatively efficient, but I'm not the LoC-cranking-out machine that I thought I was. Sobering.
What annoys me most about these metrics is that some days zero lines are written. Anything up to a month without results to show.
Where, then, does all this time go? Sometimes it's reading existing code. Sometimes it's learning about a new algorithm by reading blogs and papers. Sometimes it's developing test programs to iron out a bug or test out some new code.
There used to be one chap in the office that got all the hard problems - the seriously hard problems. Some of this was figuring out why USB couldn't transition from low-speed mode to high-speed mode reliably (USB is quite hard to probe due to frequency), or figuring out why the application crashed one in a million boots.
Some of our most valuable developers committed the least amount of code, but saved our arses more times than I can count.
That’s fundamentally a lack of respect for the engineering aspect of software systems and a sort of self-loathing embraced by people in the field.
Many software roles require what I would call Home Depot skill levels. People go to Home Depot, take semi-finished materials in a kit, and fix their toilet without understanding how it works.
Likewise, some journeyman-skilled developer can “code” a sign-in page with an API without understanding the engineering process around OAuth.
The problem is many business people don’t understand anything beyond the Home Depot kit... they see stuff on the shelf and don’t understand that at some level the engineering side of the work needs to be done to create something novel. Reinforcing that notion are vendors hawking products.
As someone with Home Depot skills, I 100% agree. I really wish that there was a common distinction. I am not the right person to solve a novel or complex engineering problem. I am the right person to build a product that won't require solving a novel or complex engineering problem. I probably shouldn't be paid like the former, nor should I have to have the qualifications of the former to land a job for the latter.
I think there’s a further subdivision of “hard” which is the fundamental research stuff that pushes the boundaries of CS. Then there’s business problem stuff that’s hard because of scale, surface area and general messiness of the real world. Although the IC salary peaks might not be as high, there is more money overall in the latter, and it’s not as much about raw intellect as it is about moving up and down the abstraction layers, thinking things through and translating technical trade offs to laymen.
I think this is an interesting analogy. Taking it further - what's great about Home Depot-level skills is that they are often sufficient for me to do routine basic maintenance. This is in part because many things have become simpler and are designed to be easily/cheaply replaced. I think the same could be said for software. That's generally a good thing and was an intentional movement in the industry.
That said, I probably don't want to be building a house from scratch with my level of skills and should hire someone with specialized knowledge. Likewise, it's an important skill to know when you will be in over your head and need to hire someone to get a job done correctly.
I'm another mostly "Home Depot" coder and can glue all kinds of things together without really having to dig deeper. Maybe I could go deeper if I needed to, but that's not what my job demands or requests of me, and what they need is the Home Depot code that bolts all their existing systems together.
I think those of us in roles like this can actually bang out a lot more LOC than somebody working on lower level problems, because we aren't solving hard problems, we're using basic data structures and tossing them between (usually/hopefully) well documented interfaces. If that's the case, LOC is just about the worst metric you could imagine.
Code is both asset and liability; the asset is the feature set, while the liability has an interest payment in the form of maintenance.
The way you put it, you're optimizing for only one side of the books. The fact is that the value in a company is not in minimal clean code; it's in a recurring revenue stream, and ideally profits. Provide the most value with code which has low interest payments. Everything else being equal, smaller code has lower interest payments, but everything else isn't always equal. And depending on cash flow and market opportunity, maximizing value and to hell with minimal clean code - throwing money & devs at the problem - can make sense.
The distinction here is between code that's clear and concise and code that's hacky and confusingly compact. Few people would recommend trying to pack a 4-5 line function into a super complex and confusing one-liner, but it can be reasonable to collapse a 10k-line class into a 20-line function. It's on us, as the developers, to make that tradeoff.
I think the spirit of the comment you replied to was closer to the "clear and concise" methodology rather than the "as short as is humanly possible" methodology.
Only the few people that don't know anything about the map function or generator expressions and prefer messy imperative code where off-by-one errors are a given, if you want my opinion.
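To make the contrast concrete, here is a throwaway Python illustration (the numbers are invented; nothing here comes from anyone's actual codebase): the concise version is shorter and has no index bookkeeping to get wrong.

    prices = [120, 80, 250, 99, 400]

    # imperative version: the index arithmetic is where off-by-one bugs live
    total = 0
    i = 0
    while i < len(prices):
        if prices[i] > 100:
            total += prices[i]
        i += 1

    # generator-expression version: same result, nothing to miscount
    total = sum(p for p in prices if p > 100)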
At one job (porting a colossal legacy UI to Windows), I deleted thousands of LOC every day for months. Coworkers called me "the decoder." 25 years later I'm probably still net negative.
This issue can easily be fixed by switching from delta LOC to the size of the git diff (number of lines changed). The big problem with this strategy is the huge difference between 10 lines of carefully engineered algorithm code and 10,000 lines of blah API calls and boilerplate. I can write API calls and boilerplate as fast as I can type.
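As a rough sketch of what "size of the git diff" could mean in practice, something like the following counts added plus deleted lines per author instead of the net delta (Python driving plain git; the author name and time window are placeholders, and this is only one way to slice it):

    import subprocess

    def lines_changed(author: str, since: str = "1 month ago") -> int:
        # Sum added + deleted lines for one author via `git log --numstat`.
        out = subprocess.run(
            ["git", "log", f"--author={author}", f"--since={since}",
             "--pretty=tformat:", "--numstat"],
            capture_output=True, text=True, check=True,
        ).stdout
        total = 0
        for line in out.splitlines():
            parts = line.split("\t")
            if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
                total += int(parts[0]) + int(parts[1])  # binary files show "-" and get skipped
        return total

    print(lines_changed("Jane Doe"))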
Not the GP but... extract a cross-cutting concern from a system into an aspect and you get this easily.
I once worked on a product and identified a way to replace 100K lines of poorly written, inconsistent tracing code with a robust ~250-line file using AspectJ. Management threw a sh-t fit and thought the risk was untenable.
AOP tools that effectively rewrite the app have incredible amounts of leverage. That can work to your benefit but it's also an enormous footgun if you get your aim wrong. It leverages up both cleverness and stupidity.
The risk of the new one or the risk of keeping the old 100K lines? Half serious question since I would estimate the risk of the latter to be much larger.
To me it sounds like he was introducing a hard dependency on AspectJ, which is as much a risk as any other dependency. I am guessing here, but it is a scenario where a hissy fit from management has at least some justification.
It is just as much of a liability, a priori neither more nor less. It needs to be evaluated like any other potential new dependency.
Plus, AspectJ is something that you have to be careful with. It injects code at the start or end of methods that can do arbitrary things, and the method source code doesn't indicate that this is happening. So it has a great potential for code obfuscation.
Sort of unrelated rant. Maybe it’s because I’m not as well versed in Java idioms as I am with C# idioms, but code that implements AOP with AspectJ seems much more obtuse than what I’ve done in C#, just looking at the examples.
In C#, with the various frameworks - including on the API level with ASP.Net - you can use attributes to decorate the classes/methods with your aspects and it’s pretty easy to see what it’s doing.
You get the runtime binding basically by just using dependency injection as you always do.
C# dev here as well, but from a Java background. When I first moved to C# from Java one of the best AOP usages was transaction management. Database transaction management. You could write all of the code, whether it was dependent upon the db or not, and then decorate the methods with a transaction attribute. This decoration contained all the logic to get a db connection, begin a transaction, become part of an existing one, or create a new isolated one. Any unhandled exception caused the final unwinding to rollback any work that had been done in that transaction. So many try/catch/finally's avoided and so much boilerplate code.
I have yet to find any equivalent to this in the .NET world. Especially if you're using EF. Either you use ADO and have your try/catch/finally with manual transaction management, or you have the EF context, which is just one big blob you hope succeeds at the end.
Yes, this is exactly the type of boilerplate I am talking about. All those usings and try/catch blocks which add needless code. It is possible to compose all of this into an aspect which then decorates your methods. Maybe I'm not being clear, so here is what I mean. Say you have a method that does some work, but calls some other thing to do some auditing. The auditing is nice, but it failing shouldn't halt the world.
The TransactionScope is handled in the aspect. Commit/Rollback is all handled there as well. There are no usings or exception handling within your methods unless you want to handle those specifically.
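For illustration only, the shape being described translates into something like this minimal Python sketch (sqlite3 is used just so the example runs; it stands in for whatever TransactionScope/EF would be doing on the C# side):

    import sqlite3
    from functools import wraps

    def transactional(func):
        # commit/rollback lives in the decorator ("aspect"), not in the method body
        @wraps(func)
        def wrapper(conn, *args, **kwargs):
            try:
                result = func(conn, *args, **kwargs)
                conn.commit()      # body finished cleanly -> commit the work
                return result
            except Exception:
                conn.rollback()    # any unhandled exception -> roll it all back
                raise
        return wrapper

    @transactional
    def record_audit(conn, message):
        # no try/catch/finally boilerplate in here
        conn.execute("INSERT INTO audit_log (message) VALUES (?)", (message,))

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE audit_log (message TEXT)")
    record_audit(conn, "user logged in")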
There should only be one using/try catch block on the highest level. All of the methods being called within that using block should just throw errors.
You could put the logic in attributes but I don’t consider a transaction a cross cutting concern. It would create a spooky action at a distance situation.
It's obviously highly dependent on the domain in which you work, but I would consider what you're saying to be a "business transaction" more than a "database transaction". If there is a 1:1 between the two then your way works. I tend to have situations where one business transaction is multiple database transactions. And the business transaction can succeed even if some of the underlying database transactions were to fail.
It was a C# file for an API that wrapped thousands of reports with a function call for each report; I moved to a design with a single function for all reports.
I think what had happened is somebody had designed the file and everybody else followed suit, patching stuff on - the entire codebase for that app was well below average. They had front-end devs who didn’t know any JavaScript. In 2016. I lasted 6 months before I nope.png’d the fuck out.
It’s still not the worst application I’ve ever worked on though
Not the parent, but I once inherited a bunch of tools that used the same tracing function. Akin to dtrace. 6 people wrote 6 tools over different domains all with their own post-processing, filtering, formatting, etc...
It was a support nightmare, so we built a common library and collapsed the code base by 70%. Each tool was probably in the -8k ESLOC range. Thankfully it wasn't C++.
I once had a senior manager who insisted that developers make at least one commit a day (an internal GitHub-like tool gamified this: number of lines committed since last month, top committers in the team, etc.), and those that didn't had to up their game.
It was frustrating to say the least as this was not the only metric. There were a handful and, frankly, many made a mockery of it by doing just as much or less work than before but achieving or even surpassing the said metrics.
At my last Java job, whenever there was a day I didn't have much or any code to commit (e.g. I was in the middle of going through compsci papers about a nontrivial algorithm I intended to implement), I would open up the IntelliJ's "code inspections" tab. It provided me with a never-ending stream of quick and small fixes to make that not only amounted to a commit, but also occasionally fixed an actual (if not likely) bug.
It's quite normal for IT helpdesk teams handling all the users' password-reset tickets to "outperform" the teams fixing server issues, based on "tickets closed" metrics.
I know I'm against popular opinion to the extreme here, but I'm not altogether against that (the commit every day bit, not the LOC competition bit).
Few reasons:
1. I find devs (including me) tend to do too few commits instead of too many. Smaller, tighter commits are better, but it's really tempting to try and do an entire feature in one commit.
2. If someone has spent a couple of days working on something without committing it, I'd be concerned that they're stuck, or spinning wheels. I'd check in on them. Not in a bad "you're not working hard enough" way, but in a "do you need help?" way.
3. If someone often spent more than a day without writing anything they could commit, I'd check in on them. Again, like 2. above, not in a "dammit work harder!" way, but maybe it's an indication that they're getting handed the really hard problems, or that they need some more training, or that they're going through some personal stuff, or something.
but measuring lines in each commit is pointless/futile, as is measuring number of commits in a day.
I can sort of see the rationale for this, but it's been a long time since I've worked anywhere where this was feasible even 50% of the time; nobody wants fragments of a feature in trunk, and the codebases tend to be inflexible. "Commit to branch", sure; branches are cheap.
> If someone has spent a couple of days working on something without committing it, I'd be concerned that they're stuck, or spinning wheels. I'd check in on them.
This is more reasonable. Working without review leads to worse fiascoes the longer it goes.
If you do work-in-progress (WIP) branches where you're not concerned with breaking the build, and then rewrite the history afterwards so it doesn't break the build (which counts as new commits), this becomes more viable.
It also encourages testing work to be formally coded rather than just done interactively. If a test is good enough to compile, commit it as work in progress, even if it has a bug and/or is incomplete, whenever you've reached a good point you'd like to build upon. Make a tweak, it compiles, commit it.
If you've spent most of your day writing a test, but it's complicated and will require even more work before you'd even try to compile and run, commit it WIP before going home.
The only place I've ever been at that had any sort of lines-of-code measurement also tracked lines removed. This was on a points system, and in fact removal of code was considered worth more points as long as it didn't break the build (there was also something about points being removed if the change was rolled back or the lines you had written were changed again within a short time).
There were also some points for removing TODOs etc.
It worked pretty well because there was only a team of 4-5 people on any product at any time, so someone just removing TODOs and not fixing the issues pointed out by the TODOs would have been caught.
Fundamentally, that’s why SLOC can be useful as an estimating metric, but terrible as a control metric. SLOC, FP and so on all have their limitations, but they demonstrate that most of the effort-time in a project doesn’t go into putting hands on keyboard. Conversely, trying to monitor developer productivity with SLOC simply reintroduces the conceptual error that the estimation effort attempts to prevent.
Goodhart's law - "When a measure becomes a target, it ceases to be a good measure."
I used to work for a company that bills their customers for dev hours spent. The software they put together worked fabulously well - in the production of billable dev hours.
Whenever I spend a week without progress I feel like dying.
In college I was used to being able to churn out an immense amount of code. Even if most of it was useless, I'm not well adjusted to long unproductive periods.
How did your manager react to these times? No remarks? Nagging? Trusting?
College coding is very misleading. It's usually clean-sheet, by yourself, with idealized requirements and few dependencies. It's rarely robust or tested extensively, and its lifespan is short.
It's also a hell of a lot of fun... which is why it's not really what you get paid for. What you get paid for is the long, tedious slog of the real world: maintaining existing business logic, teasing out user requirements in a domain you don't really understand, dealing with other developers who have different preferences and skill levels, doing variations of the same thing instead of exploring new domains and technologies. You spend a lot of days in meetings that should have been emails.
It's not all drudgery, and it's both more fun and better paid than 99% of the jobs in the world, but it's not picking the wondrous low-hanging fruit that you did in college.
This is also why I prefer to use projects inspired by real projects at work to test out new programming languages or technology. It's really easy to make something look good by just ignoring the messy realities of the real world. It's a lot harder if you're doing an experimental rewrite of a system with real world requirements attached to it.
Progress is measured in more things than code written. Define progress using the right metric, i.e. stuff learned, and the feeling of progress and your motivation can be preserved.
For me, it is really a top down approach. I can work on goals that take years to accomplish. But the key is to break them down into smaller and smaller bits until you have work items that show progress on a small enough scale to be easily observable. And part of this is sometimes research, so I can't measure myself in terms of code or features. But each task usually has a way to define progress.
Good point. I do agree with you for the most part. But in my limited work experience it was never discussed nor shown, which led to... well, a lack of leadership. And ultimately deep anxiety.
Do most jobs have a team chat to talk about it before going into the actual work?
Yes. In 'Agile' development parlance, you would have an estimation session, where the group will look at the units of work to be assigned, and determine how easy or hard they might turn out to be.
At their most objective best, everyone on a given development team can gain some insight into the work of others, and how hard it might be.
These sessions can also be a great way to share knowledge, as developers with different levels of experience and specialisations collectively examine high-level goals.
In that, you all have a fair opportunity to either share opinions on the best way to achieve a task, or just merely learn something from someone else about tools or techniques you're unfamiliar with.
And for insightful managers, it's also a great opportunity to communicate high-level aims and objectives, and occasionally, also break those objectives down, transparently, and explore them.
At worst, estimation sessions can be used as a tool to bully dissenting or inquisitive coders.
Even in their least positive guise, collective estimation sessions are still valuable. At the very least, you have the opportunity to agree, as a group, on what is and what isn't going to take 10 or 1000-odd SLOC. You'll also have a better idea (if only slightly better in some cases) of how long that n SLOC will take to write.
It's surprising how few managers value objective estimation. But the problem I suppose, is what it does to the working relationship the rest of the company has with their software development team.
Basically, to allow a team of developers such an 'indulgence', every worker in a given business, including senior managers, has to accept that all interactions with the development team are led by the development team.
That takes a lot of trust, and you'll rarely find that level of trust outside of a startup.
All metrics can be horrible. To take an obvious example, we used to repeatedly see the temperature on one cold day being quoted as proof that global warming wasn't happening. So clearly the temperature must be a horrible metric for global warming, right?
It is of course the main metric for global warming, but it can be used badly or very well. Just like Lines of Code, it's hard to even get the measurement right. Do you measure it in the sun, or the shade? Do you measure it in a city, which is relevant to where most people feel the effects, or in the country, so you get a repeatable environment? Similarly, does LOC include comments and blank lines, and what about patches - how do you count them? In terms of LOC per day, do you measure a single person who is churning out the code, or the entire team including the designers and documenters, and do you include the time after the project is completed doing support because of bugs?
I don't think you can blame the "temperature metric" for the bad ways it's measured or used. And I don't think you can blame Lines Of Code for all of its bad outcomes either.
Not quite sure what point the author is trying to make? He agrees that lines of code are a bad measure of productivity, yet claims that the average he computes can help him predict future development times. Then he explicitly points out that different parts of the codebase required different amounts of work, apparently unrelated to their line count, yet does not relate this to the previously mentioned points. Also, what is up with the random comment about code coverage at the end? That doesn't fit in with the rest of the article either...
I think they're stuck with the same problem as every dev manager. LoC are pointless/futile/etc., but they're really easy to measure, and they've got to measure something, right?
Like measuring productivity by how many hours people spend at their desks. Utterly pointless, but really easy so it becomes the default measure.
Trying to explain to professional managers that there is no foolproof way of measuring developer productivity is a really hard conversation that I've had more than once. I'm assuming the OP's target market is exactly these people, so I don't really blame them for succumbing to the pressure.
Even the assertion that knowing the lines of code per day helps with estimation seems puzzling. How do you know in advance how many lines the finished product will eventually have?
My current task is to implement a feature and I wrote about 50 lines during the last 2-3 weeks. I'm not slacking off, just examining the codebase and planning. I did collect about ~25-50 lines of notes per day though, so it's not like I'm keeping everything in my head. To be fair, the codebase is new to me, so I don't expect to keep this tempo for very long.
If I had been pressured to start writing code immediately it would have been more difficult to comprehend the codebase, thus slower, or even worse I would have introduced anti-features or bugs.
I once opened up a government code base (in C) that led with this: "void * * * * x;". It took a while to learn that code. Taking time is fine, and good managers will support you, so long as you have something to explain what your time is spent on.
I'm consulting a client right now and their C++ codebase is full of gems of this caliber. I eviscerate every single PR, and tell them what's wrong and why, and then their CTO just merges them into master without changing anything. All the while they pay me for it. Why? Fuck me if I know. They hired me to provide advice. I provide advice, and it's pretty expensive. I _know for a fact_ they will regret merging this shit. I can explain why, in a way a toddler would understand. And yet they still do it. Smh.
You might be going into conflict/criticism mode, which would make them unresponsive. Being right isn't enough to get people to change. I don't know at all if this is the case, just mentioning it, because you seem to be frustrated by the situation.
I actually asked them if they want me doing this. They said they did. This is only a minor part of what I'm doing for them - my work is mostly on the AI side.
'This code was already worked on and we paid for it... it can't surely be that bad'
If you deliver something broken it's better than delivering nothing... you'd be surprised how often we get 'We'll improve it later' compared to 'You didn't deliver anything?'
But it wouldn't even take much effort to fix. It's not like all of it has to be rewritten or anything. And once my contract ends, they're kinda fucked - they don't have, and aren't going to be able to hire, anyone with my C++ and multithreading/concurrency expertise. If I were in their shoes, I'd take my advice like word of God while it's available, and try to learn to write better C++.
Are your reviews more like suggestions of how to do things better, or more like "this sux and here's why"? If any of your criticisms are related to something that a linter or static analyzer could catch, have you tried to get that tooling established in the project?
I am, of course, diplomatic. It's more along the lines of "This is not thread safe, if threads A and B call this function at roughly the same time, you will end up dereferencing a null pointer. Here's how to avoid this: <code example>". Tooling would catch some of this, if they were running it, which they aren't. But it would not catch bad design.
That's a pretty big range - but I'd say frontend code tends to be faster / take up more lines. Center a button, give it a colour, make it responsive and you've probably reached your 30 lines of HTML/CSS :P
But the fine-tuning, bugs and corrections, reported issues, and other aspects should be part of the metric too. The days where I barely progress because some alignment is not working and coordination with the design team is taking up the time should be part of the average as well.
10 lines per developer day - this is so outdated it should be disregarded entirely. Not much is gained from writing articles around this "10 lines a day" assertion unless you're writing something about computing history.
Maybe this was true when people were writing operating systems in assembly language - which is the time and context in which The Mythical Man Month was written.
Lines of code per day really is a pretty meaningless measure, but having said that, there is at least some truth to it, in that any developer who is writing a typical 21st-century application and getting only 10 lines of code per day written should really examine whether they are in the right job.
People misunderstand it. It means writing code, documentation, testing, bug fixing etc.
I'm not sure it is that far off over several years. If you look at Google and say there are 52 weeks * 5 days * 10 lines of code, that's 2,600 lines of code per developer per year. Extrapolate that over 20 years, times however many developers, against the code that is currently in use - would it be that far out?
People over-estimate the sustainability of short term coding lines per day, versus multi-year coding.
Multi-team development is radically different to a two-person project, especially when it's in production with lots of money depending on it.
You can't rewrite much without the risk of breaking things, so you need a lot more testing. There's a lot of value in the code, so there's more to leverage to add functionality, but the other side is there's more to learn and analyze to efficiently leverage. And when there's lots more teams working on the code, there's more of it and it changes faster than you can expect to fully comprehend; you're continually analyzing and learning.
When I'm off on my own, spiking a new service or library, I can churn out 10k to 20k lines a month; it's very easy when it's greenfield, when there's no team coordination overhead, when you don't need to refactor other people's stuff when you redesign, you don't need to go through full code review cycle, the whole thing fits inside your skull etc. But that doesn't last forever, and it's not business as usual for feature development.
Fixing a gnarly bug might take several days of investigation and end up with a 1 line fix; and delivering that fix might make the difference in avoiding 6 or 7 figures worth of revenue churn. Does that mean it's 10000x less productive?
It sounds like you wrote 20-30 lines per day part-time over 10 years, which is within an order of magnitude of 10 lines per day for the average developer. Sounds about right to me?
I also think that's correct, regarding the order of magnitude, familiarity with the topic... the 10 lines a day (order of magnitude) could perhaps make more sense from a perspective of cognitive overload.
2 people is a pretty small project, and by the looks of it, it probably didn't involve nearly as much coordination as most large business apps. I also imagine you didn't have to do much coordination with project managers, QA, business analysts, architects, other devs, etc.
When I worked on a big project (several million lines of code) I routinely would code less than 10 lines per day.
It doesn't sound like this was done within the context of a corporation, which slows things down a lot. Also, it has 424 open issues right now--is it really bug fixed?
(Also, not a fan of back-jumping 'goto's or putting executables in git repos. And s/supressing/suppressing/g)
You seem like a fine programmer, but I don't think this invalidates MMM.
Recommend me an architecture concept or book. I’m trying to learn how to keep my “velocity” up over time
Edit: you made Sumatra?? I’ve been using that for a decade, thank you so much for creating it! I put it on all my family’s machines, it’s the fastest PDF reader I’ve found
> People misunderstand it. It means writing code, documentation, testing, bug fixing etc.
Are you sure about that?
I don't think so.
I have not read it for years, but my vague recollection of The Mythical Man Month is that 10 lines a day as a full time professional programmer is a reasonable expectation if you are writing IBM mainframe operating systems in assembly language in the 1970s.
It's not meant to be saying that, hey, a programmer is busy with lots of other things like testing, documentation, etc.
Brooks actually meant it, and it was true, at the time, with 1970s hardware, with 1970s development tools, with 1970s collaboration, 1970s compilers/assemblers, 1970s source control, with 1970s level of understanding of computers and software, with 1970s waterfall style project management. And, critically, writing your 10 lines as part of a (relatively speaking) gigantic project for the time. Back then projects did not come much bigger than writing an operating system.
Can you imagine trying to get your bit written of some gigantic project where probably no-one has any idea what the fuck is going on, where the source code is, what version is what, who is working on what and how its all meant to tie together. The miracle is that they could get any operating system at all written. Many, many operating system projects failed entirely in the 1970's and 1980's.
I'm happy to be proven wrong, cause as I say I haven't read that book for a long long time.
10 lines written is low but context also really matters. If you are building a greenfield project then averaging 10 loc a day is not going to work out. However if you are the guy on your team who fixes all the nastiest bugs or cleans up legacy code then you will be writing a fraction of that.
My last company had a code base with X LOC and Y engineers, and it took Z years to write. It was medical software, so it was very heavily tested. It worked out to about 14-20 LOC per day per engineer, although for the first few years the team was smaller so let’s say 28-40 during those years (maybe even doubled once or twice temporarily when the company was founded). The slowness later on made up for the speed early on
In terms of lines of changes total in git merges, multiply it by 10 or more.
edit: updated loc numbers to include some file types that I forgot about
I think today we still have days / weeks like this. But averaging it out over a year you'd probably get more than 10 lines of _code_ per day yeah. Not counting comments.
Say I'm working on something complex - I'll be happy with 10 working lines of code per day. But if I'm making a webpage, I'd damn well hope to get more than 10 lines of HTML/CSS out. Well, HTML at least.
Anyone else feel sad when you remove a bunch of code? All those man-hours it took to write that code, and now I'm deleting it all. I believe it's called sunk cost fallacy. I often wish I had a time machine so that I could go back in time and say, hey, this will all be deleted a year from now, go with the other solution instead.
Nope. One of my first experiences with removing many lines of code was in college. It was a compilers project (we had to hand-write it, no YACC or anything). I had written about 10k lines of code in a flurry of activity, and discovered some really gnarly bugs.
I spent a day with just a pencil and paper, considering each detail of the algorithms and came up with several key insights which reduced the whole thing to about 1k lines of code. The reduction was a combination of C macros (which I wouldn't use today, but I'd use higher-order functions to accomplish the same thing now) and just smart generic code (no special handling of an add operator versus a multiplication operator, they were both binary operators; differentiating the output for each case happened in one place at the end).
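The "no special handling per operator" idea, sketched in Python rather than the original C (purely illustrative, not a reconstruction of that project): the operators live in a table of functions, so the evaluation logic is written exactly once.

    import operator

    # one table instead of one code path per operator
    BINARY_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

    def eval_expr(node):
        # node is either a number or a tuple (op, left, right)
        if isinstance(node, (int, float)):
            return node
        op, left, right = node
        return BINARY_OPS[op](eval_expr(left), eval_expr(right))

    print(eval_expr(("+", 2, ("*", 3, 4))))   # -> 14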
That was when I found out I liked deleting code. I'll happily reduce a code base by 90% if the result is clearer and easier to maintain or extend.
I experience the opposite reaction. I'm elated! Especially when a conditional can be removed/refactored. That's 1/2 as many tests. 1/2 as many possibilities for something to go wrong. As long as the functionality remains the same (or simpler) I love to delete code. Especially my own.
I don't get sad when that code is deleted. I get sad when the people deleting it bash it as being "bad code", especially when they are new hires who don't know any of the context behind why that code is the way it is. Nobody wants to write "bad code", but every code base has some. Thankfully, with time and experience, developers usually mature from "who wrote this trash?" to "this code served its purpose, but we can make it better".
If it's taking 100 lines to do something and I can rewrite it in 10, then that's a lot fewer places for bugs to hide. That's less than one screen of code for the next person to read/comprehend. In most cases it's a win (though I will admit that sometimes comprehension is easier with less concise code).
If the code solved the original problem, as understood at the time it was written, then it was not written in vain even if changing requirements or improved understanding eventually make it redundant.
Nope. I liken it to exploratory surgery. Towards the beginning, I'm pulling everything apart, writing scaffolding to try things out, discovering along the way what's needed and what's not.
At some point, I close the patient, discarding a lot of that. At the end of the day, ideally it looks pretty minimal--just what's needed and no more.
Somewhat, but I feel better understanding that time/code was spent understanding the problem better, and with that knowledge I'm now able to make something cleaner/simpler/more concise.
My company has been experimenting with doing our feature estimations in LOC changed instead of points (1, 2, 3, 5). The general idea being that point estimation can vary between engineers based on ability, but LOC changed should be similar among engineers. This is supposed to make it easier to answer management's favorite question of "How long is this gonna take?". The answer is calculated using a team's historical average LOC/hr rate.
It remains to be seen if our estimates are any better.
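If it helps, the estimation arithmetic described above boils down to something like this (the numbers are invented for illustration, not the company's actual figures):

    historical_loc_per_hour = 4.0     # hypothetical team average from past work
    estimated_loc_changed = 600       # hypothetical size estimate for the feature
    estimated_hours = estimated_loc_changed / historical_loc_per_hour
    print(f"~{estimated_hours:.0f} hours")    # -> ~150 hours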
Over the last 2 months I’ve managed about 3k new loc and 90 classes, so that’s about 60ish lines per work day. I don’t feel like I was that productive though and spent a lot of time refactoring. Eg, last 1k lines barely added any features
What do you do to keep up a fast pace in a big project without throwing quality out? They say TDD increases your speed overall, according to a few case studies I found (15% longer to code, but 40% less bugs, so faster finish times overall etc)
> I don’t feel like I was that productive though and spent a lot of time refactoring. Eg, last 1k lines barely added any features
"being productive === adding features" is a very negative way to think about development, and exactly the sort of mindset that leads to projects that grind to a halt under the weight of tech debt. Good software comes from all the parts of the process, including maintenance of the code base to reduce drag on features you'll write in the future. When you write requirements, do refactoring, test things, write documentation, etc you are being productive. Your future self will thank you for the effort you put in to the non-code work now.
This is the engineering point of view, but in a business productivity is judged from a business point of view.
The goal of software development is to deliver chargeable value to customers.
When you refactor code you do not deliver value to customers and since you have spent time and resources to do that your overall productivity from a business point of view has in fact dropped.
> When you refactor code you do not deliver value to customers..
You do though, by reducing the future cost of delivering features. That has tremendous value. Find a company that sees good engineering as a long term investment rather than a short term way of extracting money from customers and you'll enjoy software development a lot more.
Again that isn't customer facing value. Users do not care about the costs of your business, I cannot understand how so many developers don't seem to understand the basic premise of b2c relationships. At best it's the proposition of future benefits and those benefits are all to the company not the customer.
It absolutely is customer facing value. If shipping this feature 2 weeks later means you can ship the next 10 features in 6 months, and the alternative is shipping this feature today but with so much tech debt that you ship the next 10 features in 12 months, then that is value the customer will see.
Similarly, if your code is so shitty that you're adding bugs to the backlog faster than you're able to fix them, that's a negative for the customer. If instead you spend 2 weeks refactoring so that your bugs+ rate is lower than your bugs- rate, that's customer value right there!
I cannot understand how you don't seem to understand this basic premise of a b2c relationship. Users don't just care about the features they have today, and if your competitor gets to market slower than you but has twice the features a year or two later (due to less tech debt), you're gonna be left in the dust.
I think there's a little confusion around what the 'customer' is here. You seem to believe the customer is the end user. I think of the customer as the person paying for the software. When someone buys some software they usually want it to be reasonably well written, especially if they understand the development process. If you're supplying software to a customer who understands the development process well you'll often find they're more than happy to pay for additional time to do things like refactoring, testing, documentation, etc. The really good customers insist on it and will stop buying from you if you don't include it in the cost.
if there's a feature i want as a customer, i do care if i can get that in two months vs six months vs a year. i also care that it's stable.
reputation and trust are hard to measure, and the effect is delayed. unless you have the killer app or feature, with the internet now, word will get out. many a company has tanked their reputation.
iteration speed also makes you vulnerable to be leapfrogged. Azure vs AWS seems to me to be this. a lot of Azure stuff is great now, some of that is hindsight, but real innovation at AWS is also rare. their developer tools, aka code*, are completely underwhelming. as is cloudwatch. ec2, s3, dynamo, and general lock-in are just enough to keep a lot of people.... for now.
Will you be able to sell more software or increase the price of your software because you refactored some code? No, your customers don't care two hoots about that.
Refactoring is an internal activity. It's part of the cost of development.
> Will you be able to sell more software or increase the price of your software because you refactored some code?
Yes I will, because the cost of development will be lower. That makes my customers happy so they give me more work. I'll also have fewer defects which improves my reputation which means I can charge more.
> No, your customers don't care two hoots about that.
I mostly write software for SaaS companies. Their customers (the end users of the software) don't care about it much besides seeing fewer problems and getting to use better software, but my customers (the people who own the software) really do.
If you're arguing that you might get paid as a contractor to refactor some code therefore refactoring is a productive activity for a business point of view then you have completely missed the point.
Edit:
I'm not arguing that refactoring isn't necessary or important. It is. But it should be kept to a minimum because it really is a pure cost and does not deliver anything of value to end users.
Your customers are the engineering teams of software business. They sometimes need to refactor their code. But their business would prefer they didn't because that's money spent on something invisible to end users (the end customers).
I'm not arguing that at all. I'm saying that all parts of the development process are important and each part adds to what you can charge for and how much you can charge. If you're good at writing software and understand how to create a maintainable, robust, well tested code base then you can get paid for things like refactoring, documentation, testing, etc. Clients understand that writing software is more than just delivering features.
Specifically in my own case, the company I work for writes software for (mostly) software product companies. When we plan what to do in a sprint 'refactoring' is part of that. We literally charge for the time we spend doing it. How long we spend refactoring is agreed by the customer's project / product manager. They understand why it's necessary and important, and they want us to do it.
In a lot of places, quality is not a priority. Or even, it can be an anti-priority: First you get rewarded for producing prodigious amounts of crap code, then you get rewarded again for the heroic efforts it takes to fix that crap code.
That book was written in the seventies, 45 years ago. It still has valid points but obviously a few things have changed in terms of how we do things. For example, we use version control systems these days and have sophisticated refactoring tooling.
But one of the things that hasn't changed is that we haven't really come up with any better metrics than time spent and nr. of things changed per time unit. There are a lot of people complaining these things are not representative (for the last five decades) but not a whole lot of people coming up with better productivity metrics and even fewer that did the leg work of actually validating their work in something that might pass scientific scrutiny (which is hard). If you do work in this space, you'll find yourself citing articles that are decades old.
These days, I tend to look at the activity statistics on GitHub projects when joining a new project. It tells me at a glance who the movers and shakers on a project are, in terms of lines of code added/removed, number of commits, and the time distribution of those commits. It's not perfect and doesn't tell you the complete story, but it's rarely completely wrong. Usually it's quite easy to spot patterns like weekends, Christmas, and vacations, and that people tend to come back energized from a good break (a little spike in productivity).
Also these numbers confirm the notion of a 10x programmer: a handful of people tends to be responsible for the vast majority of commits and line changes. Not all software engineers are created equally. Diffstats on pull requests tell me a lot as well; though of course these numbers are inflated by e.g. refactorings (more so on languages that have proper tools for this). But refactoring is a good thing and combined with code reviews tell you a lot about the quality of engineering.
Individual engineers don't and never did; managers/planners do (at least the more competent ones) and always did. There's a difference between running a bunch of primadonnas and running an engineering organization that gets things done on a schedule & budget consistently. It generally doesn't involve just winging it with some full stack ninjas and hoping for the best and tends to involve lots of planning, KPIs, metrics, resource planning, etc.
OKRs are a thing in tech circles. John Doerr wrote a book on it. My company rolled it out and most people struggle with coming up with meaningful metrics for their job.
We added OKR junk at an agency I worked at. For developers our options were:
1) use silly, useless metrics that are of a sort management will accept anyway,
2) uselessly tag along with initiatives in areas that are easier to measure (sales, marketing kind of though their metrics are still usually bad, just no-one cares),
3) start a year ago gathering data for a baseline for bad development metrics,
4) start 2-3 years ago gathering data for good development metrics, though they’ll probably still be pretty limited and narrow.
We picked 1 and 2 of course. What a waste of time. I wish anyone who wanted to be more than a line worker anywhere had to answer some basic questions about games and measuring things. Not just in development, managers and directors everywhere are, on average, terrible at it as far as I can tell.
We should probably be using the absolute value of lines of code changed, so abs(LOC delta) as the metric, or something like a least-squares mean for estimating the moving average of LOC per day.
Anymore, my daily average LOC is probably negative since I tend to rescue floundering projects. I usually avoid object-oriented (OO) programming whenever possible, as I've found that functional one-shot code taking inputs and returning outputs, with no side effects, using mostly higher order functions, is 1 to 2 orders of magnitude smaller/simpler than OO code.
Also I have a theory that OO itself is what limits most programs to around 1 million lines of code. It's because the human mind can't simulate the state of classes with mutable variables beyond that size. Shoot, I have trouble simulating even a handful of classes now, even with full tracing and a debugger.
I'd like to see us move past LOC to something like a complexity measurement of the intermediate code or tree form.
And on that note, my gut feeling is that none of this even matters. The world's moving towards results-oriented programming, where all that matters is maximizing user satisfaction over cost of development. So acceptance test-driven development (ATDD) should probably be highest priority, then behavior-driven tests (BDD), then unit tests (TDD). On that note, these take at least as long to write as the code itself. I'd even argue that they express the true abstractions of a program, while the code itself is just implementation details. Maybe we should be using user stories implemented per day as the metric.
My project at work is 60k LOC, developed over the course of 3 years. It's in production and works quite well. I wrote it all by myself. My hobby project is 100k LOC (2.5 years of development in free time). Both are UI + service code in C++. I code several hours a day. Maybe 3-5 on average.
10 LOC/day is ridiculous. Think about Brad Fitzpatrick, Fabrice Bellard, John Carmack. They would never accomplish anything like they did with those 10 LOC.
You have to have dedication and really good text editing skills. Being smart is nothing if you can't write code fast enough. Good skills with tools like shell, debugger, version control are important as well.
Another problem is that dev collectives these days tend to bully and punish those with higher performance. There are several reasons for that: 1) most devs do an absolute minimum just not to get fired; 2) job security is a thing - you won't make your project look too simple or complete, as this might end your contract; 3) at least 90% of hype is from idiots and by idiots, and idiots are a heavy tax on any engineering; 4) frameworks, tools and methodologies are often designed for a different scale; 5) ceremony and overcomplication of development processes, treating devs like anonymous, interchangeable units.
I'm male in my 40s with a CS degree. I work from home most of the time.
Guys, can I be honest? I have never actually met anyone who has worked in a LOC-optimized company. This stuff seems like the outrage porn of software engineering.
Agreed. Worse, I have yet to see anyone give better estimates than someone who could give a rough "good faith" LOC estimate.
That is, honestly list out roughly what code the naive approach would touch or generate, without assuming anyone will game the metric. If you can bring yourself to do that, you can probably give better estimates than you'd expect (rough sketch after this comment).
Instead, we seem to constantly look for ways to cheat the metric, with no real reason to do so, other than to push the cheating/gaming into a harder-to-measure area?
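A back-of-the-envelope version of that good-faith estimate; every component name and number here is invented for the example, and the only real input is your own historical rate:

    # List the parts the naive implementation would touch, guess their rough size,
    # and convert to days using a whole-project historical rate.
    ROUGH_LOC = {
        "db migration + model": 300,
        "service endpoint": 400,
        "UI form + validation": 600,
        "tests": 700,
    }

    HISTORICAL_LOC_PER_DAY = 50   # whatever your own past projects actually averaged

    total = sum(ROUGH_LOC.values())
    print(f"~{total} LOC, ~{total / HISTORICAL_LOC_PER_DAY:.0f} working days")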
You misunderstand the intention. You don't optimize for LOC; you develop software as best you can. You measure - among other things - LOC, and this gives you at least some hard figures in the fuzzy software-engineering universe. See it as the Hubble constant - it gives you answers about the universe, provided you know its value.
Then, when it comes to understanding your software costs, it helps you put some numbers to features. Yes, it is a dark art, but so is all other financial magic. When it comes to maintaining or re-engineering software, LOC and past numbers can be useful, but they are not the only determinant of future development costs. There is the agile backlog / planning-poker school of thought, which is certainly an improvement and valuable for running the project, but for large-scale software projects it is not an answer I would like to rely on when the project needs a price tag before day one.
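For that before-day-one price tag, the traditional route is a parametric model driven by an LOC estimate. Here is a minimal sketch using the Basic COCOMO "organic" coefficients (2.4 and 1.05, from Boehm); the project size and the cost per person-month are invented for the illustration:

    def basic_cocomo_organic(kloc: float) -> float:
        """Estimated effort in person-months for a small, in-house ("organic") project."""
        return 2.4 * kloc ** 1.05

    estimated_kloc = 40            # assumed size of the planned system, in thousands of LOC
    cost_per_person_month = 12_000 # assumed fully loaded cost, in your currency

    effort = basic_cocomo_organic(estimated_kloc)
    print(f"~{effort:.0f} person-months, ~{effort * cost_per_person_month:,.0f} total")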
It is one metric. If you work in any company purely run on metrics - if you ask me - run once you see a better place. If you work in any company not measuring what it does - run now!
I'm talking about all of these comments on this page with people talking about how LOC/day is not a good metric for productivity. So? No one's tying your wages to LOC/day.
> I have never actually met anyone who has worked in a LOC-optimized company.
Not only that, but the higher-level concept I took away is that, given the size of his features (~6k loc), it takes about 75 days to write a feature. Assuming those are workdays, that's 15 weeks, a little over a quarter. And, indeed, tech companies do seem to measure their process and projections in quarters, and everything tends to slip to the right a bit, on average.
Also, the code coverage numbers line up with other estimates I've seen: life-critical systems should have upwards of 95% code coverage. Standard project: 85-90% coverage.
But:
-As a 1-man-band I also do the support, documentation, testing, website, marketing etc. So coding is only part of what I do.
-I don't think the article defines what a LOC is. Does it include headers? Does it include auto-generated code?
I wish I had a job where writing 10 LOC per day would be enough. I wrote a quarter of a million LOC at Google alone, and I wasn't the most prolific programmer on the team, not even close. I've written about as much code since I left Google, too. And it's not Java, where your IDE has to crap out 100 lines of boilerplate before you even begin to do anything meaningful. This is mostly C++, and in the last few years Python as well.
The 10 LOC/day figure covers the entire project: analysis, requirements, design, documentation (user and technical), testing, management, etc., on a large project. Saying (as many do) that an individual programmer has produced hundreds of LOC per day is to miss the point.
You have to start by defining your terms of reference and make sure you are comparing like with like.
If you are doing a good refactoring, you may have a massive negative LoC at the end.
LoC added or removed is not a very good metric for anything.
Except that when I see a big negative number on a PR, it is usually a great thing. I still have to check what was removed, but it usually means that things got easier to understand.
I'm not sure I add 10 lines of code, on average, to our products a day. Most days are probably negative, to be honest.
Of course, any time I have to add a new Windows Forms dialog to our install/config utility, there's a couple thousand lines of code-behind added to the repo...
I once had to work with some outsourced developers who would do "something(); something(); something();..." rather than "for(int i = 0; i < 10; i++) something();" when instructed to repeat something X times.
It's called loop unrolling and improves performance by eliminating loop control overhead and reducing branch mispredictions. Maybe they had a compiler background :)? /s
If they had, they would know that virtually any compiler can do this on its own.
I also know these examples, and I know the people who write them. "Copy & paste" coding is just more convenient for them than writing a loop and fiddling with brackets, indentation or whatever. They don't produce high-entropy code.
Does anyone else question The Mythical Man-Month? If I accept its basic assumption, then I also have to accept that NYC is a myth - there is no way a city that size can possibly function. Let's start out with ten people in NYC and add ten more people .... My own belief is that the MMM is an apologist's view of how we currently collaborate to develop software and products.
What do you question about it? The point of the specific essay was that the work of development/engineering cannot be sped up linearly by adding people. You still see this attitude with managers today: The team of 5 is running behind, let's put that other team on the same project, now it's 10 people. But those 5 new people:
1. Don't know the code base or problem domain. So they'll spend months getting up to speed.
2. Will increase the communication overhead (now each person has to coordinate with up to 9 others, not just 4).
On the small scale of this example, you may see good results. Five more people isn't a huge amount of communication overhead, and if they're experienced (perhaps they even formerly worked on the same or a similar project), then you'll see an immediate drop in productivity followed by a quick rise to a new baseline. But will that hold with another 10 people? 20 more beyond that? Each doubling will not continue to halve the time to develop the project; there are limits, and at some point the returns turn negative. The additional people will slow it down: not just the initial slow-down, but the new baseline after things settle will also be lower than before adding them.
For me the main point of MMM was that big things take time, and putting more people on a project does not always speed it up. The communication overhead grows quadratically (n(n-1)/2 pairwise channels) as the number of people increases, while productivity grows at best linearly.
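A two-line illustration of that growth, just evaluating n(n-1)/2 for a few team sizes:

    def channels(n: int) -> int:
        # Pairwise communication channels in a team of n people.
        return n * (n - 1) // 2

    for n in (2, 5, 10, 20, 40):
        print(f"{n:>3} people -> {channels(n):>4} channels")
    # 2 -> 1, 5 -> 10, 10 -> 45, 20 -> 190, 40 -> 780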
Large organizations function poorly in many respects. They're also the only way to get certain things done. It often surprises me that large companies manage to coordinate their activities into useful output at all. On the other hand, there are clearly inefficiencies that don't exist in a 10-person startup.
Am I mad? Did anybody actually read the article? Why is everybody piling on LoC/day when the article is about average LoC over a period of time, how it correlates to the kind of work you're doing, and what a realistic average LoC might look like for a sane project? From this point of view, it doesn't matter at all if you deleted 20k lines and have a negative LoC for the day. That's missing the forest for the trees.
C’mon man you should know better than that by now. Here on HN you only read the headline. Nobody actually reads the article. At most you read a few comments to get the gist of what people think the article might be about.
If you're pushing up 0 lines of code on a day when you had no meetings or interruptions, and you aren't working on something truly novel and near-impossible, you took the day off on the company's dime. Everybody you work with noticed, and if you do it regularly, they're just waiting for everyone to reach their breaking point with you so they can push to get rid of you. Sure, you'll find a new job and do it again, but you still won't have anybody's respect.
This is why people push the most difficult tasks to the most junior developers: a dev struggling to do the 'impossible' looks exactly the same. That way the others get to protect themselves from blame.
> This is why people push the most difficult tasks to the most junior developers.
Who does that?
The correct thing is to manage expectations and then do the work. Unfortunately, there are many lazy developers (or people in any sort of job) who won't be productive with even the simplest tasks.