Size is the best predictor of code quality (2011) (vivekhaldar.com)
232 points by crapvoter on Nov 11, 2022 | 188 comments



I've come to realise that the size of a code base measured by its features and components is very different to the kind of "size" you might measure by line count.

It's generally a good thing to write code which takes up more space if it makes it clearer what's going on. However, the size of a codebase in terms of its features and components is almost always directly correlated with the number of bugs you're likely to find and the difficulty you'll have maintaining it.

A problem I have at my current place is that the dev team, in my opinion, is far too happy to add new features whenever they're requested. You know those kinds of requests that go something like, "I basically want something that does exactly the same as [x] except with this one small difference". When that happens a few times it's fine, but when it happens 100 times you're suddenly managing a code base which is many times more complex than it needs to be, for very marginal benefits.

And you might argue, "well why don't you abstract more". And that's a lesson that took me about a decade of software engineering to learn - if you can avoid abstraction, you should. Writing $a = 2 + 2; is far easier to understand than $a = math.sum($num1, $config->get('num2')); The need for abstraction should be seen as a code smell. That's not to say you shouldn't abstract when it's the right thing to do, but if you find yourself needing to create abstraction on top of abstraction, your feature set is probably needlessly complex.
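
To make that concrete, here's a tiny TypeScript sketch of the same contrast - the helper names are made up, it's just to show the indirection:

    // Hypothetical helpers, defined only so the example runs; the point is the indirection.
    const MathUtils = { sum: (a: number, b: number): number => a + b };
    const config = new Map<string, number>([['num2', 2]]);

    // Direct: the value is obvious at the call site.
    const direct = 2 + 2;

    // Abstracted: the same value, but you have to chase two layers to learn it's just 2 + 2.
    const abstracted = MathUtils.sum(2, config.get('num2') ?? 0);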

A good engineer in my opinion isn't one who can build things well, but one who is willing to say, "no, I'm not building that at all" and even remove features that are being used if the cost of maintaining those features isn't worth the benefits they provide.


This.

The best codebases I’ve seen have an abstraction layer at the foundation and the rest is just dumb repetitive code built on top of that layer.

It’s tempting to look at that and think we can abstract it out a bit more, but then you cross the line pretty quickly and your abstractions now mean you have 20 parameters to control how the abstraction should work.
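 
For the record, the failure mode looks something like this - a made-up TypeScript sketch, not any real codebase:

    // The over-generalised version: every new use case grew another knob.
    interface RenderOptions {
      sortable?: boolean;
      filterable?: boolean;
      paginated?: boolean;
      pageSize?: number;
      exportable?: boolean;
      inlineEdit?: boolean;
      // ...a dozen more flags, each meaningful to exactly one caller.
    }
    function renderTable(rows: string[][], opts: RenderOptions): string {
      // Simplified body; in the real thing every flag adds a branch in here.
      const limit = opts.paginated ? (opts.pageSize ?? 10) : rows.length;
      return rows.slice(0, limit).map(r => r.join(' | ')).join('\n');
    }

    // The "dumb repetitive" alternative: each caller writes the two obvious lines itself.
    function renderPlainTable(rows: string[][]): string {
      return rows.map(r => r.join(' | ')).join('\n');
    }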


I think the main problem is "pre-abstracting": "this looks like it might be used often, let's write a quick layer of abstraction on top of it so there is less repetition". Then you end up calling it once, maybe twice in the code, and need to bounce through 3-4 files to even see what the code is actually doing. It also makes anyone new to the code go through the same dance.


I agree with you to some degree, but I've built an abstracted data layer in literally every project I've worked on since I learned it was a good practice around 20 years ago and I still have yet to be on a project where we actually needed to replace the database. I've moved plenty of databases from on-prem to PaaS, but never switching DB technology itself through an abstraction layer.

That's not to say I haven't had to replace data storage, but they were never built as layers and inevitably it happens when moving from mainframes to more modern systems. At this point at least some of my abstraction feels like a waste of time. Maybe when we start moving to Quantum Databases someone will benefit from my extra due diligence from a decade ago.


I don't think building an abstraction over the data layer is just to make it easier to change your database. The abstraction can define the data access patterns upfront and minimize bespoke data access code. I've spent a lot of hours trying to untangle ORMs and other data access code from business logic. I've found that extending an abstraction to support a new access pattern (that you really need) is much easier.
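
As a rough illustration (TypeScript, hypothetical names, not any particular ORM), the kind of abstraction I mean is one that names the access patterns you actually use:

    interface Order { id: string; customerId: string; total: number; }

    // The repository spells out the supported access patterns up front.
    interface OrderRepository {
      findById(id: string): Promise<Order | null>;
      findOpenOrdersForCustomer(customerId: string): Promise<Order[]>;
      save(order: Order): Promise<void>;
      // A new access pattern you genuinely need becomes one new method here,
      // instead of ad-hoc query code scattered through the business logic.
    }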


This is one of the bigger discussions I've been involved with lately; and I agree it really comes down to "bespokeness".

When you have a 'system' which is 500 LOC, and 10 usages of that system totalling 10 lines each, compared to 10 bespoke implementations of 50-100 LOC each... you get into a spot where the system is battle-hardened because of all the users, while the bespoke implementations have issues all over. Then when someone goes to add the 11th use-case, they are going to learn from and copy/paste the other version, probably exposing some latent bug in the bespoke version.


I don't think MS built Entity Framework so that you can switch away from MS SQL Server in any other way than "potentially".


IMO, it's a lot like building a house.

It's okay to have someone else manufacture the bricks for you. However, you still have to lay them by hand. It's tedious, repetitive, and labor intensive, but gets you the exact result you want.

The second you try to pre-cast bricks, you end up with something harder to work with.


There is a much simpler reason big code bases have fewer bugs. The size of the codebase is correlated with the investment a company is making into it. A larger code base will have seen many peer reviews from different engineers over its lifetime and, as it is likely one of the company's core products, a lot of testing, both internally by the test team and externally by customers.

On the other hand my hacky script, put together on a sunny summer afternoon, has zero love from senior management as they probably don't even know it exists.


> A good engineer in my opinion isn't one who can build things well, but one who is willing to say, "no, I'm not building that at all" and even remove features that are being used if the cost of maintaining those features isn't worth the benefits they provide.

I'm not sure what domain you work in, but in mine, the feature set is determined by the business requirements. E.g., if you're writing an app, you can't say to the client, "Don't add this potentially revenue-generating feature because it'll introduce bugs." What the article is suggesting is that writing the same feature set with less code is likely to lead to a more correct implementation.


TL;DR: the key is to realise that you wouldn't be denying them a revenue stream. You would help them improve it. Look at it from a broader perspective.

If you say "You've projected so-and-so much revenue from this feature. Our projection is that its implementation will cost this, future maintenance that, and it will introduce bugs which will lose you this much revenue and cost this much to fix. We propose you change the spec this and that way, and you will get out of most of the drawbacks while still retaining most of the benefit"

any sane business would not go "shut up and go home" but rather engage "okay, but what about sub-requirement xyz -- did you consider that in detail?" and from there on you can create something amazeballs together.


Either way you’re still fulfilling business requirements. If there’s a more efficient way to serve the same goal, you’re talking optimisation, and that absolutely aligns with a smaller code base (again the actual point of the article), but not ‘fewer features’.


Well, sometimes it's better for the business with a feature present, sometimes with the feature absent, sometimes with the feature exchanged for another that's even better. It all depends on the feature in question and the business context.


And to my original point: that a business requirement adds more potential bugs is not sufficient reason to avoid implementing it. The only sufficient argument is that it doesn’t provide the product owner as much value as it costs to implement and maintain, but rarely is the developer in a position to judge that - they just provide the cost estimates and the PO can decide if they want to wear it or not.


You rarely see business requirements. You see an interpretation of them. Your job as an engineer is to understand the actual business requirements so you can question the interpretation of them and then simplify it.

With that said, many requirements are technical, and there you have even more freedom.


> You rarely see business requirements. You see an interpretation of them.

At the end of the day, you answer to the bottom line. If you can achieve what is needed of your program (i.e., the business requirements) with less code, it seems you're less likely to have as many bugs - the point of the article. I think a good choice of expressive language/framework can pay dividends here, more so than quibbling with the product owner over what functionality should go in.


It’s the developer’s responsibility to question the feasibility and risk of the client’s business requirements. The ones in charge will of course make the final decision in the end anyway, but the flag should at least have been raised.


I think you misunderstand, some stuff has to go in but it is absolutely necessary as a dev to push back[1] on bad requirements, or at least that's my take.

[1] admittedly in some cases you can't for various reasons, including bosses that won't listen.


I probably could have been clearer about that.

I think as an engineer it's your job to do a bit of cost / benefit analysis on what you're being asked to build and at a minimum provide a rough estimate about the time / effort it will take to build and maintain so everyone is on the same page. And sometimes I think it is your job to argue that something shouldn't be built at all.

I guess a good example of this would be at the place I worked a few years back... When I first got there, one of the complaints I would hear a lot from the business was that the site was too slow and that every time they asked the dev team to improve it, they only managed to make limited improvements.

So I took a look into it and realised that the main issue was that we had something like 20 marketing scripts running on the site. When I started asking why we have [x] script, someone would always provide me a valid reason: "Oh, Sally and her team use that for [y] reason". Okay fine, so then I ask Sally if we can remove it, but she obviously says no because this specific tool best suits her team's needs.

You could argue no single dev was really to blame here. At each step along the way they were just adding a single script to fulfil some business requirement, and as a team the devs couldn't reduce the number of scripts to improve performance because they were all deemed essential by the business.

But I think this is exactly where a good engineer would start pushing back against the business. Given this was a large corporate, I spent a few weeks escalating the issue higher up the business until I finally got authority to start removing scripts and was able to put in a process for requesting new ones to be added. Imo the main issue here wasn't even the scripts themselves but that the dev team just didn't have enough authority over what they were being asked to build, so they'd knowingly build crap because that was what the business was asking of them.

Me pushing to remove the majority of these scripts and give the dev team more authority pissed a lot of people off. Several teams were arguing that it was going to have an impact on their performance, and in some cases that they couldn't do their job at all. Higher up in the business, people were pissed because now the dev team had started saying no to them for the first time ever, but over time the benefits of this became obvious to most people. Site performance improved dramatically, the dev team were spending way less time fixing bugs, we were able to prioritise dev tasks like upgrades, etc. And eventually a few people even thanked us lol.

I guess I'd never actually say, "no I'm not building that at all" - this is just how I feel a lot of the time. I think it's more about explaining costs and trying to get everyone onboard that something shouldn't be built when it doesn't make sense.

> if you're writing an app, you can't say to the client, "Don't add this potentially revenue-generating feature because it'll introduce bugs."

See this actually worries me - why do you feel like that? If you think what you're being asked to do could introduce bugs and you're not convinced about how much revenue it will generate, then you probably should be having a conversation with the client about whether it should be built in the first place imo.


>See this actually worries me - why do you feel like that? If you think what you're being asked to do could introduce bugs and you're not convinced about how much revenue it will generate, then you probably should be having a conversation with the client about whether it should be built in the first place imo.

I feel like I'm repeating myself a lot so I'll point you to another comment:

https://news.ycombinator.com/item?id=33573035

> And to my original point: that a business requirement adds more potential bugs is not sufficient reason to avoid implementing it. The only sufficient argument is that it doesn’t provide the product owner as much value as it costs to implement and maintain, but rarely is the developer in a position to judge that - they just provide the cost estimates and the PO can decide if they want to wear it or not.

It's more an issue of competent estimation and communication than gatekeeping. If the complexity costs have been adequately communicated by the technical staff to the product owners/managers/stakeholders/etc, and they insist on decisions that both add significant maintenance cost and, by your humble measure, aren't positioning the product better, it's time for you to move on to a different company/client.

> So I took a look into it and realised that the main issue was that we had something like 20 marketing scripts running on the site. When I started asking why we have [x] script, someone would always provide me a valid reason: "Oh, Sally and her team use that for [y] reason". Okay fine, so then I ask Sally if we can remove it, but she obviously says no because this specific tool best suits her team's needs.

Not sure this is in the scope of what we're talking about. Intra-organisation conflicts on business requirements don't mean that business requirements don't have primacy here; it just means those requirements were badly specified and badly maintained. Perhaps a faster website was a greater requirement than the scripts were for the marketing department, but the business requirements for marketing haven't gone away, and really an alternative, performant system should have been built in their stead to enable marketing to do their job well again - again, if it's decided that this provides value for the business. And the developer doesn't really have the authority to speak on whether it does or not.


> A problem I have at my current place is that the dev team in my opinion is far too happy to add new features whenever they're requested.

> And you might argue, "well why don't you abstract more".

A place I work was acquired by a company that owned our main competitor. When we were comparing them, I used to joke that the team had spent 10 years saying "yes!" to literally every feature request.

They of course did it with abstractions, so it "wasn't that much more complicated".

Feature-for-feature, our software could do only maybe 15% of what theirs did, while theirs could do 95% of ours. The difference was that the features we picked were the only ones 80% of the customer base actually used.

Worse, because everything was abstracted, setting up those simple use cases was usually an order of magnitude more complex for the users: something that was a checkbox in ours was 15 different toggles and inputs in theirs, and very easy to screw up. Sure, it could also be set up to handle hundreds more scenarios... but in reality there were just a few that only a tiny fraction of users even cared about.

That, to me, is the danger of abstractions - at least if they make it up to the UI layer. QA (and automated testing, if it was written) still has to test the hundreds of scenarios because they're possible. Support has to understand them. Dev has to maintain them.

It was interesting to compare the approaches, at least, and I came away with the conclusion: I never want to build that way. I'm all for internal abstraction: build the capability and flexibility to expand in, but have a very concrete and constrained UI that only exposes what actually has real value. And make it dirt simple to do the most popular use cases.
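
Something like this, as a rough sketch (TypeScript, made-up names): flexible internals, one dumb-simple entry point for the common case:

    // Internal abstraction: all the knobs live here, not in the UI.
    interface ExportOptions {
      delimiter: string;
      includeHeader: boolean;
    }

    function exportTable(rows: string[][], header: string[], opts: ExportOptions): string {
      const lines = opts.includeHeader ? [header, ...rows] : rows;
      return lines.map(cols => cols.join(opts.delimiter)).join('\n');
    }

    // Public surface: the one checkbox-equivalent that 80% of users actually want.
    export function exportCsv(rows: string[][], header: string[]): string {
      return exportTable(rows, header, { delimiter: ',', includeHeader: true });
    }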


I could die a happy person if I never have to hear programmers use the word “abstraction” again.

Exposing more switches is the opposite of an abstraction.

> , but have a very concrete and constrained UI that only exposes what actually has real value.

This is an abstraction.


Funny, everything you wrote can be applied to functions and classes too. Almost like software development can be summarized as "the balancing of WET and DRY".

For me, I follow the principle of three: if it's repeated three times with minor tweaks, I should probably make some kind of abstraction for it. I try not to start with abstractions, as they tend to distract from solving the initial problem.
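
A throwaway TypeScript example of the rule (the names are made up):

    const baseUrl = 'https://api.example.com';

    // Two near-duplicates: live with the repetition for now.
    const userUrl = `${baseUrl}/users/u1?fields=id,name`;
    const teamUrl = `${baseUrl}/teams/t1?fields=id,name`;

    // The third occurrence with the same shape is the signal to extract.
    function resourceUrl(kind: 'users' | 'teams' | 'projects', id: string): string {
      return `${baseUrl}/${kind}/${id}?fields=id,name`;
    }
    const projectUrl = resourceUrl('projects', 'p1');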


WET has negative connotations, so people have used AHA instead, for "avoid hasty abstractions".


TRY: Try Repeating Yourself (as in, "Try repeating yourself first, and if that works, then good! If it doesn't, then okay, fix it then—but not before trying first.")


> It's generally a good thing to write code which takes up more space if it's clearer to understand what's going on.

Eskil Steenberg has talked about *width* of code as a more likely quality indicator.

Consider the difference between:

    func(int i, void* p)
And:

    modulename_objectname_operation(int count, void *ptr)
Reference: https://youtube.com/watch?v=443UNeGrFoM


"I think C is the greatest language, for all the reasons other people hate it"

I'm sorry, nothing intelligent can come after a statement like that, and I'm not wasting 2 hours to verify that claim.


Your loss. It's one of the best intermediate C resources on the internet.


> No I'm not building that at all

My guess is most devs learn this the hard way. You are liable for bugs in code you write, regardless of project quality and how much you complain to management about it. Apart from the examples you describe, I'd like to also emphasize cognitive load VS size/complexity. After a certain threshold, you shouldn't add more features but become a maintainer/tester of the ones you previously added (until you let go of responsibility/ownership).


> However the size of a codebase in terms of its features and components is almost always directly correlated to the numbers of bugs you're likely to find and the difficulty you'll have maintaining it.

Fully agree. In addition, I find there's also the time component. Because requirements for existing functionality change over time, both code and data become inaccurate and a mess if they are not refactored. This leads to more bugs and to less trust in the data model (e.g. when can something be null).


> I've come to realise that the size of a code base measured by it's features and components is very different to the kind of "size" which you might measure by line numbers.

How different is it, really? Any time I've done the research, it turns out line count correlates pretty strongly with feature complexity. Not perfectly, of course, but given how cheap line count is to produce compared to a quantification of feature complexity (remember function points? that don't even consider complex interactions between features!), it's a pretty damn good metric in my book.

(Fun thing you can do: first try to estimate how much a software solution you're intimately familiar with has grown in feature complexity over the past year (about 60 % for one project I work with), then look at how much its line count has increased over the same period (exactly 79 %). The previous year I had a project with a 30 % × 15 % pair. The numbers won't match exactly, but they will most likely be in the ballpark. You won't find a project with no complexity growth double its line count, or vice versa.)


> no I'm not building that at all

Oof. I work in finance and if you say that to a trader/business manager you'll be out of a job. Rightfully so.


Indeed, "I'm not building that" doesn't sound good. You can of course discuss the True Cost (TM) of an impractical feature request with the client, and propose a less expensive alternative.


The magic statement is "it will cost this and that in immediate and ongoing cost to keep such a thing in the codebase".


I really think that the best metric is size, as in lines of code, like the article seems to suggest. Personally I may even be tempted to use bytes.

For obvious reasons, the number of features correlates with size, but I also noticed that bad code tends to be too big for the little it does: dead code, repetition, rewriting functions that are already available because you don't know your API, etc...

Note that it is a metric, not a goal. Making your code smaller won't necessarily make it better (ex: minification), but big code is more likely to be bad and successful efforts to make it better will most likely make it smaller.


The code is as complex as your business is. If your clients want these tweaks, you cannot get back to your sales department and say "no, we will not do it, it will make our code base too complex". You either introduce some abstractions, or lose that $Nmm contract.


> And you might argue, "well why don't you abstract more". And that's a lesson that took me about a decade of software engineering to learn - if you can avoid abstraction, you should. Writing $a = 2 + 2; is far easier to understand than $a = math.sum($num1, $config->get('num2')); The need for abstraction should be seen as a code smell. That's not to say you shouldn't abstract when it's the right thing to do, but if you find yourself needing to create abstraction on top of abstraction, your feature set is probably needlessly complex.

On a technical level, I fully agree with this. Someone recently asked me why I didn't use an overengineered abstraction that they made for a feature request, which was not only complex, but also had no documentation or other usage examples. Actually even discovering that it was there wasn't something you'd naturally do when working on that codebase, given that the discoverability isn't exactly excellent and people don't communicate that much.

Explaining to them that using it would have slowed me down, just as working with some of their other abstractions has in the past (many of which weren't precisely the right abstractions and needed further refactoring to be useful), was an uphill battle, despite the fact that the current implementation without the abstraction does what's necessary, works without issues in prod, and is simple to refactor or throw away when necessary.

In the end I simply conceded to do some refactoring and use their abstraction... but probably much later, since I have other things to do. I find myself caring less about the codebase and intricate mechanisms in it with time, because on a technical level it's boring and there really is little to no value added by bunches of complexity in it.

Of course, there are a few cases where abstractions make sense, but in my experience developers (myself included) love to overengineer things and feel clever - they build their own castles atop big balls of mud.

> A good engineer in my opinion isn't one who can build things well, but one who is willing to say, "no, I'm not building that at all" and even remove features that are being used if the cost of maintaining those features isn't worth the benefits they provide.

In regards to features, however, in many environments out there you won't get the ability to say "no".

Someone who uses whatever the system you're developing is will have some requirements. A business analyst will turn those into a user story with acceptance criteria. You'll have to implement everything for it to be considered resolved, otherwise QA will throw the ticket back at you and indicate that some of the requirements haven't been met.

I'm not saying that that's how we should build software, but rather that in many places that is simply the reality of building software and it's not like you can always quit and work elsewhere, when you need the money (not every place is as well paid as the US) or when the development culture is like that in the entirety of your country/time zone.


I agree with you except on the topic of abstractions.

Everything in software is an abstraction. Source code is an abstract representation of the running program. Functions are abstract representations of various executions that differ only in their parameters. Classes are abstract representations of stateful concepts, of various instantiations that vary only in the information each instance contains. Etc.

You cannot "not use" abstractions when you program, you can simply design poor abstractions or design good abstractions. And a lot of people underestimate the second degree impact of even the smallest of abstraction design choices, like what is and isn't factored into its own function, what parameters the function takes and returns, what classes we have, how big they become, etc.

And while saying no to new use cases that are almost the same but slightly different is a good way to avoid needing to abstract more things (or avoid needing more code in general), and good engineers push back when necessary here, you will eventually have to deliver what the users want. You cannot always convince them that all users want the same exact behavior as all other users; sooner or later there will be good business justifications for the differences.

That's when if you don't introduce good abstractions to tackle the problem, you'll end up in an unmaintainable mess which will turn your velocity to a crawl.

But it's important to understand abstractions have to be designed in tandem between product and tech, because how the user feature is exposed is also an abstraction, both the feature abstraction and the tech abstractions have to be well designed and must help each other out to remain manageable and coherent and work well with other features.

In my opinion this is where really great engineers distinguish themselves: they're the ones who can figure out the good way to abstract that delivers that business's international expansion, where each region has slight variations yet feature parity can be maintained across regions - an example of when you won't be able to push back and the abstractions will become critical to success. Normal engineers brute-force their way through this slowly, each new region taking longer to expand to than the previous one, with new features becoming impossible to launch globally anymore. Great engineers, meanwhile, figure out the right set of abstractions alongside the product to enable fast expansion, with judicious configuration/plugins to handle per-region variations, etc. They invest in tooling, frameworks and platforms that create those kinds of abstractions, and they have the good instinct around balancing tradeoffs to deliver simple yet powerful abstractions that actually make you faster and the system more maintainable.

Bottom line, I find abstractions are critical to success, saying to avoid them I believe is misleading, you cannot avoid them, you need to get good at them instead.


> And a lot of people underestimate the second degree impact of even the smallest of abstraction design choices

This is one of the hardest lessons I've had to learn as an engineer. Building extensible abstractions upfront is extremely difficult to achieve without building a lot of abstractions upfront that aren't extensible, and learning why.

> you cannot avoid them, you need to get good at them instead

I'm working on it!


> You cannot "not use" abstractions when you program, you can simply design poor abstractions or design good abstractions

While I agree with the comment as a whole, I think a part of the argument is about not introducing unnecessary abstractions which might couple code that might be better off remaining separate.

As another commenter put it:

> The best codebases I’ve seen have an abstraction layer at the foundation and the rest is just dumb repetitive code built on top of that layer.

https://news.ycombinator.com/item?id=33569197

Suppose that you're building a system for supporting a sales process and keeping track of bunches of data. And based on some BPMN diagrams, you might have multiple different sales processes to keep track of: A, B, C. At a surface level, all of them deal with some company receiving money in exchange for goods and services, but the particulars of each process (the steps involved, the documents, the state transitions etc.) are going to be different. What's worse, they're going to change with time and evolve separately.

So, the aforementioned and boring architecture would be a bit like this (and might get created across 5 years, as new processes need to be supported):

  SalesProcessAResource/View (REST/UI) <--> SalesProcessAService <--> SalesProcessARepository <--> sales_process_a_schema
  SalesProcessBResource/View (REST/UI) <--> SalesProcessBService <--> SalesProcessBRepository <--> sales_process_b_schema
  SalesProcessCResource/View (REST/UI) <--> SalesProcessCService <--> SalesProcessCRepository <--> sales_process_c_schema
Each process might have a separate schema with different tables, different services and repositories and resources for them (not unlike microservices, but assume that this is within a single codebase, maybe modules). There would certainly be overlap which wouldn't be ideal and in the first iterations those might seem somewhat similar/redundant, but this way you can evolve each part separately.

For example, if your sales process A now needs to have products have groups in front of them, or even products that can only be bought as package deals/sets, then you can change all of the layers:

    SalesProcessAResource/View (REST/UI) <--> SalesProcessAService <--> SalesProcessARepository <--> sales_process_a_schema
And then there is basically no risk of you ruining or breaking the other sales processes, because they're all decoupled and are handled differently. Now, the alternative to something like that is:

  SalesProcessCommonResource/View (REST/UI) <--> SalesProcessCommonService <--> SalesProcessCommonRepository <--> sales_process_common_schema
What I've often seen is developers trying to create "the one solution to rule them all", which leads to code that's really abstracted: maybe the SalesProcessCommonService has some abstract methods that are implemented by separate services, which makes your code paths harder to track; maybe there are now enums all over the place, or your code is full of if/else or switch constructions.

Worse yet, your schema might look like swiss cheese, some of the columns always being empty because the data normalization isn't up to par, or perhaps going in the opposite direction and needing to query 10 different tables to get the data that you need, or worse yet "classifier systems" that I've seen a lot of in my country, basically the worst of OTLT/EAV patterns: https://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-bi...

I've personally seen many similar outcomes in projects, all the way to needing to introduce some data in the middle for one business case, and creating faux records there because the schema doesn't work otherwise, for example:

  SELECT
    ...
  FROM sales_processes
  INNER JOIN sales_sets
    ON sales_processes.id = sales_sets.sales_process_id
  INNER JOIN sales_products
    ON sales_sets.id = sales_products.sales_set_id
    AND sales_sets.code = 'NO_GROUP_FOR_PROCESS_ONLY_LINK'
  WHERE
    sales_processes.code = 'SALES_PROCESS_A'
or something like that.

Of course, one can come up with the right abstractions for something like that, or extract common parts of code, but only as long as the parts aren't logically decoupled and the code isn't merely incidentally the same (or close to it).

Also, in some cases you might need a time machine to see what the requirements will be in a few years, otherwise you risk digging yourself a hole by over-abstracting things and going off in the wrong direction, to the point where you'll have thousands of person-hours sunk into an inadequate implementation, or at least one that becomes inadequate in the light of changing requirements.

That said, there are definitely easy cases where abstractions are of use without giving it too much thought. Think along the lines of code that is either purely functional or close to it, "utilities" that you can use as necessary with simple function calls or maybe injected dependencies, versus something that makes you restructure your entire code because someone went on a power trip after reading an OOP book.


It's the Swiss cheese paradox: "cheese has holes; bigger cheeses have more holes; therefore, the more cheese, the less cheese".

> However, I still haven’t found any studies which show what this relationship is like. Does the number of bugs grow linearly with code size? Sub-linearly? Super-linearly? My gut feeling still says “sub-linear”.

We know from observing reality that even buggy software is better than no software – a software so buggy it adds negative value is a rare exception. So I guess it has to be "sub-linear", otherwise software wouldn't have economies of scale.

The contradiction though is: how do we explain the apparent stability of large but mature codebases? Thinking of something like Emacs here, for example, where some commits date back to 197X. I guess at some point it becomes an inverted U shape even if the codebase grows, as long as it grows at a conservative pace.


> a software so buggy it adds negative value is a rare exception.

Survivorship bias will explain all of that. You'll rarely encounter negative software because it gets discarded. It doesn't mean it's never created, it just dies before getting wide visibility.

> otherwise software wouldn't have economies of scale.

The fact that software can be developed once and sold indefinitely means the marginal cost is near zero. That is an unbelievably huge economy of scale that would dwarf many other superlinear problems.

> how do we explain the apparent stability of large but mature codebases?

Not all software changes are additions. If we assume that codebase size is positively correlated with the number of bugs (i.e. more code = more bugs) and that most changes that do not grow the codebase are likely to fix at least one bug, then mature codebases should be expected to improve in quality to the degree that the number of changes outpaces total codebase size.


> The fact that software can be developed once and sold indefinitely means the marginal cost is near zero. That is an unbelievably huge economy of scale that would dwarf many other superlinear problems.

I'd propose a slight revision: "software can be developed once and used indefinitely." I would bet that most code nowadays is written inside organizations and never directly "sold" over and over again to a customer. But even that tends to result in those organizations wanting more and more, because even a buggy program tends to save people time compared to doing the process manually, and then there's another process they want to automate next, etc...


I'd propose replacing "indefinitely" with "until the systems it interacts with require changing it". It takes a surprisingly short time for a mobile app that is written to current standards to turn into something that doesn't run on the latest OS version. It's a good point that selling software indefinitely isn't possible. There are only a limited number of people who want the software and are willing to pay for it.


It's still effectively "sold" in economic terms. You have one codebase and it generates revenue per customer (where "customer" may either be an end user, or an advertiser for ad-supported software).


I had to kill one of those projects. It was so buggy it was causing issues in "unrelated" areas. I find out Thanksgiving week if I killed it early enough for the next release.


Software size is positively correlated with bugs, but software age is negatively correlated? If your codebase doesn’t grow much in size but grows older, the quality should become better.


The best software is very old and has been polished to a mirror-like gloss.

Having any ongoing development means you get an equilibrium level of defects, though.


> The contradiction though is: how do we explain the apparent stability of large but mature codebases?

The problem with these sort of analyses, in my mind, is that they seem to neglect the iceberg of code you build on top of, which works perfectly fine and is virtually bug-free. For instance, imagine that I'm building a website. If size is a predictor of quality, then you'd expect me to add bugs as time goes on, and that's probably true. But my site is stacked on top of Chrome, an abstraction with hundreds (?) of millions of lines of code that virtually never give me an issue. This is stacked on top of the typescript compiler, the C++ compiler, the operating system, etc, and those all run in the millions (TS) to tens or hundreds of millions (OS) of lines of code! So while I don't doubt that codebases get buggier over time, this seems to neglect the fact that I use truly immense amounts of code every day that don't seem to run into an issue. Perhaps the bugs grow around the edges rather than in the center, or something like that?


Chromium has 63,438 open bugs in its bug tracker so the theory remains true, but maybe the broader point is that strong abstractions have enabled us to reach great levels of complexity. Testing, staffing, and programming best practices probably go a long way in making software like Chrome appear to work flawlessly from the outside.

An analogy would be the craftsman versus the factory. A craftsman knows and performs every step of the process, and the end product depends on the skill of the craftsman, while in a factory each worker is limited to a specific job, and the combined effort of the factory workers allows products well beyond the reach of any craftsman.


> Chromium has 63,438 open bugs in its bug tracker so the theory remains true

How many of those bugs affect you on a day-to-day basis? For me, the answer is 0. In fact, I can only think of one time ever when I was affected by a browser bug as a developer. I think that most of the reason Chrome has so many bugs is probably just because a staggering amount of people use Chrome, meaning they can inspect every nook and cranny.

For a somewhat contrived example, imagine I write a program to add two numbers together. I write a single line of code: `var foo = bar + baz;` in JavaScript. If I ship it to no one, I can claim it has no bugs. But if I ship it to a million people, then I might start getting bug reports from people running it under extremely precise circumstances and running into issues. For instance, maybe I'd get a bug where two numbers sum to be negative because the user ran into integer overflow when bar and baz are very close to INT_MAX. I might also get reports of lack of precision when bar is extremely large and baz is not (e.g. sometimes foo + 1 === foo in JS if foo is huge). I suspect most bugs in Chrome are of this variety, or, more likely, they are even more obscure.
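
For anyone who hasn't run into it, this is trivial to reproduce in any JS/TypeScript console; past 2^53 the doubles simply can't represent every integer:

    // Number.MAX_SAFE_INTEGER is 2**53 - 1; beyond it, adding 1 can be absorbed entirely.
    const huge = Number.MAX_SAFE_INTEGER + 1;  // 9007199254740992
    console.log(huge + 1 === huge);            // true: 2**53 + 1 rounds back to 2**53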

It seems to me that it's not a strict 1:1 correlation between size and quality - there's also a factor where more popular software has a lot more people looking for bugs.


https://bugs.chromium.org/p/chromium/issues/list

The majority of the bugs I see are failed tests, null dereferences, unreachable code reached, very slow code, UI bugs, crashes, etc.

Come to think of it, I've had 2 chrome bugs on my phone in the last week or so. Some weird UI bug with tabs, and a more severe one where it just kept crashing once in like 10 seconds. It magically stopped after a while.


> e.g. sometimes foo + 1 === foo in JS if foo is huge

Superfluous use of the strict equality check there. An ordinary comparison (i.e. using `==` aka "double equals") is sufficient.


How do you know that foo isn't a numerical string?


Because I'm responding to the scenario you actually laid out, not counterfactuals.


Things like Chrome or Linux are so complex, and interact with so many different variables, that it's completely understandable that they can have tens of thousands of bugs, and yet most people won't ever notice one.


I would guess that this theory applies to average codebases of average complexity and quality.

Chrome, popular compilers, popular operating systems skew towards the highest end of quality and developmental rigor.


Maybe another variable at play here is:

Evolution: modifications_size / time / codebase_size

Mature systems tend to have lower Evolution. Thus, there are more bugs being fixed than created. Over time, this leads to stability.

On the other hand, one bug fix creates three more bugs, on average. So, I don't know why on Earth stable systems are even possible...


>one bug fix creates three more bugs, on average

Sounds really interesting, where is this stat from?


It contradicts the stat that I have heard (which I can't document either): Fixing a bug has 20%-50% chance of creating a new one.


Just kidding! :D


> Mature systems tend to have lower Evolution. Thus, there are more bugs being fixed than created.

Ideally. But in practice, they must add features to sell more copies. And even if they have no evolution, OS's evolve that make the software worse over time (e.g. "modern" UIs, hardware vs software rendering, etc.).

We see this in many products that were once best-of-class, e.g. Paint Shop Pro, ACID Pro, etc.


> why on Earth stable systems are even possible...

They aren't, but it just looks like they exist. The more complex something is, the less likely it is to notice a bug in the complex dependencies.


Taken literally - "why on Earth", as in, in nature - it becomes a matter of timescale. Something we seem to be only fully realizing now - ecosystems and climate are in constant flux; they may look stable on a timescale of a human lifetime, or the lifetime of a society, but at the scale of several centuries or more, they're chaotic.


It's not so much "sub-linear" as "capable of being bypassed".

The CEAC State Department website is the most common one for me... the file upload only works in incognito and "once" per session. So people have figured out that you can log out, log back in and keep uploading your immigration paperwork.


The larger the piece of software, the higher the chance of a workaround for any given bug existing?


> We know from observing reality that even buggy software is better than no software – a software so buggy it adds negative value is a rare exception. So I guess it has to be "sub-linear", otherwise software wouldn't have economies of scale.

Your logic is flawed here. First, software has economies of scale because it's easy to produce the second copy of a program once you have the first one. It has nothing to do with lines of code.

Second, buggy software being better than no software would still be true even if bugs were super-linear in the number of lines of code.

I'm pretty sure that bugs aren't sub-linear. As you pile on more and more lines of code, the number of bugs per thousand lines goes down? The code gets better as you create more and more of it? Experientially, no.

I think it's super-linear, for a couple of reasons. First, the bigger the program, the bigger the team. The bigger the team, the more below-average coders on it. Second, the bigger the program, the more room for weird interactions between obscure conditions.

No, it's definitely not sub-linear. Bugs per line of code do not decrease as you have more lines. I suspect it's super-linear.


A bigger team doesn't create buggy software because it contains below average coders.

It creates buggy software because the team members have to coordinate with more people.


> A bigger team doesn't create buggy software because it contains below average coders.

But it must, by the very meaning of "below average coders" - it's implied that we're rating hypothetical coders on software development skills, which include the ability to make correct decisions (both at code and architecture level).

> It creates buggy software because the team members have to coordinate with more people.

Those are not independent variables. The bigger the team, the bigger coordination problems, the more bugs. But also, ceteris paribus, the worse the coder, the more problems there are coordinating with them, thus the more bugs. And also, ceteris paribus, the worse coders are on average in the team relative to global average, the more of them are needed for a project, which makes the team bigger and coordination more difficult. Etc.


Number of bugs can grow super-linearly while their impact decreases, preserving the overall usability increase. Basically, as your software grows there are more corner cases that will get tested very rarely, if ever.

Number of possible states grows almost exponentially with code size, so it would be very weird if the number of bugs grew sub-linearly.


> We know from observing reality that even buggy software is better than no software – a software so buggy it adds negative value is a rare exception.

Who is observing which reality?

I have made no scientific study, and I am not stating any sort of strong opinion, just an uneasy feeling: that about 90% of the software and uses of computers subtract from human well-being and welfare rather than contribute to it.

Looking at you Javascript frameworks.

Looking at you databases and tracking.

Looking at you spyware and advertising bots.

Looking at you killer robots (yes they exist - mostly thinking of the horrific USA military drone assassination programme)

So many examples to choose from. I should not complain; I make my living from computers, and the uses I facilitate are no shining examples of improving people's lives.


I think you are mixing two concepts: imperfect software like databases and JavaScript frameworks, versus the others, which aren't imperfect, they are evil. They produce what they are meant to produce; yes, that is negative for society, but it's not incorrect.


> how do we explain large but mature codebases

If the codebase/project became "feature complete". Once a project becomes feature complete, bugs can finally be squashed without reintroducing new ones. The code becomes better because the design begins to accurately model the program, since there is no feature creep.

The problem with large programs is when you _continue_ to add on to them. Non-open source very rarely gets to the feature complete stage.


> how do we explain the apparent stability of large but mature codebases

I would hypothesize three potential factors are at play, especially for software like Emacs.

The first is that some bugs becomes, themselves, "features" over time. It might be wrong from the standpoint of what was intended or desired when the code was written, or what is documented, but because that is how it works, that's how it is now expected to work, and so it's no longer a bug.

Another factor is that some bugs are "shielded" by the surrounding code, effectively two pieces of code offsetting each other's bugginess. Unless and until you either replace one or the other, or need to interact with one or the other in isolation, the bug isn't apparent. Kind of like how going to the chiropractor for one problem can "unlock" other, latent, problems.

The last one, somewhat related to the first two, is that some well-factored software systems do a very good job of isolating their most stable parts, leaving their less stable, more commonly changed or iterated parts in separate modules or components. Because the change rate and focus of effort is on the "outer" parts of the system, less time and energy is spent on the "inner" part. Very often, developers of the "outer" parts will accommodate the bugs of the "inner" part, because changing the inner part requires both greater knowledge of the overall system, as well as the potential for a larger "blast radius" (and therefore assumed responsibility) for one's change. So rather than debugging the "inner" part, the "outer" part grows lots of accommodations for the inner part. It's not until that inner part can no longer be accommodated, or it becomes a security vulnerability, that the community developing the system will see the value in "fixing" the "core".

Anyhow, just a set of hypotheses.


Emacs has the Church of Emacs backing it, so it might not be a good example of a normal codebase. I personally feel a compulsion to dedicate time to giving back to the Emacs ecosystem in a way that I can’t extend to any other codebase, and there are others that I likely depend on to a much greater extent.

Jokes aside and that being said, I agree that there is something helpful about slow, organic (in your words, conservative) growth. It’s neither necessary nor sufficient for stabilization, but it certainly seems to correlate. GNU in general seems to be a good example.


> otherwise software wouldn't have economies of scale

Does it? I tend to think it doesn't (in terms of code size; obviously it does if we're scaling by number of users)

Even if bugs scale sublinearly in practice- my gut tells me it's because development as a whole slows down even more than the bugs accelerate, partially making up for it

I.e. suppose your codebase size grows 5x, and the bugs per change grows 10x, but then the org's rate of change also slows down by 5x. Now your bugs per time has only grown by 2x.


Agree that a lot of it has to do with rate of growth. Compare emacs or postgres codebases with some of the horror shows that are the codebases at VC-hyper-growth startups.


If you fix commonly occurring bugs, you're left with rare bugs - and there may be tons of them in mature codebase. For legacy systems with dedicated operators, it may make more financial sense to teach users a workaround to a bug, than to fix it.


> We know from observing reality that even buggy software is better than no software – a software so buggy it adds negative value is a rare exception. So I guess it has to be "sub-linear", otherwise software wouldn't have economies of scale.

I'm not sure about this, our e-health system had millions spent on it and eventually still was discarded, after "sort of" working for months and slowing down all of the doctors and pharmacy workers who had to deal with it, as opposed to just using paper slips for prescription medicine, or even individual hospital data management systems: https://www-lsm-lv.translate.goog/raksts/zinas/zinu-analize/...

In addition to instability, even in my country, Latvia, there have been large enterprise systems with serious performance issues, like when the government e-system platform latvija.lv went down as the COVID vaccination signups became available, which was problematic given how many other systems depend on it as a login provider. We've also occasionally had the national tax system go down on the dates when people can start submitting their tax reports.

In regards to security, you occasionally see all sorts of problems even in large enterprise codebases, a recent case was the Krete education system source code (not in my country, just a relevant example), where for some reason code like this was running in production: https://www.reddit.com/r/ProgrammerHumor/comments/yqp2o6/our... here's an article about it https://hungary.postsen.com/trends/86603/The-CRETE-study-sys... (though it got hacked in a phishing attack)

Also, there have been cases of problematic deployments bringing down entire orgs, Roblox had their core business be halted for a while a year ago because of problems with their Nomad cluster, which resulted in a 3 day long outage: https://roblox.fandom.com/wiki/2021_Roblox_outage

The Knight Capital Group also lost more than 400 million USD in less than an hour and went bankrupt because of a bad deployment: https://dougseven.com/2014/04/17/knightmare-a-devops-caution...

I've personally seen systems that have gotten so complex that they're a liability - if there is no full test coverage (as close to 100% as you can get) then you're usually under the risk of wrong calculations or behavior in the system, which may or may not be possible to roll back, depending on the nature of the system. Sometimes it might not even be easy to detect, but the implications can be pretty great, especially if you're dealing with finances. Or having people depend on the systems and not having a business continuity plan were they to go down, due to how integrated things are in our current society.

There is plenty of software out there and everything around that software (other code that determines how it will run, how it will scale, how it will be secured) that has plenty of potential for creating lots of negative value, all the way up to making organizations evaporate, or causing lots of suffering for numerous people. One doesn't even have to consider something like the software inside of planes or software that controls train routing or power distribution networks.


The important aspect of this article is Cognitive Complexity.

Of course, everything being equal, more code leads to more complexity.

In this context, complexity is related to human capacity to collectively read/understand/maintain a code base.

I've been writing software almost non-stop for close to 25 years, and I still don't understand most of the code written by others.

I especially struggle to read anything with templates, generics, lambda...

But I can read straight C code, code written by Fabrice Bellard or John Carmack in the early days of id software.


Interesting. This reminds me of the "simple made easy" talk by Rich Hickey, about complexity vs difficulty: https://paulrcook.com/blog/simple-made-easy


React developers, take notes: random lambdas are dumb.


React is self-similar which to me makes it simple. The framework gets out of your way. I can’t say the same about, say, Angular.


If a codebase requires more than one person's mind to reason about it then the complexity and bug count grows exponentially per extra mind.


I think this is a bit of a small-scale Conway's-law-style effect. You shield your ability to reason about the code from the complexity of the code of others by abstracting around it, thus increasing the total complexity with an ad-hoc abstraction layer.


Essentially for every layer of indirection, a reset of complexity happens at the expense of unknown unknown's creeping into the simulacrum the layer is modeled as. At least, that's how I see it. :)


Citation needed. As far as I know the best available evidence (which is still pretty weak) says number of bugs is correlated only with number of lines of code, regardless of whether they're simple or complex lines.


The experiment should be pretty straightforward to test.

X lines of simple code containing a bug vs X lines of complex code also containing a bug.

How long does it take for average joe coder to find it in both cases? Is the experiment really needed?


It's not so easy. In my experience, most bugs that aren't extremely trivial are not obvious locally. A function, whether simple or complex, will look totally fine, but it's in fact buggy, because it violates the implied expectation of the caller, or makes the caller violate implied expectations of their caller, etc. Decoding a complex line of code isn't the hard part here - narrowing down the mismatched expectations and identifying where to apply the fix is.


Splitting code into functions isolates blocks from the context of what situations they are called in and what parameters are passed in. In my experience, transcribing code into notes (where struct fields are inlined as indented blocks into their surrounding types, and function calls' interiors are inlined as indented blocks into their call sites) restores this context, places lines on-screen in the order they execute, and enables casual reasoning. (You still won't know data invariants without asking the original programmers, but you can guess and check at this point.) http://number-none.com/blow/john_carmack_on_inlined_code.htm... advocates for a style where code is inlined in the actual source code whenever possible.
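
A small before/after sketch of what that looks like (TypeScript, an invented example, not taken from that essay):

    // Split: each helper is isolated from the context it runs in.
    function sum(items: number[]): number {
      return items.reduce((acc, n) => acc + n, 0);
    }
    function applyDiscount(total: number, rate: number): number {
      return total * (1 - rate);
    }
    function checkoutSplit(items: number[], rate: number): number {
      return applyDiscount(sum(items), rate);
    }

    // Inlined: the whole flow reads top to bottom in execution order.
    function checkoutInlined(items: number[], rate: number): number {
      let total = 0;
      for (const item of items) total += item;  // sum the line items
      total *= 1 - rate;                        // apply the discount in place
      return total;
    }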


I'm heavily in favor of inlining over tiny little functions, and the transcribing you describe is something that should, IMO, be done automatically by your IDE, on the fly, as a different rendering mode. It's unfortunate that there's no such feature for any of the languages I know.


I don't think that's a valid way of framing things, because it's not like the amount of review time available is proportional to the number of lines of code.


>I've been writing software almost non-stop for close to 25 years, and I still don't understand most of the code written by others.

I think you nailed it. In my experience, the increase in complexity isn't necessarily inherent to the amount of code so much as the amount of abstraction used as code size increases.

I've worked on some large codebases that only used simple, well-understood and/or well-documented abstractions, which didn't feel nearly as complex as other codebases with abstractions that were more complicated than they were worth.


Yes. All modern systems run a huge amount of code if we count the underlying microcode, operating system, runtime libraries, 3rd party dependencies and the application itself.

The thing is not the total size but the boundaries and the understanding of each layer and how manageable each layer is for the humans modifying and maintaining it ... which is, of course, the total cognitive complexity.


The older and more experienced in the industry I get, the less I believe software engineering research.

If it's done at a university: expect the researchers to be biased towards the practices that are advocated in their academic textbooks on the subject.

If it's from an organization: expect it to be biased towards what they want for that language (e.g. MS supporting new C# features, or moving people away from old ones).

---

I saw this in the anti-TDD/unit testing/testing papers. They're making claims based on a flattened view of development and aren't capturing what happens on the local machine, where those practices are intended to be used. Additionally, they don't consider the externalities of how you approach the code base.


Link to original paper is dead. Working link:

https://dl.acm.org/doi/10.1109/32.935855 (15 pages without appendices)


In my experience clean code has nothing to do with its size.

Some people write clean but long code. Others write super compact code that nobody else can decipher (except the compiler I guess).


I admit that it took me having to look at "clever" short code that I wrote years ago to really take this lesson home. Now I get that code has to not only be correct and concise, but it also needs to document its function to the reader. That often means being more verbose than it could be, and being broken into obvious discrete steps.


On the flip side, there's code that will never be easy to read. Many algorithms and data structures simply don't break down to anything that is easy to comprehend.

Like even something as freshman as mergesort falls in this category where you're sort of squinting and wincing trying to figure out how p, q and w relates to i, j, i0 and j0.
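For reference, a minimal merge step (the index names here are mine, just to show the kind of bookkeeping involved):

    def merge(a, lo, mid, hi):
        # Merge the sorted runs a[lo:mid] and a[mid:hi], copying back via a temp list.
        tmp, i, j = [], lo, mid
        while i < mid and j < hi:
            if a[i] <= a[j]:
                tmp.append(a[i]); i += 1
            else:
                tmp.append(a[j]); j += 1
        tmp.extend(a[i:mid])   # whatever is left of the first run
        tmp.extend(a[j:hi])    # whatever is left of the second run
        a[lo:hi] = tmp

    xs = [3, 7, 9, 2, 5, 8]
    merge(xs, 0, 3, 6)
    assert xs == [2, 3, 5, 7, 8, 9]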

Sure you can say let the PhD students write that code and put it in a library and we'll just worry about `sort(array, start, end)`, but that's just sweeping this reality under the rug.

Truth is if you're not willing to venture into this territory of non-trivial algorithms once in a while, you're seriously hobbling what you're able to do.

The trick is knowing when it's the right thing to do. It's the closest thing we've got to dark magic. Dangerous and unpredictable, certainly, but also immensely powerful.

Entire industries are built around someone one day sitting down and developing a non-trivial algorithm. The big data boom of a few years ago, blockchains, machine learning, ...


Certainly the best code is simple and easy to read, and yet still concise. But even "clever" code can be better than truly bad code.

I'd wager that those of us who frequent HN probably see less of the sort of truly bad code, though, since it's usually written at lower-paying jobs and in other fields, but it possibly makes up more LOC than all the "ok" code.


"It reads like newspaper" a man once told me about jQuery. A great compliment to that code and something I stive for with each new line!


Myself, I lean towards compact code, but the team pressure is towards verbose code, and - hopefully - the compromise ends up being at the right level[0].

The tricky thing is, "clean code"[1] ends up hitting against the limit of human working memory. Past some point of verboseness, that "clean but long code" gets very hard to work with, because the meaning is spread too thin, it doesn't fit in short-term memory anymore. Conversely, "super compact code" may be hard to decipher, but you know the whole thing is contained in the handful of lines you're looking at. As a crude analogy, it fits in L1 cache, whereas dealing with "clean code" means chasing pointers.

--

[0] - FWIW, I rarely have people complaining during code review, and more often I receive compliments for readable (and documented!) code, so I feel tentatively confident I'm hitting the right balance. I think the trick is to not be defensive, but instead to pay attention when someone signals the code is too difficult to understand, and adjust accordingly.

[1] - As a somewhat generic term; the advice from the book "Clean Code"... isn't very good.


The trick is to make clean layers with distinct responsibilities. Then, magically, you have no need to keep all the details in memory. Unfortunately there is no good formal method for making clean layers. Currently we practice it via brute force, through experience.


Layering alone isn't enough, for (at least) two reasons:

- Usually the work crosses several layers in at least one place - e.g. implementing or modifying a feature will require you to touch implementation and interface layers. If those layers are spread too thin, they won't fit in the head.

- Too much "clean code" will make a single layer no longer fit in working memory even at an "overview" level. This is the most common issue in my experience; a problem created by adherents of the idea of "lots of short, simple functions". Even the most trivial feature makes you jump between half a dozen functions, and IDEs don't help there the way they should (i.e. by allowing you to view - and perhaps edit - code with select function calls inlined in place).


> Usually the work crosses several layers in at least one place

You are doing it wrong then. It's basically a violation of SRP. Your changes should touch only one layer. If they don't, you have fragile abstractions.

I agree there are a number of cases where, when you introduce a new thing, you should thread the implementation through all layers. But not for maintenance and bug fixes (the common case).

> Too much "clean code" will make a single layer no longer fit in working memory even at an "overview" level.

But why do you need to keep it in memory? Do you keep all the OS APIs in your head? Or the standard library? Those are other layers you use, and you have no issue with their broad interfaces. Your application layers are the same thing.


> Others write super compact code that nobody else can decipher

That is not clean code. period.


> Does the number of bugs grow linearly with code size? Sub-linearly? Super-linearly? My gut feeling still says “sub-linear”.

My loose reasoning says it has to be linear (like interest generated on bank accounts). Otherwise you could simply merge two large codebases to squash bugs (in the case of sublinear bug growth) or divide a large codebase (in the case of superlinear bug growth).

Alternatively, in a sublinear-bug-growth codebase, you could just keep adding and adding to it until eventually you'd add no more bugs.
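A quick back-of-the-envelope of that argument, with a toy model bugs(n) = c * n^k (c and k are made-up parameters, not measured):

    def bugs(n, k, c=0.01):
        return c * n ** k

    n = 100_000
    for k in (0.8, 1.0, 1.2):
        separate = 2 * bugs(n, k)   # two codebases of size n
        merged = bugs(2 * n, k)     # one codebase of size 2n
        print(f"k={k}: separate={separate:.0f} bugs, merged={merged:.0f} bugs")

    # k < 1 (sub-linear): merging "loses" bugs; k > 1 (super-linear): splitting does.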


Bugs grow linearly when writing them, not after they are written.

> divide a large codebase (in the case of superlinear bug growth).

I do almost exactly that when approaching a problematic system: refactor aspects of it as pure components or separate services in order to isolate behaviours from each other.

Even just what you describe - breaking a large codebase into two smaller (working) ones - exposes an interface that can reveal failed intentions. It's just usually an incredible bitch to do, unless you initially refactor the parts to minimize internal state and the types of interactions the two parts have (ie - make them more 'pure').

Decidedly super linear, IME.


You can't just divide a code base that's entangled. If you disentangle it, you'll remove some bugs in the process. My feeling is that it's super-linear for this reason.


Completely agree. And my favourite add-on is: if it's broken apart such that one half of it ends up being a library, adding a second consumer to that library will tend to further reduce bugs, even more so if the second consumer is written by a different team. (Eventually, at least, because the second consumer is going to find bugs at the start.)


Bugs definitely grow super-linearly, at least exponentially. They scale with the number of interactions between lines, which grows exponentially. (That, in turn, scales with the number of unforeseen-behavior-we-don't-like, which is generally what is meant by a "bug".)

To be sure, well-written code decouples the different components so that distant pairs of lines interact as little as possible, and thus it's more difficult for unforeseen constellations of factors to arise. Well-written code also minimizes the possible values that are used and have to behave correctly. But all else equal, it will be exponential.

Your intuition is correct, that you can significantly reduce bugs by splitting an N-size code-base into two N/2-size code bases. But there's only so much you can split it down while achieving the desired functionality. Plus, there's overhead to constantly switching between codebases.
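Rough numbers for that scaling claim: if an "interaction" means a pair of coupled elements the count grows quadratically, and if it means any constellation (subset) of them it grows exponentially. A quick sketch (element counts picked arbitrarily):

    from math import comb

    for n in (10, 20, 40):
        pairs = comb(n, 2)                 # pairwise interactions: n(n-1)/2
        constellations = 2 ** n - n - 1    # subsets of size >= 2
        print(f"{n} elements: {pairs} pairs, {constellations} constellations")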


I don't quite follow this logic. If one manipulates a codebase in the ways you describe (merging with another, or dividing a larger one), the result doesn't even have to be a working program; how can we compare the result with the original?

The real answer is probably that it might depend on how tightly coupled the code is, how much test coverage there is (assuming that tests are included in code size), the quality of the authors who wrote the code (this might be circular), maybe even on the technology used, or what you consider distinct bugs, etc.

My intuition leans towards super-linear, but only slightly so -- large codebases tend to have many authors (many of whom may have since stopped working on the project) and be old.


> Does the number of bugs grow linearly with code size? Sub-linearly? Super-linearly? My gut feeling still says “sub-linear”.

Depends on what you're doing; are you implementing features, refactoring, or fixing bugs?

Generally, I'm guessing that the number of bugs will be linear, but the number of bugs in common code paths will be logarithmic.

Or perhaps that the likelihood of a bug in a given piece of code decays exponentially with the age of that code.

There's also the issue of developer familiarity with the code base, and where in the code base the changes are occurring.


My gut feeling is it grows linearly with any actions over codebase that are not accompanied by both testing and real usage. Well, until tech debt comes into play.

But there are many ways to make it non-linear, e.g. one can build a nice high-level structure which becomes irrelevant when goals shift. Or one can sign up for a paradigm which requires better developers and deeper understanding.


I think it would make sense to imagine the codebase as a graph of some sorts, and the number of bugs as a function of both the number of vertices and edges.


Discussed at the time:

Size is the best predictor of code quality - https://news.ycombinator.com/item?id=3037293 - Sept 2011 (123 comments)


The worst code:

- Duplicates large, dumb patterns in slightly different ways throughout

- Lacks coherent naming or compartmentalization: it's just a pile of shit (statements or expressions)

- Lacks testing

- Is difficult and unpleasant to read

- Is ugly: not formatted, inconsistently formatted, and lacks any sort of regularity

- Combines many functions into god objects and god functions without clear intent. Who knows what any of that crap is supposed to do, or if it's even doing anything correctly

- If not lacking comments, contains useless boilerplate comments that take up visual real estate

- Verbosely reinvents low-level constructs, poorly

- Lives as numerous, fragile, interdependent pieces scattered across many files and across time (runtime)

- Is configured in unclear and confusing ways that create incompatible states that break

- Includes code that is never used

- Contains fragments of experiments that were never discarded

- Combines and abstracts over opaque units of other junk that don't behave properly

- Doesn't sanitize untrusted input

- Doesn't give the user sensible feedback

- Makes the user do unnecessary steps themselves

- Acts in unpredictable ways

- Tries to solve a problem that never needed solving to begin with

- Prematurely optimizes the previous


Hmm, a third of these points are subjective value judgements about the aesthetics of the code.


Which? What specific point are you making?

Usable aesthetics matter for reduction of cognitive load.

If lengthy lines of semi-similar code aren't aligned so you can visually see they are correct, that's less maintainable code.


"Subjective" doesn't mean "wrong". Do you only want people to post things they have, like, data for?


A measuring stick that is of different lengths depending on who wields it is not a good measuring stick.

Different people may use the same list to arrive at contradictory conclusions about the same code.

You, following your preferences and customs, may look at your code and think it is great.

I, having different aesthetic preferences and being accustomed to different paradigms, may look at it and think it is the worst code.


Pointing out that subjectivity _can_ be wrong is unhelpful and uninteresting. What about the actual list they posted? Do you think it is wrong? Then criticize it in substance, not in principle.


I found that when writing software, one of the best practices was to work hard to avoid writing code (and this isn't about cleverly writing 3-line functions in one line or such).

The goal was to really think about what code is really necessary, and how the thing can be designed, structured, or architected to require the least writing of code.

Of course, in the end code must be written, but I always keep in mind two things:

Where do bugs live? In code - every statement is more habitat for bugs.

What is the fastest optimization of any process (software or physical)? Eliminating steps. An unnecessary step eliminated is a 100% optimization.

This thinking approach is valuable in other areas too. As I was learning about club sportscar racing, a coach asked me "what are the things you do that slow you down?". I thought and started an answer about brake & shifter footwork while entering a turn... and he said "No, no, the really big basic things"... "like braking and turning?"..."Right; braking and turning. What does that mean? That you should avoid braking and turning at all times".

Obviously, the point is not to be taken literally as you'll run out of road at the end of the first straight and have a very slow lap time. The point is to really think about how every little bit of applying the brake and steering inputs will slow you down, and if you can eliminate or minimize even a few, you'll be ahead.

And in a completely different field, I remember an interview of a famous film director (I don't remember who) who said that in a great film, every syllable had a specific purpose. If it didn't serve a purpose to the plot or character development, it should be cut.

Obviously, we must in the end write code to get the machines to do anything; but do we really have to write that next few statements, or can that goal, feature, or function be met otherwise with some extra planning and thought?


I remember my university statistics teacher: he once did a study on "the best predictor of the price" of cars, and the answer was: its weight :)

so, he said, just buy the heaviest you can get for the money you want to spend

is it a bit related?


I agree with the premise of this article. You want a program to have as few lines of code as are required to satisfy the needs of stakeholders (especially customers and other operators of your software).

The size of the solution must not exceed the size of the problem it solves. You need to focus on the essence of the problem and make compromises based on what is most important. Keeping the lines of code count low is a worthy target.

There is a balance to be reached but devs today are far too permissive when it comes to complexity creep. It should be setting off alarm bells much sooner.


The article's title refers to "code quality", but the word "quality" never appears in the body. The article is about the number of bugs, so I think a more appropriate term is "software quality" (more specifically "software functional quality", per Wikipedia; but of course it would be clearest if the title just said "number of bugs").

To me, "code quality" means something very different (well-written, readable, maintainable, etc.; Wikipedia calls this "software structural quality"). According to this definition, "code quality" may be correlated with the number of bugs (hopefully with a negative correlation!) but the two are very different things.

And I'd like to boost @kragen's comment (https://news.ycombinator.com/item?id=33566329#33579042, toned down):

> haldar's summary is ... incorrect ...

> the paper finds that faults (and wmc (methods per class), cbo (coupling between object classes), response for a class (rfc), number of methods added (nma), etc.), correlate to class size (non-comment source lines of code), not program size


If you step back for a moment and think about other fields like writing a book, it stands to reason that no matter how good your editor and electronic spelling/grammar checking tools are, the longer the book, the more likely there are to be spelling/grammar errors.

Like these studies, I'm sure there are a bunch of other metrics/factors you could pull out to correlate with errors, but it makes sense that as the document gets much longer, it's a lot harder to be so thorough with checking.

I do agree that the more mature a thing is, the more likely it is to be correct. This seems like a corollary to the Lindy effect [0]. As something matures and stays in use, it's likely that the problems with it are being found and fixed, otherwise it would die/disappear from disuse due to the issues. With the book example, each time there's a new edition, there's a chance for errors to be fixed. Depending on the behind the scenes process of updating the text, new problems could be introduced, but in general, you'd expect the total errors to trend down over time.

As a caveat to all this, I've often remarked that no matter how many times I review and revise my résumé over the years, I keep finding typos. So who knows...

[0] https://en.wikipedia.org/wiki/Lindy_effect



Size correlates with bugs.

It also correlates with functionality.

I've used grep and ps a few times this month. Other than that, I'd be surprised if anything I've used recently is less than 100k lines, except for devices with embedded firmware.

I'd rather have buggy code than code that doesn't do anything and is more work than just doing the task manually without code.


Code does things. Code provides features. Therefore, the more lines of code software has, the more features it offers. This is another way of saying that number_of_bugs / number_of_features is a constant. Any added feature has some constant probability of introducing a bug. More features, more bugs. Pretty obvious stuff, if you ask me. Software with zero lines of code does nothing at all and has zero bugs.


"Therefore, the more lines of code software has, the more features it offers"

That assumption is radically untrue based on most of the code I have seen.


Call them what you want. Each line of code means and does something. If code were a novel, this research could be summarized as “the number of typos and plot holes in a novel are proportional to the number of chapters, paragraphs, and sentences, as everyone knows—but they’re also proportional to the number of words and characters!” Yeah, no kidding. The author is trying to convey information. More words, more information. Holds true across all natural and unnatural languages. Reduce entropy and you reduce the possibility for errors. Though that doesn’t make for a very interesting novel.


you don't think there's much variability in features per loc?


Code changes have at least a linear amount of bugs on average, but have a large random component and selection bias - A key part of being a good developer is learning how to read code and notice/realize when it does have a bug.

Even if the number of bugs is largely random in each change set, repeatedly selecting the most obviously buggy code to change will reduce the number of bugs over time. This can apply at a number of different points and scales: the developer writing the code for the first time and testing it locally themselves, the code reviewer looking at the code, other developers reading and editing the code while working on their own features, other developers choosing which library to use for a given function and which projects to support, customers choosing which product to use and business to support. Each of those layers selects against bugs over time, eventually bringing it sub-linear for the end users.


I wrote a little on codebase size and quality, and what to do about it, back in 2019. Some might find it interesting.

https://blog.eutopian.io/winning-systems-security-practition...


I've golfed my assembler for the Nand2Tetris assembly language down to 714 bytes of Python3. Sure I use some cheap tricks (like single-letter variable names, or writing 16384 as 4**7 to save a byte) but more surprisingly, it forced me to write in a completely different style.

When your whole program/algorithm comfortably fits on the screen it affords the unique ability to program "in the large" and "in the small" simultaneously. Bugs can't hide. My assembler contains an "elif" without a closing "else", which would normally bother me. But since I can see it all at once I can judge it in context and weigh its purpose against my aesthetics.

I'd recommend golfing some toy program down until you are sure it is as small as it can possibly be, it's a cool experience.
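(This isn't from my assembler, just a made-up taste of the kind of byte-shaving involved: single-letter names, lambdas, f-strings, and so on.)

    # Readable version
    def to_binary16(value):
        return format(value, "016b")

    # Golfed version: same output, fewer bytes
    b=lambda v:f"{v:016b}"

    assert to_binary16(2) == b(2) == "0000000000000010"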


I once made a spaceship game in 255 bytes for a uni assignment on a toy instruction set where you could do whatever you wanted. The spaceship and obstacles were interleaved in the code and you had to play it through the debugger view.

I feel I would have loved programming in the 70s ...


That sounds impressive! Nice work.


Page 6 answers the question of what are the best predictors of code quality [1]. At least in terms of critical defect density.

This has been known for a long time. It's just not widely known because the data is tough to collect yourself, and the already collected data costs money.

Personally, I like that model because it's broadly applicable, actionable, and based on 35 years worth of industry data.

The blog reaches a self-admittedly partial conclusion, and it's partially correct, so the title should not be taken as a complete statement.

[1] https://missionreadysoftware.com/wp-content/uploads/2020/01/...


I think use/time of the code needs to be factored in, as the more the code is used, the more bugs are exposed and fixed; old, actively managed code can be far less buggy (if all goes well and the architecture is good). Less code that does the same job is always better.


> After my last post, which hypothesized a relationship between the total size of a program and the bugs in it, I was led to this paper, via this blog post, via this comment. This, by the way, is my favorite thing about blogging.

"this paper", "this blog post" and "this comment" are all hyperlinked. "this paper" gives a 404. "this blog post" gives what should be a 404. "this comment"? Well, that a HN comment linking back to the now-missing post:

https://news.ycombinator.com/item?id=2991224


I've found this to be absolutely true throughout my career.

The only adjustment I'd make is that size should be measured after some normalization pass (eliminate comments, whitespace, make variable names the same length, etc)
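A rough sketch of what such a normalization pass could look like (my own toy heuristics, Python-flavored, nothing standard):

    import re

    def normalized_size(source: str) -> int:
        # Count lines after dropping comments/blank lines and equalizing name length.
        count = 0
        for line in source.splitlines():
            line = re.sub(r"#.*$", "", line)               # strip (Python-style) comments
            line = re.sub(r"\b[A-Za-z_]\w*\b", "x", line)  # every name becomes the same length
            if line.strip():
                count += 1
        return count

    code = "total = 0  # accumulator\n\nfor item in items:\n    total += item\n"
    assert normalized_size(code) == 3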


"If I had more time, I would have written you a shorter letter."


Code size is also a good (but not great) predictor of development velocity. If you need to write fewer lines of code to do something (assuming standardized formatting, so no code golf), it's probably the case that you can write that code faster, and ship features faster. And of course, some languages simply enable you to work at such a high level of abstraction that you write much less code (e.g. http://www.paulgraham.com/avg.html).


There are also two dimensions to code size, one being source code size and the other being executable size / number of instructions executed. This path of thinking brings up the connection between optimization and correctness.

In particular, if added source lines describe the program more specifically and accurately, that should enable the compiler to produce more compact/efficient code, and I'd imagine also contribute to correctness.


It was reported in Peopleware that bug density is somewhat consistent per LOC, so someone who implements a solution in 10x fewer LOC also has 10x fewer bugs.


It's probably better explained by the fact that simple problems are more likely to result in simple and smaller codebases, and simple code and simple problems mean fewer edge cases, which means fewer bugs.

A lot of complexity comes with the fact that people code for the future with "well, we have to make this a platform solution in case they want to change stuff in the future, so let's add a configuration object, a datastore to pull it from, and a cache to limit database hits, with a cache eviction policy that's also configurable via ..."


It's important to note: shitty programmers can achieve what a great programmer can, but with 10x the lines of code copy-pasted until all found use cases are covered, and it will take 10x the time and need 10x the QA team.

So yeah, more LOC == bad, relative to the code's needs.


My prediction is that a large number of comments on this post will be programmers desperate to justify writing more code.

I'm not a good programmer, but the one thing I think I have over bad programmers is the awareness that adding more code is never the solution.


Nope, it's how you use it!


breaks the back button in mobile safari. cute.


Why do people do this? I can’t think how it's accidentally introduced; it has to be on purpose, but why?


Chrome is the new IE. No one tests on any other rendering engines these days (mostly slight hyperbole).


Ah but I was mainly wondering why that page would touch the history at all. If it was doing something complex I could understand there being some compatibility issues, but I don't see any reason this page has any business meddling with history.


What's to say this isn't just an emergent statistical phenomenon?


The ratio of tests to code is often a good benchmark for how good the codebase I'm dealing with is.

The caveat is code bases with coverage targets. The ratio becomes less meaningful then.


No, size is the best easily understood predictor of code quality.

You could get a far more accurate analysis with partial compilation and graph processing algorithms.


yes, but I started putting dummy files in my apps because prospective employers thought the things I was working on were too small to be experience, based on the reported file sizes in app stores

no, I didn't dodge a bullet, I wanted money and they were fine work environments and I never dealt with the recruiters or hiring managers again when on the jobs


Oh, God, please, no! If contractors start putting all their spaghetti code into as few lines as possible, we are doomed.


Not my experience at all. The quality and experience of the team is IMHO the best predictor of code quality.


So, smaller programs do less, so there are fewer potential bugs. They are also easier for developers to comprehend.


That's not what I recall the 2001 paper saying.

Iirc, it said that if you pick a software complexity measure, it does not do a much better job than just simple lines of code.

Back in those days, metrics were real and were used to try to help create and maintain software code quality.

For instance there were some groups that would not allow code that had a McCabe cyclomatic complexity greater than x to survive. Course the easy way to skirt that is to break the module into however many modules you need to get them each under X.
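A contrived example of that skirting (my own toy code, not from any of those groups): four branches in one function, or the same four branches spread over three functions so each one stays under the threshold. The total complexity hasn't gone anywhere.

    # One function, cyclomatic complexity ~5
    def shipping_cost(weight, express, international):
        cost = 5
        if weight > 10: cost += 3
        if weight > 20: cost += 5
        if express: cost *= 2
        if international: cost += 15
        return cost

    # Same branches split up; each piece now passes a "complexity <= 3" gate
    def base_cost(weight):
        cost = 5
        if weight > 10: cost += 3
        if weight > 20: cost += 5
        return cost

    def apply_options(cost, express, international):
        if express: cost *= 2
        if international: cost += 15
        return cost

    def shipping_cost_split(weight, express, international):
        return apply_options(base_cost(weight), express, international)

    assert shipping_cost(25, True, True) == shipping_cost_split(25, True, True) == 41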

My takeaway from that paper was that the best life hack I could do would be to reduce whatever metric was being used to measure quality each time I checked in.

It is shockingly easy to edit existing code to reduce the bulk while making it clearer. Oftentimes, that is simply just removing dead code. Sometimes it's modifying ridiculously convoluted code.

Most of the time, reducing the lines of code also reduces other complexity measures if you do it with care. Of course, my measure was never a target, so I was never led to perverse incentives and dysfunctional behavior.


Meanwhile, rumors say Twitter engineers are being judged by how much code they're writing...


I suspect that only code bases of a certain quality can achieve larger sizes.


I would say this applies equally to legal code.


This is why tinygrad.org is tiny.


This is why I only code in APL :P


Zero lines means Zero bugs.


Doesn't it throw null exception when you try to use it though?


Well, that depends. There was once a commercially viable zero-byte program:

http://www.smashcompany.com/business/a-program-of-zero-bytes...


Meta observation: At least on Mobile Safari, this page breaks the back button. I had to spam press the back button like 10x in order to come back to HN. I absolutely despise websites that do this, and immediately feel a bias against whatever the site is selling or trying to convince me to believe.

RE: the article itself

I’ve been on projects where we eventually shipped the most poorly written code (variables with ambiguous naming, methods that are several hundreds of lines, classes with 5-10k lines, little or no test coverage). This code was baked into a consumer device, and we shipped over 16 million units over almost the span of a decade.

I’ve also been on projects where the code was practically a work of art. By the time we were “done”, the market had changed from under our feet. What we were doing wasn’t so interesting or novel anymore. We never shipped.

It doesn’t surprise me that more code == more bugs (or an increasing probability), but it’s good to keep in mind that value is what we care to produce. Everything else is a cost, and spending too much on any particular cost can take the whole enterprise down the ravine, regardless of good intentions.


> Meta observation: At least on Mobile Safari, this page breaks the back button.

The UI is somewhat hidden, but a long press on the back button will open a pop up menu, which makes it easy to jump back two pages.

Other than that, it is totally a bug in the website. I wonder how many lines of code it has??


> I wonder how many lines of code it has??

It's WordPress, so right about all of them.


haldar's summary is completely incorrect, as you would expect from someone who says 'tl;dr'

the paper finds that faults (and wmc (methods per class), cbo (coupling between object classes), response for a class (rfc), number of methods added (nma), etc.), correlate to class size (non-comment source lines of code), not program size

it only looked at one program so it could not measure any effects due to program size

moreover it wasn't measuring code quality in the sense of 'defects per kloc', it was measuring defects (whether a bug had been detected in a class in the field or not)

stripping away the acronyms, what they found was that classes that contained more code more often had at least one bug, and also had more methods, but that having more methods without having more code didn't make classes significantly more likely to have a bug

and similarly for the other complexity metrics like how many different methods a class calls and implements (rfc)

this is unsurprising, since things like the number of lines of code in a class and the number of different methods in a class are just alternative metrics of class size, and everyone knows that in general more code means more bugs

that's why we measure code quality in defects per kloc and not total defects. the paper didn't even try to measure code quality in that sense

that doesn't mean the paper is bad. if the paper's authors are correct that many other papers have failed to control for size in their defect metrics, they have identified a serious shortcoming in the existing research literature; haldar merely totally failed to understand them

so the paper haldar tried to summarize doesn't measure either program size or code quality

(some of their references did look at program size tho)

why are none of the other comments pointing this out

are they all just commenting without having read the paper

now i am sad


>as you would expect from someone who says 'tl;dr'

...What? This is hilarious.

How is using an acronym to provide a brief summary of something indicative of correctness?


> How is using an acronym to provide a brief summary of something indicative of correctness?

I’m struggling to understand how this is the point that’s most worth debating here. What do you propose we’ll learn?


it turns out they thought they were being criticized so they got defensive, explaining their otherwise shockingly aggressive and even mendacious comment thread here: https://news.ycombinator.com/item?id=33567361

not sure that's a good excuse for their total failure to contribute anything substantive though


'tl;dr' is an abbreviation for 'too long; didn't read'

people who don't read long things sometimes know a lot about things like how to weld, how to comfort the grieving, and what makes them happy, but they are always profoundly ignorant when it comes to book learning

that's what happened in this case, where haldar's summary claims that this paper shows that a variable they didn't attempt to measure is the best predictor of another variable they didn't attempt to measure

evidently he just looked at the diagrams and guessed what the words meant


>people who don't read long things sometimes [...] but they are always profoundly ignorant when it comes to book learning

You seem to have fundamentally misunderstood how this acronym is used.

When provided by an author, it doesn't mean that the author didn't read... It's literally just a substitute for "Summary:". An indication to the reader that if they find the article too long to read, they can read the immediately following subsection to get a summary of the article.

Do you also think that people who say "Summary:" are profoundly ignorant? Or are you just against acronyms?


probably you haven't noticed this, but using 'tl;dr' instead of 'summary' or 'abstract' is a group identity marker by which the author shows solidarity with their (presumably profoundly ignorant) readers. it's the same sort of gesture as describing the paper's authors as 'eggheads' or 'boffins'

and evidently in this case the blog post author really didn't read the paper they were 'summarizing'

i find your contributions so far to the discussion of this paper profoundly disappointing


>probably you haven't noticed this

>their (presumably profoundly ignorant)

>i find your contributions so far to the discussion of this paper profoundly disappointing

I'm not sure why you are so hostile, and why you think everyone except yourself is profoundly ignorant, but it's clear that you have some sort of world view where you're superior to anyone else you interact with. I doubt anything I say would convince you otherwise.

tl;dr: I found this conversation disappointing, too.


i frequently interact with people i admire and who know more than i do (and sometimes they are the same people), and, as i said above, even people who are profoundly ignorant about book learning are often wise about many other things

rather than being hostile, i have taken the time to explain in more detail the things you asked about in my previous comments because they were unclear to you

i am surprised that you have stooped to launching personal attacks on me in this comment, since so far i have criticized only your low-quality comments (and the original blog post) rather than any presumed personal attributes of yours

try to do better please


>i am surprised that you have stooped to launching personal attacks on me in this comment, since so far i have criticized only your low-quality comments (and the original blog post) rather than any presumed personal attributes of yours

You've called me "profoundly ignorant" several times in this chain. How you can say this with a straight face is baffling.


i haven't called you profoundly ignorant even once. please do not lie about what i have said; there is no point in doing so in any case because anyone reading this thread can look half a page up or click the 'context' link and see that you are lying

i said that people who don't read long things are profoundly ignorant, at least within the sphere of book learning, though often not in other ways. that's not a personal attack on them, it's just an obviously true statement if the phrase 'book learning' refers to anything at all

you haven't said even once that you don't read long things, so there was no reason for me to even suspect that what i said applies to you, even if it were a personal attack (and, well, some people are sensitive about being ignorant because they're used to being criticized for it)


From now on I will strive to only write "abstract" instead of "TL;DR" so I can avoid showing solidarity with the presumed profoundly ignorant. It's a matter of principle that I must appear to be the most intelligent person in any room I enter, virtual or physical.


maybe you could show solidarity with them in a way that doesn't reinforce a dichotomy between them and people who read books

for example, by emphasizing the praiseworthy attributes you have in common rather than the misfortunes that have befallen you and the self-destructive choices which, in cases like this one, perpetuate those misfortunes

i mean it might be useful to read a book from time to time, or at least a 25-page article, and that's less likely if you turn it into an identity threat

we're all born profoundly ignorant but we don't have to stay that way

if you're the most intelligent person in the room maybe you're in the wrong room. this can easily be reversed to provide a plan of action if being the most intelligent person in the room is what you value most. beware, though, some people are smarter than they look


Some of us know how to weld, how to comfort the grieving, and have good book learning. Still trying to find the balance between happiness, satisfaction, and pleasure.


agreed, i'm not trying to say it's a zero-sum sort of thing, just that using 'tl;dr' is not an indication of profound ignorance in all possible ways, just the forms of ignorance that result from not reading long things


If you follow the roadmap to software quality of course small is better:

Make it work. Make it right. Make it fast. Make it small.

(Classically attributed to Kent Beck, but he left off the make it small which I added)



