My experience is that it's often not the developers at all who cause a program to get bloated, but constantly changing client/user expectations. In other words, a program is developed and it starts out simple. Then one of the following happens:
1. The user/client comes across edge cases and has a very specific thing they want done in that situation. Unfortunately, it's the complete opposite of what the system would usually do, so that case has to be specially patched to behave a certain way, even though it makes the codebase a confusing mess to figure out. So one tiny part of a certain web page gets different padding or line height because the client/user thought it looked better in that one situation, and no one thought to say it was a bad idea.
2. The project's spec was wrong and hence it was the wrong thing to be building in the first place, but no one liked the idea of starting from scratch. So you end up with a program that's meant to do one thing but has been crudely hacked into doing something completely different, so half the features are utterly useless.
So it's not the developers' fault in these cases; it's management not knowing what the team is actually meant to be building, or spending a lot of time trying to provide exceedingly random and useless 'nice to have' touches.
The way software is built today is pretty crazy: little time for research and development, a lack of focus on quality, and a fixation on short-term changes that cost long-term technical debt. Estimation is hard, and project plans are only ever about 60-70% right.
There is a big problem with not allowing dates to slide as well. When software that is a pile of crap ships one day early, it is lauded in the short term and painful in the long term, submerged in technical debt; when more time is taken up front for quality, it is short-term pain for long-term gain later, in the maintenance phase. The problem is that the latter is hard to do when development schedules are not set by developers.
Note: I accidentally hit downvote as my mouse glitched and wish I could reset that.
I work on a very large and evolving code base. Part of the R&D process is to throw different features at the product to see if they improve its health. But at the same time we can't break prior features.
It isn't that we had a bad spec. It's that the spec changes as we discover new methods and try out different ideas to continue towards incremental improvement.
I personally have had to forcibly break a script and tell other developers to stop using a particular function because it was archaic and hadn't been used in production for several months. And yet, somehow, a month later people would come to me complaining that I broke their scripts, since they were relying on an old feature we had long deprecated in favor of a much better version.
We can probably conflate this type of problem with the second one you mentioned. If the changes are that much of a problem, there is probably something wrong with the software architecture. From a programmer's perspective it's easy to forget that the code exists to serve the user's experience; the user's experience shouldn't be crafted to serve the code.
You can't blame that on the "management" boogey man. It's unreasonable to expect managers to predict future customer requirements. If they guess right it's probably due to luck and not repeatable. Agile best practices of having product owners in close touch with customers, and getting frequent customer feedback through continuous delivery, can mitigate this problem but not prevent it. Software evolves and grows over time; that's the nature of the beast. In some cases it can make sense to go back and remove functionality that's no longer needed but that's tough to justify when you have limited engineering capacity and customers are offering to pay you actual money for specific new features.
While changing requirements/expectations are part of the problem, I find that projects based on increasingly complex technical stacks and frameworks that impede functional changes are another.
Those bloated technical parts tend to be focused solely on supporting (and providing flexibility for) non-functional requirements. Is it then a surprise that functional changes can't be handled effectively?
Perhaps that is because developers really focus on technologies, not on how to represent a complex business problem or system as a model expressed in code.
Representing the business problem in a system, including its changing functional requirements, seems to be both a lost art and a forgotten need.
I wouldn't just blame devs, I'd blame business, as much in terms of the dynamics as the individuals.
From a marketing perspective it is easy to see that "Adding Feature A will help us sell to customer B." It is not clear (and probably not true) that feature A will cause us to lose a sale to customer C, but the cumulative effect of a huge number of features can lead to "bloat" and "unreliability" that can cost sales to customers Y and Z and drive the cost and risk of development so high that it stops.
I think of Adobe PDF as an example. People criticize it as being "closed", which is not true: it is one of the few ISO standards you can download for free, and Adobe even scrupulously documents how their products deviate from the standard. However, it is insane how much stuff has been added to it beyond a document format (e.g. exchange of 3D models), such that when I am trying to control the size of the C: drive of machines I manage, Acrobat is the Creative Cloud product that is the most universally useful AND one of the absolute largest, making it the hardest decision.
The arguments against it being "closed" and "bloated" then crowd out the real criticism you'd have if you actually read the ISO standard, such as the fact that you can't draw a circle in PDF. (I don't know what happens if you 3D print a wheel made out of 4 Bézier curves that look like a circle, but I am sure some of you will find out.)
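To make the circle point concrete, here is a hedged little sketch of my own (not taken from the standard's text): PDF path construction gives you lines and cubic Bézier curves, and the conventional workaround is to approximate each quarter of a circle with one Bézier using the constant k = 4(√2-1)/3. The function names below are just for illustration; the snippet measures how far that approximation drifts from a true circle.

```python
# Rough illustration: how close the conventional 4-Bezier "circle" gets.
import math

K = 4 * (math.sqrt(2) - 1) / 3  # ~0.5522847498, the usual magic constant

def quarter_arc_point(t, r=1.0):
    """Point at parameter t on the cubic Bezier approximating the first-quadrant arc."""
    p0, p1, p2, p3 = (r, 0), (r, K * r), (K * r, r), (0, r)
    x = (1-t)**3*p0[0] + 3*(1-t)**2*t*p1[0] + 3*(1-t)*t**2*p2[0] + t**3*p3[0]
    y = (1-t)**3*p0[1] + 3*(1-t)**2*t*p1[1] + 3*(1-t)*t**2*p2[1] + t**3*p3[1]
    return x, y

# Maximum radial error of the approximation, as a fraction of the radius.
max_err = max(abs(math.hypot(*quarter_arc_point(i / 1000)) - 1.0)
              for i in range(1001))
print(f"max radial error: {max_err:.6f} of the radius")  # roughly 0.0003
```

The drift is tiny on screen, but it is still not a circle, which is the parent's point about the 3D-printed wheel.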
> when I am trying to control the size of the C: drive of machines I manage, Acrobat is the Creative Cloud product that is the most universally useful AND one of the absolute largest, making it the hardest decision.
Your point about bloat being business-driven rather than dev-driven is accurate.
One point though:
"Adobe even scrupulously documents how their products deviate from the standard"
That is being a bit generous. Even though the PDF spec is around a thousand pages, there are still many areas where it is vague or allows multiple different implementations.
In a lot of cases the only way to find out what Acrobat does (and therefore what users expect) is to load the document and see how it differs from what you expected.
There are actually a number of really good PDF readers, in the form of general eBook readers. E.g., FBReader and PocketBook, both Android apps (FBReader is also available for other platforms).
A lot of others have mentioned clear reasons that bloat happens in the normal trajectory of software lifetime: increase in requirements, bugfixes, etc.
But sometimes constant employment on a project just creates bloat, too. Refactorings that are intended from the beginning to reduce complexity either don't end up delivering, or they are never completed, leaving systems in a half-migrated state. Most systems I am familiar with are in a perpetual state of transition between the old way and the new way. The more complex the system, the harder a migration is. The harder a migration is, the more likely it is not to be fully done. The more incomplete migrations, the more complexity. On and on.
That's why, eventually, we should keep rebooting software systems with from-scratch, feature-lean replacements.
People are very bad at deciding they need to be made redundant. Hence the phenomenon that "software is never done". It actually can be done, but you won't get the developers to agree to that!
Software that becomes "bloated" is software that is successful and in use. Typically, the users want more and more functionality, because they use the system for more and more. This is not bad.
Software that is not used (for what ever reason) is abandoned, and therefore does not grow.
So there is selection bias too - the most notable software is the successful cases, and they become "bloated" because they are successful.
(what the author talks about is not bloat, it's clever hacks that also make it much harder to understand the program)
Similarly, "legacy code" is a pejorative that means "Code that I didn't write and don't understand yet because I haven't had a chance to look at it for long enough, which I would really like to rewrite simply because reading other people's code is harder than writing your own and it would be easier for me to do that than to understand the existing code and the context in which it was written".
That's the compromise users have to make when using anything that's mass produced. If users want software individually tailored to them, then it's gonna cost them a fortune. This way, they get things for cheap because all the customers are paying for the same thing.
Does it really though? Those are trivial problems to solve: don't show users tools they don't use, via roles/permissions; don't execute code that is unrelated to the current purpose.
Roles/permission code is unrelated to the current purpose.
Really though, it's more of a systemic issue. There are so many ways you can make decisions that let you conveniently pretend you're giving the user exactly what they want without ever having asked them what they want. Maybe they don't want to download and install code they don't plan or know how to run. Bloat is a problem for users because it has machine costs, which users have to pay for in time, money, and attention.
True to a certain extent, but to come back to an example given by the author, Apple Preview vs Acrobat Reader: both are successful pieces of software used by millions, but one is a lot less bloated than the other. Some companies/projects seem a lot better at avoiding bloat than others, for a multitude of reasons (culture, business model, ...).
I doubt I'm the only one who spots that "bloat" is highly subjective, thus defining and categorizing bloat depends on who is doing it.
As a developer, I frequently categorize and define bloat as large amounts of extraneous code from libraries and other tools that are not directly used by any part of a given application but are nonetheless imported or installed by a project. Here I typically point the finger at requirements files, sloppy imports, and other dependency-related smells (a rough way to check for this is sketched after this comment).
As a manager, I may define bloat as anything in the above dependency realm when I'm working with developers. Alternatively, anything in the realm of features and functionality that does not have a defensible position as being built for and used by a majority of users is sure to get some dirty and suspicious looks from me as bloat. This is typically triggered when a client or other project stakeholder suggests Feature X be added so Person Y (or Customer Z) doesn't have to use some other Tool B. I push back hard on such suggestions until there's a demonstrable case to be made that this kind of bloat won't become a major liability in the code from a long view, or a stumbling block for the majority of users.
As a user, I often define bloat as any number of features I don't personally use and, through various interactions and use-cases, feel get in my way, clutter up my UI, whatever.
Everyone involved (except users) is responsible for the bloat in their own domain.
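As a rough illustration of the developer-level definition above, here is a hedged sketch that flags packages declared in requirements.txt but never imported by any module in the project. It's only a heuristic: distribution names don't always match import names, and dynamic imports won't be seen, so treat the output as a starting point rather than a verdict.

```python
# Heuristic check: declared dependencies that nothing in the project imports.
import ast
import pathlib
import re

def declared_packages(req_file="requirements.txt"):
    names = set()
    for line in pathlib.Path(req_file).read_text().splitlines():
        line = line.split("#")[0].strip()          # drop comments/blank lines
        if line:
            names.add(re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0].lower())
    return names

def imported_modules(root="."):
    mods = set()
    for path in pathlib.Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue                                # skip unparsable files
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                mods.update(a.name.split(".")[0].lower() for a in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                mods.add(node.module.split(".")[0].lower())
    return mods

if __name__ == "__main__":
    print("declared but never imported:",
          sorted(declared_packages() - imported_modules()))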
Sure it is subjective, but there are certain places where it is obvious (like the stuff I am refactoring just now). We have 5 different types of fees: course fees, special fees, add-on fees, payment fees, extra fees (maybe I missed some). These are split between various database tables and classes in the application, and as a result there is way more code than is necessary. I am going to try to put them all in one table, with a "type", and reduce the code at the same time.
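For what it's worth, here is a minimal sketch of that consolidation with hypothetical table and column names (the comment doesn't give the real schema): one fee table with a type discriminator instead of five near-identical tables and code paths.

```python
# One table with a fee_type discriminator replacing several parallel tables.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fee (
        id          INTEGER PRIMARY KEY,
        fee_type    TEXT NOT NULL CHECK (fee_type IN
                     ('course', 'special', 'add_on', 'payment', 'extra')),
        amount      NUMERIC NOT NULL,
        description TEXT
    )
""")
conn.executemany(
    "INSERT INTO fee (fee_type, amount, description) VALUES (?, ?, ?)",
    [("course", 250.00, "Intro course"), ("payment", 1.50, "Card surcharge")],
)

# A single code path now handles every kind of fee; the type is just data.
for fee_type, amount in conn.execute("SELECT fee_type, amount FROM fee"):
    print(fee_type, amount)
```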
Bloat happens when programmers don't care about (or don't understand) what code actually does and what the tradeoffs down the line are going to be. Bloat happens when code is produced cheaply and quickly. Bloat happens when there is a cultural stereotype that having produced a bloated piece of code means you're smart.
Bloat is not the same as complexity, but they are related. Bloat is not always bad, or even perceivable as such by the consumer of the code, but at the same time it's also never good. Bloat is a natural accumulation of tradeoff decisions.
Anybody who writes code produces bloat, because writing software is an exploratory activity towards solving a problem set. There is no way to always know where the bottlenecks are going to be, and where your chosen abstractions will break down. The difference between doing good work and bad is that the programmer doing good work will revisit decisions and refactor things ruthlessly down the line. The programmer doing bad work simply doesn't care, is too pressed for time, or writes code in a way that makes refactoring prohibitive.
Personally, I think a good litmus test for bloat is how much the code you're writing actually achieves for the effort you put into it. If you find yourself constantly massaging and circumventing the limitations of your boundary layers, that's a big warning sign. If you are using a library or framework that prolongs and complicates the effort of solving your problem compared to the expected effort of doing it from scratch, then that library is clearly not useful. Using it regardless is a deficiency in the critical thinking required to make good decisions in programming.
"... software bloat almost always comes from smart, often the smartest, devs who are technically the most competent."
I am quite happy with software written by people who do not view themselves as "devs" who are "smart", "the smartest", or the "most technically competent".
I'm also happy with software written by telephone company employees in the 1970's, a grad student in the 1980's and a university maths teacher in the 1990's. Software written by "security researchers" is usually not bad either.
I like the small programs that win the IOCC; if I were asked to name a test for "competence" (my definition), the IOCC would be high on my list.
I have no reason to question that the "devs" writing the bloat, e.g., at Microsoft in the 2000's or any number of companies today, are _brilliant_.
However I have little interest in that class of software. I think bloat is stupid. Not to mention unnecessary for my purposes. If I cannot pick the code apart and recompile it myself, then the software is nothing but a liability to me, not an asset.
I couldn't agree more with what adekok and PaulHoule said. I'm working on a project right now where the product was lean and WORKED! Now, thanks to unnecessary trackers and unwanted features, it has bloated like anything.
Software bloat occurs exactly the way this discussion bloated.
Read through it... think we could rewrite it smaller, capture all the wisdom, and make it more orderly? Would you care to join the team and start doing that? :)
Most popular commercial programs get bloated or out of touch over the years... I'm just a user but I was thinking that it is this way because developers need something to do and don't like to waste time doing nothing.
One reason why bloat happens (not the only one) is that when we increase the number of requirements from N to 2N, the program size S does not simply go to 2S.
Sometimes when we add a requirement, numerous complications cause new lines of code to be added throughout the program. So the requirement doesn't have a fixed cost in terms of additional size; the bigger the program already is, the more lines are added by the new requirement.
Someone wants rational numbers in the numeric tower. Okay, so we have to implement addition of a rational to every existing type, including itself. Multiplication of a rational by every existing type as well as itself. Subtraction, division, ad nauseam. The cost in terms of code size for the requirement to have rationals depends on how many numeric types we already have.
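A rough back-of-the-envelope sketch of why that cost grows (the operation and type names here are illustrative, not from the comment): with naive pairwise dispatch, supporting T numeric types across a handful of binary operations needs on the order of ops × T × T implementations, so each new type costs more than the previous one did.

```python
# Counting the implementations a naive pairwise numeric tower would need.
OPS = ["add", "sub", "mul", "div"]

def implementations_needed(types):
    # one implementation per (operation, left type, right type) combination
    return len(OPS) * len(types) * len(types)

tower = ["Integer", "Float"]
print(implementations_needed(tower))   # 16
tower.append("Rational")               # the new requirement
print(implementations_needed(tower))   # 36 -- 20 more just for Rational
tower.append("Complex")
print(implementations_needed(tower))   # 64 -- 28 more for the next type
```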
Development by accretion of third-party code is the major cause of bloat. Third party components are already getting bloated over time due to the global bloat effect. And here you are, just mashing these components together to make your program. You don't use anywhere near the full functionality of these components. You link that entire five megabyte shared library because it has three useful functions. The seventy-seven others you don't use have to stay there because it's a shared library; some other program might attach to it and use a different subset.
These components get upgraded over time. Security fixes, enhancements, loss of support for old versions. Several years later, you're still using only three useful functions in that shared library, and none of them have changed. But it's no longer five megabytes; it's now ten and has 157 other functions that you're not using, not 77. You hadn't even done any work to get those five extra megabytes of bloat; you just triggered some automatic update a bunch of times over several years and ran some regression tests that passed.
Even the same code gets bigger when compiled. Thirty years ago, the functions were packed together, as were their internal basic blocks: there was no advantage to any special alignment for branch targets. Loops were not unrolled and code wasn't inlined much: caches weren't big enough, so it was often counterproductive. C++ templates have come into the picture since then: let's generate umpteen variants of identical code into slightly different machine language depending on type.
Managed languages: these drag entire platforms with them. And the internals are "not user serviceable": you generally don't rebuild these things to remove what you don't want. You accept the platform with all its bells and whistles and work your app into that. Their bloat is nothing new. In the 1980's, some Lisp implementations were criticized for making 20 megabyte images (at a time when that really hurt). The rest of the world caught up to that: we now have non-Lisp languages that have lexical closures, garbage collection and ... huge images, far in excess of 20 megabytes. Today's managed languages, being platforms, have naturally copied the concept of package management from actual operating system platforms. It's super easy to bloat up the image just essentially by adding simple declarations of what you want in it, chosen from a large online catalog.
A good design will often increase in size and complexity as n log(n) of the number of features. A mediocre design by n^2. And a bad one will increase by n!
The problem I've been wrestling with a lot lately is that developers with a good design don't see the problem in accepting all of these change requests. All these features are easy for them to add because Architecture!
What they discount is that architecture doesn't get you out of testing the interactions. So that n log(n) development effort often turns into n^2 testing overhead, and occasionally worse.
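To put rough numbers on that (mine, purely illustrative): even if development effort grows like n log(n) in the number of features, the pairwise interactions you may need to test grow like n(n-1)/2.

```python
# Illustrative growth: n*log(n) "development units" vs pairwise interaction tests.
import math

for n in (10, 20, 40, 80):
    dev_units = n * math.log2(n)     # idealized development effort
    pairs = n * (n - 1) // 2         # feature pairs to test for interactions
    print(f"{n:3d} features: dev ~ {dev_units:6.0f}, interaction tests ~ {pairs:5d}")
```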
Sometimes, at least recently, I feel that software (for example, the default apps on a phone) is doing more harm than good by taking up much-needed space.
> software bloat almost always comes from smart, often the smartest, devs who are technically the most competent.
That's not my experience.
My experience is that software bloat comes from people who think they know what they're doing, but actually don't.
Competent programmers tend to be much more cynical about their own ability to write good code. They've been burned too many times by bad code, including their own.
The cases discussed here are not about competence or brains. They're about upgrading systems while maintaining compatibility with existing systems. That is very, very hard for anyone to do.
Robert Frost and TS Eliot had a similar conversation. Frost wrote Eliot saying 'the problem with you is that you speak 13 languages and know nothing about real life'. Eliot responded 'the problem with you is that you're stupid'. Age old debate, never fails to entertain.
"Robert Frost and TS Eliot had a similar conversation. Frost wrote eliot saying 'the problem with you is that you speak 13 languages and know nothing about real life'. Eliot responds 'the problem with you is that you're stupid'.
Age old debate, never fails to entertain."
I'm too stupid to understand how this is supposed to be entertaining...
I find his response to Linus condescending. He calls Linus a genius yet he speaks to him like an idiot. Perhaps that was the intended effect, but I wouldn't call his "praising while bashing" approach being reasonable.
> Competent programmers tend to be much more cynical about their own ability to write good code.
Yes, and so instead of writing something from scratch, these prudent, smart programmers link in some well-tested, popular, widely-used third-party bloat, and repeat that a number of times to make a big program, fast.
Smart programmers definitely have ways to make hugeware.
One skill that smart programmers have is to skim through the documentation of some API with a thousand functions and immediately see how to combine six of those functions from completely different places in the documentation to solve a task. Of course, the entire thing is brought in!
(You wouldn't want to be taking apart some well-tested third party code just to make it smaller. There is the risk that something could break. Plus you create a maintenance headache since you're introducing a fork. Each new revision of the bloat has to be stripped down to size all over again. Your thoughts then turn to that 128 GB MicroSD card in your phone and suddenly the bloat looks small).
Are you saying that the only programmer who's able to avoid bloat is a not-smart programmer? The truth is that it's a balancing act and it's not always clear when bringing in yet another externally maintained library is the right thing to do.
Equally, there is the aspect of a competent programmer being curtailed from optimizing the code to the level they are happy with.
A business wants software that works; if they can chuck some hardware at a problem and add more processing power, they tend to do that more often than optimize the process.
This, plus the legacy build-up in many a program/system, ends up with people who do not know all the parts and therefore tread too carefully at times, and lots of redundant, duplicated, bloat-inducing additions ensue.
Very true. It's like a cooking competition where you take over from someone else after being blindfolded. Unless you're from the same schooling, chances are the result will lack direction. The equivalent in software is bloat. Note that sometimes that "someone" is your former self and you forgot what the idea was.
It's true in isolation, but on the whole I consider skepticism to be much more useful and constructive than cynicism. Skepticism is being mindful of one's assumptions, whereas cynicism is just the opposite of optimism. Alternately, skepticism vs. cynicism is "don't assume anything" vs. "assume everything sucks". The former is actively trying to avoid cognitive bias, but the latter is embracing one bias to counteract another one.