A Codebase Is an Organism (2014) (meltingasphalt.com)
115 points by Kortaggio on Aug 24, 2019 | 45 comments



Catabolism is another aspect of thinking about code as an organic entity that I find useful. Anabolic (growth) processes and catabolic (teardown) processes always accompany each other in biological systems. Too often we neglect to continually tear down and remove unused aspects of our systems.

Both anabolism and catabolism are always operating, though at different rates depending on the environment and needs of the organism. The funny thing is that catabolic processes will happily tear down important parts of the system, forcing them to be anabolically rebuilt. Things like bone and muscle mass. This seems wasteful, but it has an extremely important purpose: uncontrolled anabolism is cancer.

I think the analogy to code is obvious. We've all seen codebases that have gotten out of hand (gotten cancer), and part of the reason they got that way is because the growth pressures had no countervailing teardown pressure. The market provides selective pressure in nasty ways: codebases and systems with cancer slow and die, and are replaced by younger codebases without uncontrolled growth.

How do we break that cycle? By encouraging catabolic activity culturally within our companies and codebases. Cleanup is incredibly valuable. Functionality that isn't being heavily used and is complicating implementation of new work should absolutely be on the chopping block. We don't have to be quite as aggressive as biological systems, but we shouldn't take their example lightly either.

This also provides insight into when cleanup doesn't matter: when the survival horizon of the organism is sufficiently short that cancer frankly doesn't matter. Startups shouldn't worry too much until they have product market fit. Once it's clear they'll be around for years more though, catabolic work will help ensure future health.


username checks out. that's a really thought provoking analogy. there's definitely something to be gained in the process of tearing down and building back up.


I've been waiting for a long time for people to start using the title "robopsychologist" in earnest. Especially for people who are called out to debug an unfamiliar system, in 'the wild' (i.e. in production). You have to calm the system down, find out what it thinks is going on, gently correct any misconceptions, and guide it towards a better understanding of the world that lets it behave the way you need it to.


Good one, yes I can imagine the field branching out and developing disciplines like Software Psychiatry, System Nursing, Geriatric Care for Legacy Codebases..

Coming soon: Diagnostic and Statistical Manual of Software Disorders.


It’s commentary like this that makes me come back here. I also wish Hacker News had a few more features, like being able to follow people who make interesting comments.


You could always start a bookmarks folder for 'HNers I find Interesting' and save their profiles to that. I bet there's a way to write a script that goes through them, checks for posts since the last update, and displays them.
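A minimal sketch of what that script might look like, using the public Algolia HN Search API (the usernames and the "last seen" bookkeeping are made up for illustration):

    import requests

    FOLLOWED = ["someuser1", "someuser2"]   # hypothetical usernames
    last_seen = {u: 0 for u in FOLLOWED}    # unix time of newest item already shown

    def new_items(user, since):
        # Algolia HN Search API: everything by `user` newer than `since`
        resp = requests.get(
            "https://hn.algolia.com/api/v1/search_by_date",
            params={
                "tags": f"author_{user}",
                "numericFilters": f"created_at_i>{since}",
            },
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["hits"]

    for user in FOLLOWED:
        for hit in new_items(user, last_seen[user]):
            print(user,
                  hit.get("title") or hit.get("story_title"),
                  f"https://news.ycombinator.com/item?id={hit['objectID']}")
            last_seen[user] = max(last_seen[user], hit["created_at_i"])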


> two modules will tend to grow ever more dependent on each other unless separated by hard ('physical') boundaries

This is especially true in monorepos. Code reuse has its benefits, but left unchecked the dependency graph can turn into a huge, highly connected glob. Modules end up accidentally importing code they have no business importing because of sneaky transitive dependencies. The blast radius of even small, simple changes becomes enormous, and testing/debugging becomes more and more complex.

A great way to keep this in check is to apply code isolation in the test environment. When you check out the entire repo for a build or test, it's easy for these kinds of dependencies to grow unnoticed. But if you require build/test targets to explicitly declare what code they depend on (and only make that code present when running them), changes to the dependency structure must be explicitly acknowledged in code review. This is one of the core principles behind build tools like Bazel.
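For illustration only, here is roughly what that looks like in a Bazel BUILD file (Starlark, a Python dialect); the package and target names are invented:

    # billing/BUILD -- hypothetical package
    py_library(
        name = "billing",
        srcs = ["billing.py"],
        # Only declared deps are made available at build/test time, so a new
        # dependency edge has to show up explicitly in code review.
        deps = ["//payments:api"],
    )

    py_test(
        name = "billing_test",
        srcs = ["billing_test.py"],
        # The test only sees :billing and its declared transitive deps,
        # not the whole repo checkout.
        deps = [":billing"],
    )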


> This is especially true in monorepos

Not in my monorepo.

I have compile-time boundaries between modules and I cannot make them particularly entangled with each other.

Maybe the problem you have in mind is more likely to happen in dynamically typed languages? Where there's no compiler that can say "No."


Even in monorepos, calls into a module should go through a versioned interface.
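A minimal sketch (module and function names invented) of what a versioned interface inside a monorepo might look like, so call sites can migrate one at a time:

    # pricing/api_v1.py -- frozen; existing callers keep working
    def quote(total_cents: int) -> int:
        return int(total_cents * 0.9)

    # pricing/api_v2.py -- new behaviour; callers opt in explicitly
    def quote(total_cents: int, member: bool = False) -> int:
        return int(total_cents * (0.8 if member else 0.95))

    # call sites pick a version and can be migrated gradually:
    #   from pricing import api_v1 as pricing   # legacy callers
    #   from pricing import api_v2 as pricing   # migrated callers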


In general I agree, because it allows gradual migration of users... but in essentially every case I've seen, removing versioned interfaces is billed as the primary feature of monorepos.


I just realized monorepos are analogous to a flat network, where everything can reach everything else, or to the assumption that main memory is flat/linear with the same cost for each read/write (not true, but it is the abstraction we believe).

I now believe that monorepos can only work well when there is a mechanical tool for ensuring correctness and that refactorings can be done atomically across the whole tree. In fact, it might be necessary to commit _only_ the refactoring operation, rather than the resulting source itself, so that the whole tree is a blockchain of tree edit operations.


> monorepos can only work well when there is a mechanical tool for ensuring correctness and that refactorings can be done atomically across the whole tree

Isn't a compiler such a tool? I use a statically typed language, and if I tried to do this:

> everything can reach everything else

then there'll be compilation errors.


which is one of the reasons I like microrepos and versioning: you can do refactorings across subtrees. if there's an edge case you want to address later, you can address it later. if you eschew versioning entirely, you effectively have no choice but to do everything all-or-nothing.


I'm not familiar with that, how does it look? Docs maybe


A codebase is an orgasm. People working on it get excited when it is starting, and when it's done, they become very tired of it.


Unless they really feel love for their users?

Ugh, too far :-/


Two things: mods, can we get the title updated to reflect the publish year of 2014?

The second is that the codeswarm link in the article is dead. This project on github is the closest thing I can find to any remnant of the project: https://github.com/rictic/code_swarm


Check out gource, which does something similar https://gource.io/ (git history visualization)


Not just codebases: it is useful to think of large, planet-scale systems in organic terms too.

https://blog.vivekhaldar.com/post/6972614229/large-computer-...


Might as well just ask the question: What is Life? And then realize that scale of space or time is irrelevant.

A galaxy could be thinking like a brain, except thoughts travel light years, are mainly dictated by GR (Light and Gravity), and take eons to actually occur.

I think it's clear that the only meaningful nature of scale in time and space for life is just the one we artificially put on it, because of our conscious experience.

LOTR and other fantasy works explore this with the idea of trees having much wisdom, having lived so long.


I once saw a comment similar to yours on HN, and there was a very good reply that critiqued the idea. I'll do my best to replicate something along the same lines.

If we look at the total potential lifetime "clock cycles" of a system, and compare that number to systems that we know are considered intelligent and capable of thinking, it might shed light on the plausibility.

It takes information 100,000 years to cross the galaxy, and the universe is 14 billion years old (the galaxy is younger, but not by enough to matter here), so at most roughly 100,000 round trips could possibly have occurred in this "galactic brain".

Let's use a very conservative metric, such as how long it takes a human to blink in response to stimuli (about 100 ms), for our "human clock cycle". If a human lives for 80 years, that is about 2.5 billion seconds, or 25 billion clock cycles.

This is roughly five orders of magnitude (a few hundred thousand times) more than in the galactic brain. This gives a very loose impression that such a system probably would not have enough information flow or feedback-loop cycles to "think" anything interesting.

Of course, this is a very rough heuristic, but I think it is interesting and useful. The idea of unexpected or strange "thinking systems" is very cool and we should continue to explore it, but there are certainly some hard constraints defining the space of such systems.
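For reference, the same back-of-envelope arithmetic in Python (numbers as above):

    SECONDS_PER_YEAR = 3.15e7

    # Galactic "brain": one corner-to-corner round trip of information per cycle
    crossing_years = 1e5                   # ~100,000 light-years across
    universe_age_years = 14e9              # ~14 billion years
    galactic_cycles = universe_age_years / (2 * crossing_years)
    # ~7e4 round trips (rounded up to ~1e5 above)

    # Human: one ~100 ms blink-reaction per cycle over an 80-year life
    human_seconds = 80 * SECONDS_PER_YEAR  # ~2.5e9 seconds
    human_cycles = human_seconds / 0.1     # ~2.5e10 cycles

    print(human_cycles / galactic_cycles)  # ~3.6e5: roughly five orders of magnitude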


That's a good point. I don't have time to double-check your numbers but I will assume they are correct.

My only rebuttal (if I'm to take an opposing position) would be that different organisms have different clock cycles; an example from organic life would be hummingbird metabolism versus elephant or whale metabolism. I don't know if their brains are also faster, but since their reaction speeds do seem to be connected to their metabolic rate, it's probably the case.

Considering architecture, the way intelligence is being created through Machine Learning is through cellular units that carry large amounts of information digitally instead of oscillating analog signalling. In fact, I found that when building cellular automata, black-and-white automata are elegant in theory, but when it comes to what delivers the most complex/intelligent dynamics on a GPU, it is using each cell/pixel as a complex number, a floating-point value, or a vector of such types. The GPU can do so much with single units, so it makes sense to use its full potential.
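As a toy illustration (my own sketch, nothing from the article): a cellular automaton whose cells are floating-point values updated by a smooth local rule, the sort of thing that maps well onto a GPU:

    import numpy as np

    def step(grid: np.ndarray) -> np.ndarray:
        # mean of the 8 neighbours, with wraparound at the edges
        neighbours = sum(
            np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0)
        ) / 8.0
        # smooth analogue of "alive if the neighbourhood is moderately active"
        return np.clip(4.0 * neighbours * (1.0 - neighbours), 0.0, 1.0)

    grid = np.random.rand(128, 128)
    for _ in range(100):
        grid = step(grid)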

It's possible that the encoding/signalling of galaxies is similar. Clock speed only has to be as fast as things you _can_ react to. Can a galaxy move fast enough to dodge an asteroid? Likely not.

It'd probably be enlightening to know more about how the brains of larger, slower animals such as elephants or whales operate. Their "clock speeds" might shed some light on what kind of parameter ranges life operates within.

EDIT: one more thing I thought of is that it's very possible that the living stage of the larger structures of our universe is simply young, adolescent in nature. Perhaps we are living at a time when the intelligent behemoths among us, or rather, that we are a part of, are of their first generation. I agree though, it's more unlikely to be seeing the head or tail of a distribution than somewhere in the middle. But we should not rule out young, large intelligences.


> My only rebuttal (if I'm to take an opposing position) would be that different organisms have different clock cycles; an example from organic life would be hummingbird metabolism versus elephant or whale metabolism. I don't know if their brains are also faster, but since their reaction speeds do seem to be connected to their metabolic rate, it's probably the case.

That's a fair point. I think the comparison would have a somewhat similar conclusion, perhaps not as dramatic though.

Either way, "clock speed" / "number of clock cycles" is only one way to analyze such a system.

Another important aspect of an intelligent system is the computational complexity of each node in the system and their ability to send "messages" back and forth. For a brain, we could look at the neuron. My limited understanding is that neurons are actually pretty complex from an information-processing standpoint, despite the implied simplicity we see with artificial analogs such as the neural nets used in machine learning.

I can't imagine what sort of naturally occurring entity would be able to function as an analogous "node" of a galactic brain, or even what sort of messages could be used to do complex information processing. Stars don't look like they could fulfill the role. In terms of "information processing" I don't think they really do much.

Anyways, it is all very interesting. It is probably possible to have a much larger and very different "thinking system" than what we find on Earth, although I'm not sure what it would look like. But I am skeptical that such a system could span the enormous distances between stars.

By the way, you should check out the book The Black Cloud [1]! It is a fun science fiction story that explores this idea.

[1] https://en.wikipedia.org/wiki/The_Black_Cloud


> I once saw a comment similar to yours on HN, and there was a very good reply that critiqued the idea

This one? - https://news.ycombinator.com/item?id=20219354


Haha yes! Well done :) How did you find it so easily?


I thought I remembered the general gist of the argument, and a quick HN search for "brain speed light" [0] brought it up as the first result ... so probably pure luck!

[0] https://hn.algolia.com/?query=brain%20speed%20light&sort=byP...


Makes sense, thanks for sharing the link!


That assumes that for a "thought/cycle" to occur, information would have to traverse the system from corner to corner. It is possible that most thoughts can come about in localized areas independently.


But wouldn't that be more of a loosely connected and mostly independent set of localized "brains", as opposed to some sort of "galactic brain"?


Wouldn't this just mean that we happen to be in a young universe?


Sure, it could be possible that unusual forms of thinking systems could form given much longer timespans.

However, there are probably many other factors besides simply the number of "clock cycles" that constrain the circumstances under which such systems can form. For example: the computational and communicative capabilities of each node in the system. I can't imagine any "node" analogous to the human neuron that could form a thinking system spanning large regions of space. For example, stars don't seem to have any complex information processing or messaging capabilities.


This is now one of my favorite comments on Hacker News. Thanks! I've been wondering -- how do you recreate this feeling in an MMO? We can definitely have scale changes by letting people zoom in or out and changing their character avatar size and walk speed accordingly. But changing apparent time is hard. Unless you can "zoom in", program a bot, zoom out so it runs at a faster rate, then zoom back in and see a new civilization it created, etc.


Thanks. I am flattered :-)

That would be an awesome theme for a game. Be a cell, an insect, a tree/ent, a human, a planet, or a galaxy. Change the scale of time and space, watch civilizations rise and fall. It's an epic thought. Would probably be a very hard game to program though.


There's no isomorphism between life and the galaxy. Life requires a mechanism for mutation and replication in order to facilitate natural selection and the lowering of entropy in a local system.

The galaxy overall doesn't exhibit a single one of these properties at a high level. Entropy is rising and there is no mechanism for mutation let alone replication.


There's an isomorphism between life and stable structures. Most stable structures seem to be plentiful, and it isn't obvious whether it is because of nucleation/replication or because of independent emergence. There are plenty of galactic objects that beget other galactic objects in a self-propagating chain. Just look at what stars produce, and what stars form out of. Similar mechanisms apply to galaxy clusters.

Are you familiar with prions? Well, their form of reproduction _is_ nucleation.


Framing failing fast (or not) as a conflict of interest between computer and codebase is an interesting take. I’ve often seen it as a conflict of interest between the user and the developer: users want the code to muddle through (“PHP style”), but devs want highly visible, obvious failures early so they are easily identified and fixed.

How do people balance this? The “state of the art” still seems to be things like “have asserts turned on in the debug build”, but a holistic approach where the application has different runtime contracts around failures based on who is using the app seems underutilized. I’ve done it piecemeal with application feature flags, but has anything like this been done at the platform/language/framework level?
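A piecemeal sketch of the idea (the environment variable and helper name here are invented): a single knob decides whether a violated invariant fails fast for developers or gets logged and muddled past for end users.

    import logging
    import os

    # hypothetical flag; in practice this might come from a feature-flag system
    STRICT_FAILURES = os.environ.get("APP_STRICT_FAILURES", "0") == "1"

    def ensure(condition: bool, message: str) -> None:
        if condition:
            return
        if STRICT_FAILURES:
            # developer/debug contract: fail fast and visibly
            raise AssertionError(message)
        # end-user contract: record the problem and muddle through
        logging.error("invariant violated: %s", message)

    # usage
    items = []
    ensure(len(items) > 0, "expected at least one item")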


A codebase is a mechanical approximation of a set of human intentions. It's a model of an organic phenomenon.


A codebase that manages a data store is not a model; it is a phenomenon.


> In this way, building software isn't at all like assembling a car. In terms of managing growth, it's more like raising a child or tending a garden.

I knew it! I always thought maintaining my project was like doing a bonsai tree!

(Disclaimer: I have zero experience with bonsai.)


Bonsai stay small as they age. Sadly, software programs do not.


because the gardener trims them both above and below ground.


Not even just the source code. Our system, even with the source code static, has constantly changing characteristics.

We integrate with dozens of 3rd party APIs, making over 10M external HTTP calls per day. That leads to a lot of variability in runtime characteristics.


This is true for us too. The behavior of ours is due to how airline fare rules are filed, which creates all sorts of silly things.



A production system is like the TARDIS.

You don't program the system, you negotiate with it.



