It's Not Sabotage, They're Drowning (shermanonsoftware.com)
144 points by jeffrey-sherman on Nov 15, 2019 | 101 comments



I can perhaps understand where this one is coming from:

I refactored the code from untested and untestable, to testable with 40% test coverage. The senior architect is refusing to merge because the test coverage is too low.

You have a piece of software running in production; even if it's poorly written and bereft of tests, your customers are satisfied. Customers (unwittingly) have already vetted this version! A rewrite, even with brilliant design and fantastic test coverage, is still more of a risk than fixing what isn't broken.

This isn't to say that you shouldn't rewrite code, especially if it makes other improvements possible. But don't discount how high a bar new code has to reach to measure up with "it already works". Maybe 40% is still too risky.


  a rewrite... is still more of a risk than fixing what isn't broken.
Defining "broken" is the problem. A long-extant program may be "broken" even while it meets customer expectations, and "fixing" it could be viewed as a regression by the customer.

I had a case early in my career where I was finishing up a rewrite of a report writer program. The old version didn't really have test data, so "testing" mainly meant comparing outputs of the legacy system vs. the rewrite. After I had sussed out all the errors, there were still some differences in which the legacy output just didn't make sense. I met with the customer liaison, and he checked with his users.

It turned out that nobody had bothered even to glance at those specific reports in as long as anyone could remember.


So... did you correct the errors and release your updated (useless) report? Or did you slog through 7 levels of management approval to get permission to get rid of it?


No, I just had the customer rep send a note approving the new system with the unwanted reports removed. (This was a mainframe shop, where excess CPU, I/O, and printing had meaningful costs.)


Just because it works doesn’t mean it isn’t broken.


And just because a developer thinks it's broken doesn't mean it doesn't work exactly the way the customer likes it.


Software is done when nobody is willing to work on it anymore.

There are some things you do for employee retention in every industry.

Typically the bits we are talking about are the bits the customer doesn't know about. Or they're caught in a cognitive dissonance loop where they want two things, like features and stability, but they won't agree that the two affect each other. So they ask for one until problems arise, and then they pretend we should have been honoring the other the whole time. Even when they told us it wasn't a priority.

It's basic maintenance philosophy. It's not a priority to get your car to the shop. But it is a priority that it work reliably for your Thanksgiving trip in two weeks. Many things worth doing are inconvenient. As are many critical things.


You don't want to test code. Coverage is mostly useless.

You want to test functionalities: set up end-to-end tests. It will be hard, it will take time, and it is often a solo endeavor for the first few months. But when tested functionalities stop breaking for no reason even when you refactor a lot of code, and when "the project for which a dev put in tests" starts being "the project we are confident to push to prod without breaking things," you should get some traction.

And once you think all your functionalities are tested you can measure code coverage: what is not covered should be dead code you can remove.
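To make that concrete, here is a minimal sketch of the kind of end-to-end test I mean, assuming pytest plus requests against a service running in a test environment; the URL and endpoints are made up for illustration:

    import requests

    BASE_URL = "http://localhost:8080"  # hypothetical test deployment of the real stack

    def test_placing_an_order_makes_it_retrievable():
        # exercise the real functionality end to end: API, business logic, storage
        created = requests.post(f"{BASE_URL}/orders", json={"sku": "ABC-123", "qty": 2})
        assert created.status_code == 201

        order_id = created.json()["id"]
        fetched = requests.get(f"{BASE_URL}/orders/{order_id}")
        assert fetched.status_code == 200
        assert fetched.json()["qty"] == 2
The test only knows about user-visible behavior, so it keeps passing while you refactor everything underneath it.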


Pointing people only at end-to-end testing seems like poor advice. There is value in a mix of all sorts of testing. I agree that testing functionality blows counting coverage line by line out of the water. But since a lot of code that implements algorithms, string manipulation, etc. can be tested just fine with unit tests, and since end-to-end tests are expensive, more brittle, and require a lot more dependencies, it is better to have a healthy mix of different types of tests.


The comment was specifically about the use case of getting a large refactor merged.

If you are simply adding code or making simple changes, then unit tests can have advantages. It is when you are making a large refactor that you need end-to-end testing to verify that nothing was broken.


I can't get behind this.

End to end tests are indeed important.

But let's say you have an end to end test that covers A, B, C, D, E, and F.

One day this test stops working. Well, what's the culprit? A? B? C? D? E? F? If you had more granular tests in addition to the end-to-end tests, you'd know.

Now, I agree there's a possible downside here. Maybe there comes a time when C and D are no longer being used. It might not be obvious and C and D could linger in the code base, with corresponding dead weight in the test suite. This is not ideal. However, I would rather have this than a tough-to-debug test failure.


> Well, what's the culprit? A? B? C? D? E? F?

I honestly don't care. I know this merge is not touching production. I could have the A, B, C, D, E, and F test suites all working individually; if they break when put together, they are all bad.

The developer has tests, so they know exactly how to reproduce the bug. Time for them to earn their salary.


I see this over and over in organizations I work in, and I even have to fight it in my own behavior.

If you run end-to-end test automation and it fails, and you are then afraid or otherwise hesitant to do the process manually, you are out of touch with the system and probably missing the forest for the trees when you work on subsystems.

While I would like to have better end-to-end tests, I know that I will then have to fight harder against the problem of getting too far away from how the system actually works.


> Well, what's the culprit? A? B? C? D? E? F?

Probably, the one you just changed. How often is it really going to be that hard to debug?


I only changed L, which is nominally unrelated to all of those. You thought this was a clean, well-modularized codebase?


    You thought this was a clean, well-modularized codebase?
Well, even if it's nice and clean A, B, C, D, E are each going to have their own dependencies!


If you change B and the A/B/C/D/E integration test breaks.... duh, it's obvious. Yes, that's easy.

But A, B, C, D, E are each going to have their own dependencies.

You could change X, Y, and Z and now the A/B/C/D/E integration test breaks. Which one was it?

Or maybe you're super cautious and you only change Z. And now the A/B/C/D/E integration test breaks. Obviously it's Z. But how? Why? It's a four-developer team and you're only really familiar with R, S, T, U, V, W, X, Y, and Z.

There's also the fact that unit tests serve as living, de facto documentation on how to actually use the classes. You can both use and iterate upon Z a lot faster and more confidently if it has its own tests.
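As a toy illustration (using a stdlib class, since I obviously can't show their Z), a unit test doubles as a worked example of the API:

    from fractions import Fraction

    def test_fraction_normalizes_and_adds_exactly():
        # the test shows a reader how to construct and combine the type
        assert Fraction(2, 4) == Fraction(1, 2)
        assert Fraction(1, 3) + Fraction(1, 6) == Fraction(1, 2)
Nobody has to go spelunking through the implementation to learn how the class is meant to be used.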


The overuse of overly granular services can make this pretty awful unless you've put event tracing in place.

I mean, the obvious answer is don't do that, but this is the world people are embracing.


Coverage is very useful in determining whether your tests actually do what you think they're doing. It's not very useful as a metric to be optimized. Writing good end-to-end tests that really cover the functionality as specified is very difficult, doing so blindly without tool support is doomed to fail.

An even better tool than measuring coverage is setting up a mutation testing framework and adding tests until all mutations are caught.
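The idea in miniature, as a hand-rolled sketch rather than any particular framework:

    def price_after_discount(price, discount):
        return price - discount            # original code

    def price_after_discount_mutant(price, discount):
        return price + discount            # mutation: '-' flipped to '+'

    def weak_test():
        # passes for both versions, so the mutant "survives" and the coverage was an illusion
        assert price_after_discount(10, 0) == 10

    def strong_test():
        # fails when run against the mutant, so this behavior is genuinely tested
        assert price_after_discount(10, 3) == 7
A mutation testing tool automates exactly that: it generates the mutants for you and reports the ones your suite never kills.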


I agree, and I think there is an arrogance among those who hold the view that 'coverage testing is useless' which blinds them to the real purpose of code coverage testing: to tell you what you haven't tested yet.

Coverage isn't supposed to tell you anything about the quality of the tests beyond what code simply has never been tested yet.

And yet, people seem to think "code coverage testing" means something else entirely. The only purpose for paying attention to coverage metrics, as a manager, is to understand how much more work there is to be done on the tests - or, to determine code paths which never get executed and are therefore dead weight on the project.


> the real purpose of code coverage testing: to tell you what you haven't tested yet.

I can get 100% code coverage with tests passing on:

    // Returns the result of A*B
    int mult(int A, int B) {
      return 4;
    }
I won't know the functionality is not tested. Code coverage will just be an illusion of security.


Code coverage is not all you should be doing.


Tests need to cover small units one at a time. I've been bitten too many times by tests that were too complex and attempted to test too many things at once. They invariably ended up doing nothing of value. Test setup becomes overly complex and prone to oversights. Writing another test for a variation of a problem that takes a slightly different code path becomes a chore. It also gets hard to spot that odd untested corner case hidden deep inside some complex logic.

The proper way to think about this is to have unit tests that cover the nooks and crannies of the business logic and integration tests that make sure that interfaces are correct.
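A minimal sketch of how I'd split it (names invented for illustration):

    def apply_discount(cents, percent):
        # business logic with a corner case: never discount a price below one cent
        return max(1, cents * (100 - percent) // 100)

    def checkout(cents, gateway):
        # glue code: all it should do is hand the amount to the payment gateway
        gateway.charge(cents)

    def test_discount_never_rounds_to_zero():
        # unit test for a nook in the business logic
        assert apply_discount(1, 99) == 1

    class FakeGateway:
        def __init__(self):
            self.charged = 0
        def charge(self, cents):
            self.charged += cents

    def test_checkout_charges_the_gateway():
        # integration-style test: verifies the interface between the pieces
        gateway = FakeGateway()
        checkout(100, gateway)
        assert gateway.charged == 100
The unit test digs into the corner case; the integration test only cares that the seam holds.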


My thoughts exactly. I'm not merging a bunch of untested code that replaces a bunch of working code, no matter how bad the old code was!

That seems like a recipe for a production nightmare.

The fact that this situation even arose in the first place would be bad news.

1. Did the other dev just refactor the code on a Red Bull bender one night on his own time? Hopefully this refactor work is something we would have discussed ahead of time. Hopefully I wouldn't have to even tell them that the expected coverage for a big refactor like this would be darn close to 100%.

2. I'd phrase this rejection in a positive way. I'd be excited about the refactor and testability, and I'd let the other dev know that I appreciated the work and was looking forward to getting it to 100%, tested, and merged.

(I realize the example in the story was purely hypothetical and this didn't actually happen. :P)


It can be defended when the system is in a rather passive form of maintenance, e.g. a small bugfix every month. If it is more than that, it becomes way too easy to fix one small thing while breaking three, especially if the functionality is at all complex.

What it does sound like, though, is that everything was refactored/covered in tests in one fell swoop. I find that questionable. I think it is much better to start out small in an untested code base. Just cover the small piece you are working on now in tests and gradually extend.


Some code is so mangled together that trying to start small is more of an investment than overhauling the whole thing altogether.


The "running in production" bit is always a lie. I've personally seen untested code allegedly running that isn't actually working: defect numbers through the roof, and the test harness literally not being merged precisely because of this issue.

You will get visibility eventually, whether through customer complaints, slowing development speed, or ultimately projects being shut down. It's not sabotage and it's not exactly drowning. It's a willful attempt to ignore reality and cheat on metrics, prolonging the inevitable and defrauding stakeholders.


It's always a lie? You're basically arguing that code with bad test coverage never works.

Production code is getting constant manual end-to-end tests. Those tests are more valuable than most unit testing, and than some automated black-box testing. I'll take a stable production system over the unit-tested new code any day of the week.


Here's an important piece of real-world statistics for you, which I've read about in research papers and personally verified in large-scale systems:

For every 1 complaint or help desk call you receive, there are between 100 and 1,000 users that experience the same issue but didn't bother to complain or call.

Production testing is literally 0.1% to 1% as effective as actual testing.


  between 100 and 1,000 users that experience the same issue but didn't bother to complain or call
Which is often because users have no (established, straightforward) process for reporting, or recognizing, a defect.

I found an ugly 2FA bug in a major ride-sharing app's onboarding and was shocked to find that they have no informal mechanism for reporting any defect. (I thought it important enough that I went through the hassle of reporting it through HackerOne, not seeking compensation but simply attention to the problem. It was closed within minutes without a glance or meaningful comment.[0])

I'm not shocked anymore, as this seems to be increasingly common.

Another growing trend is meaningless and even juvenile error messages. "Oops!" (with no context that could aid diagnosis). "Something went wrong." (ditto, and I see this out of the leading ticket site 300 times a year). Etc.

[0] it's possible that it was closed because that particular onboarding was later obsoleted, but still, that's just a bad look.


Depending on your user numbers, this works out pretty well. If you've got 100k users, anything that hits at least 1% of them affects a thousand people, so even at the pessimistic reporting rates above you'd expect at least one report.

In my experience, you can write much simpler code that only needs to work in production than code that also needs to be unit testable. Adding hooks and separating bits and pieces so they can be tested can often make the code much less concise and harder to follow.

I would rather spend engineering time making sure problem reports from users get read, escalated, and addressed than on testing things that are better left untested. Unit tests, and even integration tests, are unlikely to let you know when the feature works as designed but not as users expect.


My favorite example of "things that are better left untested" so far is a method that returned a hard-coded array and a test which checked whether the method really did return that hard-coded array. So, if you wanted to add something to that array, you now had to add it in two places. Sure, that's an easy way to increase test coverage, but maybe you should focus on areas where unit testing makes more sense first?
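Something like this, reconstructed from memory rather than the actual code:

    SUPPORTED_LOCALES = ["en", "de", "fr"]

    def get_supported_locales():
        return SUPPORTED_LOCALES

    def test_get_supported_locales():
        # restates the hard-coded list, so any real change has to be made twice
        assert get_supported_locales() == ["en", "de", "fr"]
Great for the coverage number, useless for catching actual regressions.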


> Unit tests, and even integration tests are unlikely to let you know when the feature works as designed, but not as users expect.

This is really missing the point of automated tests. Unit tests are for developers and ensure that low-level functionality doesn't break when changes happen. Integration tests verify that components fit together and that it stays that way. End-to-end-tests ensure that user functionality doesn't break.

Together, the above turn refactoring and rewriting of components from a business risk into an activity that can happen alongside adding new features. This can ensure that there is never such a thing as legacy code.


Legacy code is just code that has proven itself to be useful to the business. If you're telling me spending all day fiddling with testing is going to mean I have nothing to show for it, I guess I agree.

I think there is room for testing, but you need to really restrict the scope to those parts of the code where the requirements don't change often and where the natural interface is amenable to testing, and often you need someone whose job is testing, and who didn't write the code, to actually do it. Incidentally, those parts of the code where the requirements don't change often get to a point where the code doesn't change either, which should mean the tests won't fail, because the inputs stayed the same.


I prefer to fix bugs before they hit production. It is almost impossible to do that without tests. Also, testable code is usually much more readable than untestable god objects, tightly coupled modules, and spaghetti code.


Code without tests doesn't mean bad architecture.

I can go the other way and say that code that ends up being 'modified' to be testable, with 43 mocks just to be able to set up a class, is as much crap as code without tests.

I would say I never achieved more than 20% or so of test coverage in my products and production code. It doesn't mean it is a mess of tightly coupled modules in any way.

But TDD and other dogmas in development are like this: solution A (maybe) prevents a problem, thus if you aren't using it, all other code supposedly has that problem, and its defenders will just keep insisting that it does.


If you need 43 mocks to unit test a class, your TDD has already found a bad smell. Why are there so many classes tied to a single component?
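Scaled down from 43, the shape of the smell looks something like this (hypothetical class, but you know the pattern):

    from unittest.mock import Mock

    class OrderService:
        # stand-in for the real thing, which wants a small army of collaborators
        def __init__(self, db, cache, mailer, audit_log, metrics, payment_gateway):
            self.db = db

        def place_order(self, order):
            self.db.save(order)

    def test_place_order_saves_to_the_db():
        service = OrderService(db=Mock(), cache=Mock(), mailer=Mock(),
                               audit_log=Mock(), metrics=Mock(),
                               payment_gateway=Mock())
        service.place_order({"sku": "ABC-123"})
        service.db.save.assert_called_once()
When the setup dwarfs the assertion, the test is telling you more about the class's coupling than about its correctness.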


Video games frequently succeed despite colossal bugginess. Pokémon Go’s launch is a great example. Engineers really undermine their authority when they talk about testing and quality in abstract and isolation to the user’s experience.


That's the kind of thinking that leads to situations like this:

https://www.digitaltrends.com/mobile/samsung-galaxy-s10-ultr...

A lot of the time it's just as important to make sure certain things can't work as making sure that certain other things will.


> It's always a lie? You're basically arguing that code with bad test coverage never works.

It may work, or it may not work. Without the tests you won’t ever actually know.

Maybe your customers are bouncing off bugs, and while you are earning enough, you could have been earning 3 times as much without those bounces.


And then you have tested code with useless unit tests that cheat on metrics, and defects and customer complaints are sky-high too, because nobody is listening to customers either way. I regularly run untested code (when appropriate; it's sometimes a trade-off for a simpler design with fewer abstractions) with happy customers.


We may have diluted the meaning of refactor if we are arguing about pull requests.

Refactoring is an activity of small steps of a particular kind that are generally sound transformations. The biggest risk is transcription errors.

Of course, one of the downsides of PRs is that they muddle a sequence of individual commits and only show the cumulative result.

When dealing with individuals with these sorts of fears, you have to be patient and strategic. Pick the most powerful refactor out of a set of many useful ones and do them one at a time until you achieve the result. Essentially you have to boil the frog.


>Refactoring is an activity of small steps of a particular kind that are generally sound transformations. The biggest risk is transcription errors.

I think that is far from an established sole meaning of the term.


And then I will repeat

> We may have diluted the meaning of refactor

And I will quote the summary for Refactoring by Martin Fowler:

> Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which "too small to be worth doing"...

If words don't mean anything it's really hard to have conversations and it's insanity to try to create strategy around them.


40% code coverage is meaningless

It could be 40% weak unit tests that just walk through the code but don't test much

I would be happier if it were 40% coverage running integration tests with actual data and use cases


That security hole that is leaking customer data hasn't been noticed for the 9 months since it was implemented. Except by that one guy who has captured it all. Customers have only vetted what they see.


Nobody’s better at letting better be the enemy of good than software engineers. For some reason our culture likes to argue and nitpick over the most mundane stuff, usually missing the forest for the trees.

I’ve had jobs where the manpower spent just arguing outweighed the cost of the change significantly. One dickhead manager I had spent weeks arguing over the acquisition of a $500 SSD. It made me want to slap $500 down on the table just so he’d shut the fuck up already.

The only way I know to counteract it is by saying “don’t let better be the enemy of good”.


I had a manager argue over the cost of a training class once. They'd moved me to a new team with unfamiliar tech and I wanted to bone up on some concepts before getting started.

It was an $11 online class on Udemy.

(Well, technically it was a "$199" class, conveniently discounted to $11. You know how Udemy rolls.)

Generally, I'd say nobody's that petty. A manager would typically only be that petty if trying to get somebody to leave. However... this guy was not very good at his job.

I don't think he was literally upset over $11, and clearly our relationship was on the rocks at that point, but I'm not sure what his exact problem was. May have been me on a personal level. May have been the fact that I wanted to spend a working day uh, learning to do my job better. Perhaps he didn't feel anything on Udemy could be worthwhile. Perhaps it was just his fragile yet rampant ego. Could have been any of those things or all of those things. Nonetheless, it's hilarious that an $11 class was where he attempted to draw some sort of line.


I had to explain to my supervisor's boss' boss (Division VP of a $800M USD/year company) for an hour on why I needed a $45 piece of software to read/understand binary files.


A few points:

1. Computing, and engineering in general, is a discipline where one nitpicky detail can make all the difference: one bit flip, one keystone, one badly selected material. However, the conduct of the negotiation could always be improved; most of the time, though, it's a general listening problem.

2. From the dickhead's point of view, everything can be solved by spending more money on hardware. Each unexpected spend like this risks opening the floodgates and losing both financial and engineering discipline by setting a bad precedent. Not sure you want easy, offhand decisions there either.


It makes no sense to cheap out on hardware though. It's such a tiny cost compared to the overall cost of the employee, you are wasting money if inadequate hardware is making your employees less productive.


In one sense, it doesn't make sense. The productivity boost would likely be worth it alone.

From another perspective, this is a +$500 item in the cost column of that unit (and perhaps n*$500 if the whole unit then comes asking for the same thing). The cost of the time spent arguing is baked into the cost of salary, so it doesn't show, since it's the same regardless. It's a form of optimization, but the thing being optimized is not productivity.

A friend of mine was working as a recruiter where the whole company was sitting all day long at their laptops. He was denied buying an external screen to be more efficient since he then could have more information on the screen (eg an application next to a browser), "because then everybody wants one", even buying it himself.

edit: nb I'm not arguing for not buying the ssd/screen.


    He was denied buying an external screen to be more
    efficient since he then could have more information on the
    screen (eg an application next to a browser), "because 
    then everybody wants one", even buying it himself.
My wife was denied permission to bring her own secondary monitor from home for that reason. Unreal.

I also faced pushback at one job for bringing my own monitor from home. Same reason given. I'm an entitled software developer though, so I basically disregarded my manager on that one and he let it go.


We set up a monitor for my wife at her large corporate job without asking permission. While the rest of her office had tiny 1280x1024 screens from 10 years ago, she had a 1920x1200 24" monitor in addition to the corporate shit monitor.

Sometimes people would ask her about it, she would say she bought it herself, and they would say "oh" and move on about their day. It was a complete non-issue.

You could argue people shouldn't be plugging in random components at work and whatnot, but at the end of the day, IT is really powerless to do much about it. Nobody's going to fire you for buying your own productivity enhancers, the situation is just too embarrassing for management if they're reasonable people.

And really, Excel is so dependent on screen real estate that a larger display is a quality-of-life improvement for her that only costs $120.


I completely agree, but there really are managers petty or shortsighted enough to block this stuff.

I think one mistake my wife made, though, was mentioning it to her boss before she did it.

What she should have done was done it. Then her boss would be more inclined to let her keep it.


> From another perspective, this is a +$500 item in the cost column of that unit (and perhaps n*$500 if the whole unit then comes asking for the same thing). The cost of the time spent arguing is baked into the cost of salary, so it doesn't show, since it's the same regardless. It's a form of optimization, but the thing being optimized is not productivity.

I'm not seeing the point here. What is being optimised for? If there's an easily-worthwhile productivity boost, what else is there to say?

If the whole department would see a similar increase in productivity from scaling-up the expenditure, isn't it the job of management to ensure it happens?

> He was denied buying an external screen to be more efficient since he then could have more information on the screen (eg an application next to a browser), "because then everybody wants one", even buying it himself.

This is just plain stupidity, isn't it? Is there something I've missed here?


You are not missing the point, it sure is plain stupidity I'm highlighting :) If it were me I would buy the screens in this case. I just mean that some people may optimize for keeping an entirely separate "other expenses" column as low as possible (for whatever reason), even if it hurts overall productivity. Seen it happen.


That's just a long way of saying myopic, no?


Spending a few hours of time on the topic literally costs more than the SSD itself. So what if you spend a few k more per employee? It is basically a rounding error in overall comp. This is also why I don't understand companies trying to save money on perks...


But we’d be paying $50 per month for free coffee! The 5% productivity boost is just not worth it.

It’s like they deliberately avoid facing facts.


We once spent an entire summer trying to add functionality to an app where some of the customers were already running at max capacity on their self-hosted hardware.

I was trying to explain to my boss how stupid it was for us to spend $100k to avoid a handful of customers having to buy $3k in hardware apiece. It would have been cheaper for us to gift them a box. The opportunity costs of that boondoggle ended up being a huge negative for that team.


I think it's kinda hilarious that the article talks about a psychological effect that hurts software teams, and HN gets into a full-on debate about software testing strategies ;)

I've seen this stuff before. Developers making obviously-bad decisions is usually a symptom of a dysfunctional work culture and a code base that's both vital and fragile. It's not fun to work in such places, but it is possible to fix them.


The main issue is for management to accept that velocity will be significantly reduced and that many issues will pop up along the way and require a substantial QA effort to keep the quality from dipping too low.

It's still the best way forward as a complete rewrite is rarely a good idea.


yeah, I had to explain to a CEO that we weren't going to be shipping any new features for a few months while we upgraded core systems and eliminated tech debt. It was a tough conversation. But about 6 months later we were able to start developing new features again, and everything was much quicker and smoother. The lead dev had regained his composure, too, and wasn't being actively obstructive to everyone who tried to do anything to "his" code base.


So many years ago I was involved in an issue with a BTS (a cellular radio base station) for a telecoms equipment manufacturer. The software was written in C and had a message-passing architecture between processes, with fixed-length queues that discarded messages when they overloaded.

There were various problems (the subject of a number of war-stories, which I won't repeat now) - but basically anything that wasn't using a properly designed state machine could die horribly when messages started getting discarded.

So we did some work on the fundamental design of the queue handling mechanism. We came up with a software modification that would improve the efficiency of the queue handling and stop the queues overloading (something we saw happening occasionally at peak loads on the biggest BTSs).

We could measure various aspects of the queue handling performance, and use queuing theory to prove that the performance would be better at very high loads.

We had a test rig and we could test the relative performance of the new software vs the old software at normal loads.

What we didn't have, was a test rig which could run at the highest loads, so we couldn't test the behaviour of the new algorithms at very high loads.

So the improvements were binned. The existing software was known to fail (badly) at extreme loads. The new software could be tested at reasonable loads, and mathematically be shown to improve matters at very high loads, but because it couldn't be tested, the process said that it must not be deployed!


How was the original version deployed? The definition of ‘very high load’ was different at the time.


I've seen this with people I've worked with who came from those environments. Usually the reasoning is that knowing how the current implementation fails is better than not knowing how the new one will.


A possible way to sell instrumentation / metering is to build your own scaffolding, find a problem using it, and sell both the scaffolding and the solution it's provided.

As the article notes, metering by itself simply illuminates problems. For the drowning team (or individual), that simply delivers bad news.

If you can show that the effort actually delivers results, you've got a way to sell the instrumentation as well.

In more pathological cases, you don't reveal the scaffolding, but start cranking out code fixes based on it. You may have the opportunity to sell the metering later, when you're asked how you're finding and fixing the problems.

Rescuing projects from antipatterns is painful at best. Enlightenment is a painful process. It peels away comforting lies to reveal unpleasant truths.


When I was trained in lifesaving, one strategy for panicked victims, who might drown the rescuer, was just to give them some space to drown enough to lose consciousness. Then they were much easier to rescue.


It doesn't usually apply to drowning victims, but more broadly it's applied when someone is choking in general.

If someone is refusing help, you wait until they pass out, and it becomes implied consent. I assume in this example, that would be being fired or otherwise being told by your supervisor to take the help.


Consent isn't necessarily the issue with drowning individuals, since the instinctive drowning response can make it dangerous to save them even if they're not about to tell you 'let me drown'. Maybe it's actually a good comparison for software teams that refuse to make incremental improvements out of fear or panic... it's not that they don't want things to be better or that they're refusing help, necessarily; they're just terrified of the consequences if the improvements go wrong.


>I refactored the code from untested and untestable, to testable with 40% test coverage.

This tells me you think "coverage" is a good metric, which also tells me you're not covering a whole lot of cases (edge or not) where the "untested code" functioned as it should.

You're asking to replace something which has potentially some known issues with something that has a potentially unlimited number of unknown issues.

And the more experienced person at your job refused? What a tyrant!


I strongly disagree.

The refactoring in question was probably not a huge task, since he is reporting it done. Larger tasks are most often planned and monitored.

Splitting some class and decoupling some resources, plus adding the testing framework and implementing a first batch of tests. Of course it might have introduced a few regressions.

But

> has potentially some known issues

Read that again. Exactly: no one knows how many bugs there are, only how many have been found so far. Bug tracking is often lacking: behind schedule, won't-fix, etc.

On the contrary, modern code analysis tools are frighteningly good at spotting untested conditions everywhere (SonarQube comes to mind). Much better and much more reliable than a human developer or a human tester. SonarQube does not care whether the case cannot happen because the user interface does not allow the input. The bug is there, and the bug is spotted.

The more experienced person is plain wrong: more tests are better. It's not a panacea, but it's better.


    int untested_sum(int a, int b) { return a + b; }

    int tested_sum(int a, int b) { return a * b; }

    void test_providing_total_coverage() { assertEqual(tested_sum(2, 2), 4, "Expected sum to be 4!"); }


Which is why I'm a big fan of testing tools like hypothesis:

    from hypothesis import given, assume
    from hypothesis.strategies import floats
    from math import isnan, isfinite
    
    
    def tested_sum(a, b):
        return a * b
    
    
    @given(a=floats(), b=floats())
    def test_sum(a, b):
        assume(not isnan(a))
        assume(not isnan(b))
        assume(isfinite(a) or isfinite(b))
        assert tested_sum(a, b) == a + b
Running that test gives

    Falsifying example: test_sum(a=0.0, b=1.0)
Note how it also forces me to put a bunch of assumptions in there explicitly; hypothesis is very good at finding weird edge cases. For this example I don't really care that it goes wrong when NaN gets involved, but if I did, I'd discover through testing that that case even exists.


Reminds me of this scene from Frank Herbert's Dune:

“Because of an observation made by my father at the time. He said the drowning man who climbs on your shoulders to save himself is understandable– except when you see it happen in the drawing room.” Paul hesitated just long enough for the banker to see the point coming, then: “And, I should add, except when you see it at the dinner table.”


> I refactored the code from untested and untestable, to testable with 40% test coverage. The senior architect is refusing to merge because the test coverage is too low.

What's likely even more frustrating is that other merges went in for years with no tests or testing in place at all.


Seems like unit tests are only really valuable for library code. Application/glue code... not so much, especially if it is something being actively developed and you are constantly throwing away tons of unit tests.


Totally untrue. Unit tests in the strict sense, perhaps, but nobody writes true unit tests; there's a high chance these are actually short-range functional or integration tests. Such tests are of high value.


Maybe it just depends on the context. I write embedded linux applications. One application I wrote talks to a Qualcomm modem using their QMI framework and then passes on messages to an antenna subsystem.

I wrote some library functions, which were easily unit testable, then I wrote the glue code, which I found wasn't very testable at all. I found myself having to stub out the entire QMI framework/modem and the entire antenna subsystem API, since obviously I don't have access to those things at build time. So basically now my application/glue code is being tested in a fake vacuum that may not even reflect the real behaviors of the modem or antenna. So I waste tons of time trying to predict how the modem and antenna might behave, and test my application code accordingly.

Lastly, I end up scrapping a bunch of the tests because the actual behavior of the modem/antenna is different than I predicted and changes slightly with each update.


The weakest link in your software is always you. Automation that can help compensate for human failings always has value. I can't count the number of times writing tests has led me to catch simple mistakes in code that I'd stared at for hours and declared good.

Not to even get into the ways that tests can help guide sensible architecture, especially for junior devs, because "this is hard to test" is the single most straightforward code smell for "this probably needs to be simplified or broken up".


That's the point of that example...

> untested and untestable


The way to establish change is to do it gradually, with buy-in from all parties prior to introducing the change, and to make small steps in the beginning. If you showed up with a huge chunk of code that has been refactored, I'd probably not merge your change either. The test cases have been written by the same person that did the refactoring; there is a very good chance that any wrong assumptions you made about the code will be present in both.


This allergy to reality is surely the biggest trap not just for startups but people in general. It's surprising but I feel like the norm is for people to care far more about feeling that everything's okay, rather than everything actually being okay.

That said, one problem I see with the anecdotes provided is that the problem solver is doing two things: discovering a problem and solving it. Conflating these two things is a great way to encourage pushback: every change comes with an associated risk, so the general rule is to say no (using whatever flimsy reason) if it's not well understood and agreed that the problem being addressed is a real problem worth solving.


> ... the problem solver is doing two things: discovering a problem and solving it. Conflating these two things is a great way to encourage pushback

This squares with my experience strongly. At a prior job, I remember two or three times when a set of changes was almost rejected for production despite solving real bugs. The bugs hadn't affected anything in production yet, but the fixes carried the risk of maybe breaking something.

This resulted in some fun times, like a particular client running something like three major versions behind on our software, precisely because management didn't want to accept any disruption risks there, and the old version grumbled along well enough.


> When you add visibility to a system, the numbers are always bad. That’s why you’re putting in the effort to add visibility to an existing system.

This is something everyone needs to internalize. Don't be upset if the numbers are bad when you first instrument. That's WHY you instrumented. Instrumentation is a ratchet to help you improve your product.


I think saying 'they're drowning' is still giving them too much credit. People who built terrible broken systems don't know they are bad at building things, and they probably don't have a clear idea of what would be better or worse to begin with. They don't know it's better to have a system that can be debugged, they don't even know what the difference is. If you give them examples of good designs and bad designs, they can't tell you which is which. When they say 'this is good' or 'this is bad' they will astound you with the reasoning.

Software is an industry with no standards for employment. All it takes to be a Software Engineer is for an incompetent manager to hire you. More than half the people I interview barely have a basic familiarity with writing software. These are people who have been employed as software engineers for years and will certainly find another job. They struggle to write a nested for loop. They write functions with thousands of lines. They just don't know what they are doing.

The author here claims they don't want the visibility of the situation to increase, but I think they don't know it's a problem to begin with. Their stupid excuses for not moving forward are just their stupid thoughts said out loud.


My team has built garbage software at times. It isn't because we don't know how to build good software, and it isn't because we don't care, and it isn't because we don't want to build good software.

It is usually because of a continuous sequence of emergencies to which we are forced to react.

An emergency occurs, and we have requirements to do something in 1/3 the necessary time. We get 80% of the way through that and, before launch, there's a new emergency, and we have to rush to that.

So we end up with crap software that looks like shit and only almost works.

But it almost works enough to keep our very large business humming along. Sure, we hate it, and we're making more technical debt, and over time, we get less effective.

But sometimes we go through cycles of a year or two when the business either can't afford to do it right or isn't willing to pay the price to do it right.

Sometimes, they really are willing to accept paying 10x as much so long as they can pay in a year or two.

(I hate it, but it is true)


Make it 100x as much with 5 times as many people in 10 times as long. Efficiency is never important when there is money to burn and it's not yours, and you won't be blamed when it breaks.

It does not even take any emergency - often it is conscious or subconscious project management strategy.

It's related to observations about Peter principle, Dilbert principle, Gervais principle and office politics and dynamics.

Avoiding this tendency is essentially only possible with small teams with full ownership of the problem they're working on and direct stakes.


    People who built terrible broken systems don't know they
    are bad at building things, and they probably don't have a
    clear idea of what would be better or worse to begin with.
Sometimes this is true, sometimes not.

The early days of a company (or product) can be full of existential threats and rapidly changing requirements. Resources are always limited and dev teams are always stretched beyond their limits. The crappy code you see may be there because it was a literal choice between that duct-tape solution you're looking at, and missing out on some major deal or perhaps even going out of business.

Now, yes, the people who built such a rickety system can become obstacles to improvement. But, they are not necessarily incompetent. They may have been forced into making the least-bad decision out of several terrible options, over and over again.

Sometimes they're just pretty bad developers though.


So being the person that’s manning the lifeboat, what do you do? Kick them down?

If you do nothing everyone will surely drown.


Could it be a competence barrier to protect incompetence? I have seen so many similar situations, normally with middle managers who are terrified they will get caught out for not knowing what they are doing. I noticed they always stick with brands like Microsoft or SAP and are deeply suspicious of open source. Oh, and they don't drown, they get promoted. Could this be an opportunity for a startup to identify this risk "as a service"?


Between sabotage and people wanting to sink the lifeboat, I have a third option: These people have a warped perception of what's happening and priorities around what's happening. Warped by their own irrational beliefs which have come about as a result of cognitive dissonance reduction. Initially, they can see how messy & bad things are but since they, themselves, are responsible for things having gotten to that point, they start bending their own beliefs in a way that will make themselves and their past actions look good. In the final phase of this psychological process, some kind of Stockholm syndrome kicks in in relation to the poor standard of engineering that is making their lives miserable: They start to actively seek out poor engineering because it's the only mode of existence that their psyche will allow them to fathom.

Warstory time.

A tech company I worked at was being encumbered by a codebase with poor code quality. Bugs were being introduced all the time, with no chance of preventing it because nothing is testable/tested. Bugs were introduced at a faster rate than they were fixed. The whole system was a timebomb because it was built in a dying ecosystem. For certain kinds of code changes, their effect couldn't even be made visible at development-time at all. You had to release them into the production system and look at metrics to see their effect, hoping that nothing would blow up, so the system was layers and layers of code that was trying to be minimally-invasive with respect to existing code.

Whenever an engineer would be newly hired, they would, within their first week of working there, suggest "Let's take a month to refactor this/that part of the system". When an oldtimer would hear the new guy say "refactor", their warped perception would hear "accomplish nothing" or "threaten our mode of existence". They would give the following speech: "Hey, get it in your head, noob: We don't do that here. This company is not focused on code quality, it's focused on a mission. Every week we come to work and we accomplish a part of a mission. I KNOW the code is probably the worst piece of shit you've ever seen in your 20 years. [This company only hires extremely experienced people]. The engineers who work here are all on board with our mission-driven culture and pride themselves about their skill at dealing with the shitty codebase. Your suggestion tells me that you can't deal with it, and you are unwilling to absorb our mission-focused culture. You will not get a career out of working here, unless you properly absorb that element of our culture".

So, in their mind, they'd taken "shitty codebase" which is a negative, and turned it around and made it into a religion about a "mission-driven" culture which is a positive. [Cognitive dissonance reduction at work, right there]. The reality was, of course, the poor code quality was a huge drag on this company's ability to accomplish a mission of any kind: Because changes that took their competitors a day, took them a week. Changes that their competitors did in a month, would have been a suicide mission for them and they therefore rationalized their way out of even wanting to do them. ...but I wouldn't recommend to the newly hired engineer in the story to mention that particular elephant that was in the room.

Now, here comes the really sick part, where the picture of the drowning man falls apart: From time to time, due to a rare alignment of circumstances, this company might be forced into a situation of starting an implementation of a new element of the system with a clean slate, i.e. completely unencumbered by the current codebase. They would intentionally implement the new codebase to be just as shitty as the legacy code. They would implement it in the dying ecosystem. They would generously insert unreachable code. They would pursue the fastest path towards making it untestable.

Now, under normal circumstances, if I hear someone telling that story, I would suppose there might be a motive for this. For example, I've seen developers in hedge funds intentionally rig a system to explode if they should ever be forced out of the company and be as hard as possible for anyone else to fix, for obvious reasons having to do with their incentivization structure. But that's not what was happening here. They had a culture of openly sharing knowledge of the system, were hiring new people all the time, and working hard to bring them up to speed about the system's working etc, so it wasn't that.

My guess is that it must be the "Stockholm syndrome" element I mentioned: They started to develop a psychological attachment to a poor standard of engineering. It was making their lives miserable. But it was also the only existence they knew and they felt was available to them, so cognitive-dissonance reduction was maneuvering their psyches into actually favoring that mode of existence.

Some might have said things like: "One day, after we've grown enough, we will do a complete rewrite." Or "Any day now, we might be acquired by another company and will leave all of this behind." But it was more in the vein of a religious person speaking of a glorious afterlife.

There is a lot of really sick shit that the human psyche is capable of, and sometimes it surfaces in engineering. I think that's my takeaway from all this.


I have so many war stories there's not enough space on HN to contain them. I'll just state my overall experience. I write elegant well designed code, but generally speaking you won't find that anywhere especially if you're working in teams. Unless the code is all written by one person who had a chance to design and implement it well solely by themselves with adequate time, it's going to be a mishmash of they just didn't have time and the pressure was on.

I'm of the mindset that if it works, it's good code, even when it isn't. This, I think, is the healthiest mindset to have on any team. I will refactor things slowly to clean up issues, and of course my own code is well thought out and integrated into the codebase regardless of how reckless anyone else is being. Any other way of interacting with a team would be antisocial and ultimately not beneficial. The way I see it, they wouldn't be paying me if there wasn't more work to do. The most important thing is that it works and the customer is happy. Your disappointment with elegance or coverage comes second to that.

The thing that actually annoys me is when, having exceeded my own delivery deadlines and everyone else's if I provide value added like a new feature that puts us far ahead of our competition I get pushback. Management doesn't want to have to maintain it. Even though my code is really solid compared to the rest of the garbage going in there. They just see me as a plumber fixing a leaky pipe and don't want anything further. I get that there's ROI and whatnot but it still bothers me when I show them some next level shit and I'm told to mothball it. Especially when I'm the reason we're usually way ahead of schedule. </rant>


On a separate note: I don't think that what I described above should be referred to as a point on "elegance". Because "elegance" suggests that it's about aesthetics. And regarding aesthetic considerations it is usually the case that people can reasonably disagree about them, and that there is no objective argument that can be made either way. But the arguments I made above about how a bad codebase quality can wreak serious havoc with a company's/team's ability to execute on their mission is something that goes far beyond a failure in elegance. It's a liability. And when I say "liability" then I don't mean in any abstract sense, but in the sense as any financial accountant would understand the term.


I've had that same thing happen to me numerous times too. I wrote code that demonstrably and undisputedly added to the value of the product in a way that went far beyond anybody's expectation. I took no longer to do it than it would have taken to do it the "normal" way. People get mad at me making the argument "Well if you leave the company, somebody else is going to have to understand your next-level shit, and we just simply don't have people here who can do that. So then what?"


I think it depends on the details. If your code is simply cleaner, well documented etc, then fully agreed.

If you introduce new dependencies, or the latest free monad transforming framework when there was none, then I’d be wary as well.


I don't do "framework" ;-) In fact I'm usually the first to protest when somebody tries to introduce frameworks or other bloatware. And getting excited about algebraic structures is something I got out of my system while doing my PhD. I'm strictly talking: Value-add to the product, albeit at the cost of higher complexity to the extent that it is unavoidable.


This is one of those things you don’t truly understand until you’ve experienced it first hand. I have. Very appropriate metaphors.


In my world (marketing, not dev) the saying is "you cannot optimise something you do not measure".


"I think that is not sabotage, but I don't have any real data that support or dismiss my statement" would be a better title.

Your mileage may vary.



