The Servers Are Burning (logicmag.io)
229 points by tortilla on Sept 6, 2018 | 154 comments



> in order to write effective tests, a programmer had to know all of the ways that a piece of software could fail in order to write tests for those cases

No.

In order to write effective tests, a programmer has to think of the piece of software's entire input domain, carve it up into a set of equivalence classes, and then determine what the expected behavior should be for a piece of input from each of those classes. Then test more carefully at the boundaries between those classes, so that you can root out edge and corner cases.

The results of those tests will then find the ways your software can fail for you.
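
To make that concrete, here's a minimal sketch in Python with pytest. The function and the classes it implies are invented purely for illustration: one representative input per equivalence class, plus the boundaries between them.

    import pytest

    def shipping_cost(weight_kg: float) -> float:
        """Toy function under test: flat rate under 1 kg, per-kg pricing above, rejects non-positive weights."""
        if weight_kg <= 0:
            raise ValueError("weight must be positive")
        if weight_kg < 1:
            return 5.0
        return 5.0 + 2.0 * (weight_kg - 1)

    # One representative per equivalence class, plus the boundary between classes.
    @pytest.mark.parametrize("weight, expected", [
        (0.5, 5.0),   # class: small parcel (0 < w < 1)
        (1.0, 5.0),   # boundary between the two pricing classes
        (3.0, 9.0),   # class: regular parcel (w >= 1)
    ])
    def test_valid_weights(weight, expected):
        assert shipping_cost(weight) == expected

    @pytest.mark.parametrize("weight", [0, -1])   # class: invalid input
    def test_invalid_weights(weight):
        with pytest.raises(ValueError):
            shipping_cost(weight)

Five cases cover the whole domain, because the domain was carved up first.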

Thinking about it this way, it starts to become clear that a huge part of your job is finding ways to simplify that input space as much as possible. Simplifying at this stage makes getting it right so much easier. The fewer special, edge, and corner cases you allow in the first place, the fewer you have to code for, the fewer you have to test for, the fewer you might accidentally miss, and the fewer others might accidentally run afoul of.

This, incidentally, is the key reason why it's good to take some time to define all your tests before you start writing your implementation. I'm not necessarily a huge advocate for "red green refactor", but I do think that at least identifying all your test cases before you ever start implementing can potentially save you boatloads of time, by helping you recognize opportunities to simplify your design before you get locked into an unnecessarily complicated one by a couple hours or days (or months) of sunk cost.

It's also the real reason (IMO) why functional programming - as a style, not a kind of language - is such a valuable discipline. The challenge with programming in an imperative style is that it turns your module's entire past history into one of the inputs you need to consider, and that turns your test and specification surface into something that is just so much bigger. Considering that it opens up the possibility of injecting bugs into a routine without ever actually editing the routine itself, or even any of the functions it calls, it's possibly even fair to say that it's unbounded.


> In order to write effective tests, a programmer has to think of the piece of software's entire input domain, carve it up into a set of equivalence classes, and then determine what the expected behavior should be for a piece of input from each of those classes.

OK, so let's revise the statement to "in order to write effective tests, you must know how your software should behave in the face of all kinds of inputs".

In my mind, that's an equivalent, or nearly equivalent* proposition. It's still above and beyond the capabilities or available resources of people developing very complex software. That's not excusing anyone. Just saying that your correction isn't that different from the original statement.

* Sure, "someone unplugged the computer while it was running my code" is a failure mode that's not a function of (what most people would consider to be) the input domain. Genuinely external failures like that, while damaging, are rare. What you usually end up defending against are the downstream results of those failures (i.e. "a network partition meant that half of my input dataset was null pointers").


Well, that's the idea behind breaking it into equivalence classes. Even if the input space is infinite - say, the set of all tuples of two natural numbers and a string - you should be able to determine that the domain can be carved into only a relatively small number of sets that are interesting.

If you find you can't, well, that's what's great about figuring out the test cases first - because now you know that you've got something complicated, and you have a chance to figure out how to make it simpler.

Then, ideally, you don't have to even worry about coming up with a mess of individual test cases. You arm yourself with a good QuickCheck style testing framework, so that all you have to do is write down your invariants, your equivalence classes, and what sort of output to expect for each one, and you're mostly done. Just a few hand-coded unit tests to explicitly call out the edge cases, stuff like that.
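
For instance, with Hypothesis (a QuickCheck-style property testing library for Python), a couple of properties can stand in for a pile of hand-written cases. The function and invariants below are invented just to show the shape:

    from hypothesis import given, strategies as st

    def clamp(value, low, high):
        """Toy function under test: restrict value to the range [low, high]."""
        return max(low, min(value, high))

    @given(st.integers(), st.integers(), st.integers())
    def test_result_is_always_within_bounds(value, low, high):
        low, high = sorted((low, high))   # establish the precondition low <= high
        assert low <= clamp(value, low, high) <= high

    @given(st.integers(), st.integers(), st.integers())
    def test_clamping_twice_changes_nothing(value, low, high):
        low, high = sorted((low, high))
        once = clamp(value, low, high)
        assert clamp(once, low, high) == once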

Obviously this is a much bigger job if you're limiting yourself to hand-coding an explicit test method for every single test case. Don't do that. There are so many higher-leverage tools - data-driven tests, property tests, fuzz tests, approval tests, etc. - to use in addition to the basics. It's very much worthwhile to get good at them. Even if you don't always use them, learning how to is good practice for learning how to carve a big problem up into manageable pieces.


> think of the piece of software's entire input domain, carve it up into a set of equivalence classes, and then determine what the expected behavior should be for a piece of input from each of those classes

This is an extremely good way of putting it. Doing some of this at the requirements stage if possible can be very helpful too, because you can detect nonsensical or conflicting requirements before you even write any code.


I'm rather fond of a book, Specification by Example by Gojko Adzic, that's partially a manifesto for, and partially a guide to, doing this.


Thanks for the pro tip. Already reading...

I've been working on a notion tentatively called "example driven development". Use specifications to generate code. Kinda like compiling swagger docs to implement an HTTP server.

My first such effort was using the BA-developed HL7 interface documents (swagger for HL7) to code-generate our implementations. Worked great.


Thanks for the rec! Just bought for $5 (kindle ed).


What site? I can only see Manning for $36; it's not on Amazon.


Here[1] it is on Amazon, which is cheaper, and seems to have used copies for ~$18, but I can't find anything for $5. Maybe it was a typo?

1: https://www.amazon.com/Specification-Example-Successful-Deli...


I'm guessing this one (with a slightly different title: Bridging the Communication Gap: Specification by Example and Agile Acceptance Testing) http://a.co/d/6tGlrUc


Yes, that's the $4.99 one I found.

http://a.co/7gJYjIx


Seconded. I haven't seen it summarized so well before, and the sentence you quoted is the very one that made me favourite that comment.


Metaphorically, this is functional-style programming: mapping inputs to outputs, minimizing state and side effects.

His conclusion is the money quote for me:

"...your job is finding ways to simplify that input space as much as possible."

I do this. Instinctively? I feel more or less alone in this practice.

My prime motivator is to do less work. Less is more. Concision is a virtue. Blah, blah, blah.

The trouble with my strategy is that simple is hard. Much harder than kanban-themed, JIRA-mediated, velocity-maximizing slapdashery. Also harder to measure.

So compared to my peers, it looks like I'm moving in slow motion.

To illustrate, someone who fixes the same code pathway multiple times is scored higher than someone (like me) who fixes it (mostly) right the first time.

The tortoise vs the hare.

--

"If I Had More Time, I Would Have Written a Shorter Letter"

  -- Blaise Pascal https://quoteinvestigator.com/2012/04/28/shorter-letter/
"-2000 Lines of Code" http://www.folklore.org/StoryView.py?story=Negative_2000_Lin...

--

Aside: I was a QA/Test manager for a while. I really cared about this stuff, back in the day. Now it seems no one cares.


These are great techniques and should be done more. But I'll add a caveat: they don't cover all the ways software can fail and we probably shouldn't pretend they do.

The environment in which the software runs is also an input, and you often don't get to pick that. Consider new browser releases, browser addons, and browser bugs.

Performance is a global property of a system that crosses most abstraction boundaries. Outside hard real-time systems, we get few performance guarantees.

Some domains are simple on one level and very complicated on another. The set of all <=280 character strings may seem simple as far as the data goes, but setting policy for Twitter is not.

Still, being able to decompose problems and treat bugs as either functionality or performance (and not a complicated mixture of both) is pretty helpful.


> has to think of the piece of software's entire input domain

If you're a startup this is not clear and changes frequently. Practicing TDD in this environment has very real downsides that exist less in mature companies with clearer understanding of business requirements.


My philosophy is a little different. It grew out of writing a thesis on software quality in a post grad course. We studied lots and lots of ways to improve software quality. I can't count the number of papers I read on the subject. The first paper you read is all gung-ho about their particular method, of course, so you think you are done - you've found the holy grail. Then you find the second paper is equally gung-ho about an entirely different method. Eventually you become jaded by what are really opinions on the effectiveness of the technique each paper explores (dare I say delivered with the almost religious zeal of a true believer), disguised by a blizzard of equations, tables and lemmas that take hours to decipher.

Being a post grad thing, I had been writing software for many years before reading these bloody things. I did learn a whole pile of new techniques (my thesis got a high distinction with the comment "you sure know a lot about improving software quality"), but for most of it I was none the wiser about what I should apply to my everyday work, and that was the point of doing the post grad in the first place. So I changed tack and started reading papers that reported results in terms of productivity. It's really hard to measure of course, and these papers were very thin on the ground and their figures involved a lot of hand waving. Even so, I was pretty confident in the end that the _only_ technique that had a significant effect on productivity was unit testing.

Note that considering quality and productivity together is very different from looking at quality alone. The most effective way of reducing defects per line is probably the method developed by IBM when writing code for NASA's early space program. It relies heavily on formal inspection. In that system programmers weren't allowed to compile their own code - and compile errors were counted in their defect rates. It was spectacularly effective at developing software with low defect rates. Programmers don't usually think of their profession as a shining example of the quality that engineers of all types can achieve when they put their minds to it - but this code was just that. Hundreds of thousands of lines of real time code, not a single failure in flight. Contrast that with the efforts from other engineering disciplines: the O-rings, tiles falling off, yada, yada.

But most of us don't develop code that must work on first deployment, with catastrophic life-and-death, billion-dollar consequences if it doesn't. In the trenches where most of us spend our working lives, unit testing delivers more bang per buck than any other sort of quality measure. The reason is simple - it automates testing that would otherwise have to be done manually. Automates it so effectively that it's almost free. If you are wondering how that could affect net productivity, just read the story - the entire team stops work for days to rescue a burning site, people too scared to deploy code. But that's probably dwarfed by the lost sales while the site was down.

Still, writing tests takes time. Writing too few doesn't get you the quality boost. Writing too many slams productivity. How many tests to write is a puzzle. Your post about all the tests you could write to test everything is certainly pertinent - but if you go down that rabbit hole too far you will spend all your time writing tests.

My way of deciding that point isn't like yours. In the end I decided the most important thing is an easy way to add a test for every line of code. The only proof I can believe is a coverage tool reporting 100% code coverage (or better yet 100% branch coverage, if your tool supports it). Hand waving to explain the rest could be right, but verifying it requires a second person to read the code _every time it is modified_. Remember we are on about maximising productivity here, so creating manual work like this for every release is really hard to justify. Thus the first step is to write tests for your code until you have 100% code coverage.
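
As a concrete illustration (assuming a Python codebase using pytest with the pytest-cov plugin; other ecosystems have equivalent tools), the gate can be made mechanical rather than a matter of discipline:

    # Fail the test run unless every line and branch of mypackage is exercised.
    pytest --cov=mypackage --cov-branch --cov-fail-under=100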

But it goes further - it must be really easy to find which test covers a particular line of code. This usually means the same line will be tested multiple times - once by the test that targets it, and many other times by tests that happen to use it. Some programmers view this as inefficient, and it is in a way, but you will see later there is a reason to spend this extra effort. Sadly it is time consuming to verify this has been done in every case, as you would have to check every line of the code being tested, but it is amenable to auditing: if you select 10 random lines and they are all good, the odds another random line doesn't have an easy-to-find test is 0.1024%.

As anyone who has written a lot of tests knows, this insistence on 100% test coverage has flow-on effects on the code being tested. Arranging code so it is easy to spot how you get to the line you want to test can be the most tedious step. It's not unusual for the "writing tests" step to force the code being tested to be refactored for visibility. This "refactor your code so it is easy for those who come later to add more tests" is the real goal of the 100% test coverage condition.

Once they've hit 100% - then, and only then - can the programmer take into consideration the sorts of things you describe. It's difficult to check whether they have done that, of course, as doing such checks really requires two people to think up the tests required, which doubles the work. But perhaps it doesn't matter, as a lot of it will happen anyway if your programmers are diligent in developing a test for every subsequent bug you hit. Here diligent means thinking about all the other places the bug may arise, and testing it there too. Relying on this effectively leverages all the systems testing that follows to produce the sort of tests you describe automagically.

Adding these new tests is easy of course, because you already have 100% test coverage and it's obvious which test covers each line. You can ensure adding these new tests happens by insisting every programmer report every bug they find - in their own library or someone else's - into a bug database, and later record the test.

Doing it this way does not guarantee your code will have a 0% defect rate. But then nothing, not even what NASA did, can guarantee that. What it does guarantee is that over time your defect rate will drop, and it will do that with minimal extra work. I find the "extra work" bit important, because when you are in "creating the new shiny" mode, finding the energy and money to do the work isn't so hard. Hopes should be high that it is the next big thing that hits the jackpot. But once you are in maintenance mode, reality has hit home. Every new addition has to justify every cent spent on it. I suspect that is because existing KPIs are in place, everybody has a reasonable feel for the impact a change might make - and most changes are small increments. Yet over the years most development happens via the cumulative weight of these small increments. Ensuring the cost of maintaining 100% test coverage during these small incremental changes is minimal is the only hope you have of ensuring it's done.


Although I wouldn't argue against unit testing (I use it too), I'm a little more hesitant about incentivizing test quantity. Many times when this comes up as having been tried, it gets attached to an anecdote that goes: "And so we wrote very bad tests and atomized the codebase into unmaintainable pasta code that could be trivially tested to make the number go up."

Which leads me to a hypothesis that there's some other quantifiable out there that would work better than unit test coverage: a "diversity test" metric. The software artifacts produced by writing code and after a build are not "the software" in the sense of the whole project. Documentation, support, and so on are also "the software." Logging and profiling tools enable debugging methods other than "stare at the code very intensely". If you start from a holistic standpoint and assume quality requires multiple angles of inspection then your choices of technique broaden substantially.

If you changed a mandate of "must test all code" to a mandate of "must either test, log, profile, or document all code" you would add leeway for the corners that are test-averse to be documented or logged instead, and vice-versa. The combination gives you a palette of feedback: if it's hard to do any of those things, you have bad code. Ideally you can get all of it, but that is unlikely to be the norm for most organizations most of the time. But more bugs will be caught by this web of feedback than if you mandate any one of those methods alone.


I agree with this, and I think software quality is like security. Multiple layers are better, and orthogonal layers are even better still.

A good logging system is a great example. To be effective, your logging should be going into something like Kibana or Grafana or both. Logging levels should be correct. Errors should be easily visible and have useful stack traces; all other noise should be easily filtered out. Alerting should be in place for any crashes, ideally going directly into a bug tracking system. (Sentry.io has a nice story for this on the front end).
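
A minimal sketch of the kind of thing I mean, assuming a Python service using the standard logging module (the payment-related names are invented, and the handler/alerting wiring is omitted):

    import logging

    logger = logging.getLogger("payments")

    class GatewayTimeout(Exception):
        """Stand-in for a timeout raised by a (hypothetical) payment gateway client."""

    def charge_customer(order_id, gateway_charge):
        """gateway_charge is passed in only to keep the sketch self-contained."""
        try:
            gateway_charge(order_id)
            logger.info("charged order %s", order_id)   # routine noise: easy to filter out
        except GatewayTimeout:
            logger.warning("gateway timeout for order %s, will retry", order_id)
            raise
        except Exception:
            # logger.exception records the full stack trace at ERROR level,
            # which is exactly what the alerting layer should be watching.
            logger.exception("unexpected failure charging order %s", order_id)
            raise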

This ties in well with the "let it crash" philosophy. You just cannot anticipate all the crazy inputs that will go into the typical software system of today. You can't. Most systems I work with have multiple external dependencies, each with their own schemas, data workflows, deployment processes and so on. Unit testing won't be as valuable with that stuff. Full end-to-end testing will be more useful.

Look at all your options as a whole and allocate developer resources according to the value of each.

And personally, I don't think the OkCupid model is as bad as it sounds... ;)


> My first employer, the online dating site OkCupid, didn’t harp on testing either. In part, this was because our company was so small. Only seven fellow engineers and I maintained all the code running on our servers—and making tests work was time-consuming and error-prone. “We can’t sacrifice forward momentum for technical debt,” then-CEO Mike Maxim told me, referring to the cost of engineers building behind-the-scenes tech instead of user-facing features. “Users don’t care.” He thought of testing frameworks as somewhat academic, more lofty than practical.

I found this paragraph interesting because my experience with my one-man side project is the opposite: I'm so small, and time is so scarce (I have a separate full-time job), that I can't afford to fix bugs on somebody else's schedule when they find them... instead I spend a large amount of time writing tests up front for new features so that I find the bugs on my schedule (i.e. late at night when I don't have other commitments).


This is a smart way to think of testing and will discover bugs when your understanding of the code is the most lucid.

But it does require some nuance. Unfortunately, some people think writing tests or test harnesses are wasteful because they think of shops that insist on 90%+ coverage and have test suites that are easily 5-10x the size of the application's codebase.... so they decide to forgo them completely. When in reality, the answer is probably something like: you have at least 25% of your codebase that would greatly benefit from testing, even if they're just "smoke tests"


Don't you take this bit out of context? If the premise the article builds around faulty tests is correct, then stuff like unit testing is mostly a waste of time, since you'll still need to fix things when the servers are burning.

Where I work we've done TDD, projects with coverage of only the vital parts, and projects with no software tests, and there is no difference in production on smaller projects. Maybe we're terrible at writing tests, but if so, then it's still something we have to deal with.

So I think it’s hard to outline the correct amount of testing for every case.

I don’t think I would want to work on a major system with multiple contributors if there weren’t automatic tests, so it’s not like I’m against testing either, I just don’t think it’s always a holy grail.


I think the advantages of tests, TDD or otherwise, are not easily measured. Everyone knows cowboy coding up small projects works and through manual testing they can quickly be deployed with reasonable quality, especially if the developers are experienced.

The tests shine in the follow-up phases, when code needs to be changed, when inexperienced hands have to touch the code base, and when there's simply too much going on to fit in one mind.

If you build your product with the future in mind, then you write tests. If you've got VC money and you need to sprint to some goal before the puck gets in front of your stick, then I can see the logic in skipping tests.

If the tech debt hits you before you reach that point though, you're in deep shit. It might also be the reason that innovative companies often seem to technologically stall right after hitting mainstream.


Yup, classic.

"...no time/resources to do it right in the first place, but plenty of time/resources to fix it when the customers complain..."

I always thought it better to find the bugs in-house before shipping, but so many others don't see it...


You are acting like most bugs aren't already caught in the dev process. No one writes code and just deploys.

Writing extra tests that cover everything for a small project that you are working on yourself is a waste of time.


> No one writes code and just deploys.

Oh yes they do!

Hell, I've seen DLLs copied from a server, disassembled into code, the code changed, rebuilt, and copied back to the server....

Most devs here write code then copy their bin folder to a production server and deploy it.


> No one writes code and just deploys.

I wish this was true.

I stopped keeping track of how many times I've seen people break stuff because they deploy code that simply doesn't work, because they figured they could just eyeball it rather than run it and test it out.

Not everyone cares deeply about the quality of their work.


Writing tests for small, personal projects is hugely valuable for me. Running a well-written test lets me iterate much faster than running the entire script or booting up the whole app and poking at the UI.


> No one writes code and just deploys.

You would be - very - surprised how some big non tech companies (FANG) run.


The logic kind of makes sense if you consider that customers will only find a small fraction of the bugs that your team will. Sometimes the cost of removing all bugs is greater than the cost of losing customers to the few unlucky bugs that get found.

But you really need experienced people deciding where that tradeoff lies for each company and project.


Yes, each individual customer is likely to find only a fraction of the bugs that your team finds.

But, many customers will collectively find much more.

The cost is more than an individual customer. It is everyone that customer comes into contact with and spreads word of the grief your bugs caused them.

Reputation can be very fragile.

The "deciding where the tradeoff lies" can easily lead to the Ford Pinto fiasco, where it was decided that the costs of the few lawsuits were less than the costs of the fix. After they'd _killed_ dozens of people and then got massive fines, that decision didn't look so good.

Quality is its own excuse. And yes, you cannot let the Perfect be the enemy of the Good and insist on fixing every nit, but Good should be Damn Good, not merely 'just ok, ship it infested with known bugs'.

You are supposed to make people's lives easier with a product, not merely extract their money.


I think that's the bottom line.

I've done virtually zero unit testing on a project before when there was barely time to write the core code. There wasn't much business logic, lots of I/O and no user inputs or browsers to worry about and I was leading a small team of four so it was relatively well suited to it. Still, I was nervous as it was quite high profile and unit test coverage was about the only metric management knew to ask for.

It was replacing a critical third party system being turned off on a fixed date and we barely made it. 6 months after being switched on in production we'd had zero bugs reported and I moved on. I'm confident in hindsight that I made the right call for the circumstances, but having it funded properly would have made me sweat a bit less.


(then-CEO Mike Maxim) thought of testing frameworks as somewhat academic, more lofty than practical. ... Mike the CEO, who was also OkCupid’s best engineer...

Mystery of the Melting Servers, solved.


Do testing frameworks normally catch memory leaks?


If you look at this quote from the article:

“That same story happened so many different times,” my old boss David told me. “Someone launched a small, relatively innocuous change that did one of the millions of unexpected things it could have done, which then happened to break some part of the site, and then bring it all down—sometimes bring it down to the point where we couldn’t recover it for hours.”

That’s exactly the problem that thorough automated test suites try to solve. When you have a robust mix of unit, integration, end-to-end and performance tests, you can greatly reduce the number of unintended regressions that make it into production. Not eliminate them completely, but definitely reduce them a tonne.


Mine do.


You can't possibly say that with any confidence. I literally just finished reproducing a memory leak caused by an insane combination of circumstances. This was not something valgrind or any similar tool would find. It was not something that any rational human being would have thought to write tests for.

Once code is no longer synchronous, and tasks are being juggled around and swapped between, it becomes virtually impossible to predict what might cause this sort of bug.


> You can't possibly say that with any confidence

In answer to the original question, I stand behind my answer with 100% confidence. The test infrastructures I build catch memory leaks all of the time, and I do not consider it abnormal for them to do so. If you mean to say that I cannot be confident that I have caught all of the memory leaks, well duh, of course I can't. But that's not what was asked, and that's not what I said.

To expand on what was actually asked, though, whether a framework catches all, some, or no memory leaks is irrelevant. Because if a team is at least testing for it, I'll bet they're just a bit more rigorous in their coding than a team that just deploys to production and waits to see what breaks.


> The test infrastructures I build catch memory leaks all of the time [...]

Could you elaborate on that, please? I'm curious. Thank you in advance!


It would be very specific to the project. For instance, if it's an iOS/Mac project, Xcode has nice profiling tools to catch memory leaks. And of course clang's static analyzer as well. What I'm working on now is embedded Linux, so valgrind and some test suite to exercise the code.

But in general, no matter the project, you need a few key pieces:

1. A test suite that thoroughly exercises the code. You'll need this for...

2. A tool such as valgrind to watch memory as your tests in #1 run. Build time, ad hoc, stick it in the pipeline anywhere you like (see the one-liner after this list).

3. A static analyzer to take a first pass at the code and say, "hmm, this might leak/crash/kill puppies." So, yeah, the order is off as this should be the first thing you do. It can run on your dev machine or on the build machine. Expect lots of false positives.

(optional) 4. A code coverage tool to make sure you're not missing key pieces of code that need exercise.
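
For what it's worth, the glue between #1 and #2 can be a single line in the pipeline, something along these lines (assuming the suite builds to one test binary called run_tests; the flags are standard memcheck options):

    # Run the whole suite under valgrind; a non-zero exit fails the build when leaks or errors are reported.
    valgrind --leak-check=full --error-exitcode=1 ./run_tests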


Tests only catch what you test for. In this case "memory leak" can be replaced with "site functions as expected", and the same argument you made applies, and yet it's still useful.

Just because you can't determine the exact cause of the leak, or even detect all of them, doesn't mean that measuring memory usage - initially, and under test loads of different (reproducible) sizes at different intervals - and comparing it across builds can't get you a hell of a lot of utility.


> "Tests only catch what you test for."

Er, sort of. Integration and e2e tests often catch problems nobody specifically, explicitly envisioned. That's part of what makes them so valuable! Right?

https://m.youtube.com/watch?v=0GypdsJulKE


Sure. I think we're just working on slightly different meanings of what I said, which wasn't meant in the exact way it's often rolled out as a criticism. Part of the reason you do different, vague tests is the hope you'll catch some odd errors someone missed. So you are testing, vaguely and imperfectly but over a much wider area, in the hopes of finding things you can't specifically think of.

Sort of like if you are planning to take a car you recently did a bunch of work on yourself on a long trip. You've probably already tested all the things you think might be wrong, but that doesn't mean it's not worth taking it for a drive for an hour around town and on the freeway to see if the unexpected happens, so you can deal with it close to home. You're testing the car, just in a vague "shake it and see what falls out" way. You are specifically testing for that; it's just not guaranteed to find any or all problems. It's still better than nothing most times.


The parent is saying that their tests normally catch memory leaks, not that their tests catch all memory leaks.


> You can't possibly say that with any confidence.

Assuming your module's interface isn't too complicated, it shouldn't be terribly difficult to write a test fixture that fuzzes a module, and verifies that its memory consumption does not go outside a certain bound.

The first round of bugs you'll catch might be design flaws more than true memory leaks - places where the module isn't designed to limit its own memory usage in the face of adverse (or adversarial) input.

Get those pinned down, and then you can use it to catch memory leaks with confidence. Let it run for less time if you're only worried about catching fast leaks, and for more time if you need to catch the slow ones, too.
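
A rough sketch of that fixture in Python, using tracemalloc and a seeded random fuzzer. The module under test and the 16 MiB budget are made up; for C/C++ you'd reach for valgrind or massif instead:

    import random
    import string
    import tracemalloc

    # Hypothetical module under test: a cache that is supposed to bound its own size.
    class BoundedCache:
        def __init__(self, max_items=1000):
            self.max_items = max_items
            self._data = {}

        def put(self, key, value):
            if len(self._data) >= self.max_items:
                self._data.pop(next(iter(self._data)))   # evict the oldest-inserted entry
            self._data[key] = value

    def random_key(rng):
        return "".join(rng.choices(string.ascii_lowercase, k=16))

    def test_memory_stays_bounded():
        rng = random.Random(42)            # fixed seed so the fuzz run is reproducible
        cache = BoundedCache()
        tracemalloc.start()
        for _ in range(100_000):           # hammer the interface with arbitrary input
            cache.put(random_key(rng), rng.random())
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        assert peak < 16 * 1024 * 1024     # fail if peak allocation exceeds the 16 MiB budget

    if __name__ == "__main__":
        test_memory_stays_bounded()
        print("ok")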


What's an example of a memory leak that can't be found by valgrind?


One happening in an embedded language with its own GC. Unless you can switch all allocations to explicit management for the tests, it's almost impossible to differentiate between live objects, free lists, cycles with deferred collection, and arenas that are supposed to be freed on exit.


That happened to me at work. A certain class of error would leak references, and over time they would pile up. Fortunately, Lua (the embedded language in use) allows one to mark certain references as weak references and allow collection. The trick was in determining what to mark as weak references.


Interested to see an AAR


Which framework do you use?


Not OP, but make. It's not exactly a framework, but I've got a project where every test gets run twice: first as a test to check the functionality, then run under valgrind.

It won't catch all possible errors but it will catch most. On the plus side it will only nag about actual errors, not potential errors like "safe languages".


If you're an early-stage startup that's still determining product-market fit, then it's completely fair to eschew tests and accumulate tech debt; you're at a higher risk of running out of runway before you've even proven that your business works.

As soon as you have enough users that an outage poses a significant risk to your business, you need to invest in either refactoring to reduce tech debt, or a rewrite. And you NEED to add tests, and a robust deployment process that tests changes at multiple stages with monitoring on the basics (server speed and memory usage, database error rate, etc).

OKCupid got lucky in this case.


Easy to say, but unfortunately rarely seen in the wild, depending on the experience of the project manager. Once that product is live and climbing, no owner wants to hear talks about slowing down for reasons of better test coverage or refactoring.


I think the point is more that the slowing down is going to happen anyway if production systems are repeatedly catching on fire and people are afraid to make changes. At that point the difference is how having or not having tests will affect things in the longer term.


Indeed, but prototypes running in production or running without tests, is a symptom of other underlying problems in the organization. I feel that scope creep and lack of testing goes hand in hand for projects that are poorly managed.


Skipping tests generally doesn't allow for creating a working MVP any faster. It's just an illusion.


Only if you are referring to perfect tests, which cost zero time to write/maintain, and to writing only the tests that will eventually catch an issue.

Otherwise, tests have a cost just like any other code. You can see this by looking at both extremes: perfect tests (described above) and useless tests. For example, tests that make your codebase too brittle, tests that don't actually test anything useful, spending too much time writing tests, spending a lot of time writing tests for precisely the code with high disposability, tests that are too coupled with the code, stupid tests that should be removed but won't be because tests tend to be append-only, etc.

Testing is an advanced topic where every +1 unit you spend on testing doesn't mean your application is now +1 unit more robust. +1 unit of time spent in tests can even mean your application is -2 units worse because testing is a trade-off.

So tests are more than capable of hamstringing your unlaunched MVP. They're also capable of being the reason you launched your MVP sooner than later. But the costs of testing are no illusion.


Tests save you time. They have a negative cost.


Some tests save you time. Some tests don't. It's perfectly possible to write a huge, fragile, heavily coupled test suite which takes days or weeks to modify when you make even a tiny change to the code it's testing.

Just as code falls on a spectrum between clean and completely unmaintainable, so do tests.


Tests save you time (maybe) in the long run. But in the search for product-market fit, you may conceptualize, design, build, release and then scrap a feature within the course of a month. That's not a long enough lifespan for the tests' ROI (better code, fewer bugs, better architecture) to be positive.


The regression test [1] for our legacy application takes five hours to run. It takes five hours to run because it also needs to check every log message (via syslog) to ensure every transaction happened.

I can't say it saves any time, and it certainly does not have a negative cost.

[1] There are no unit tests, as there are no real "units" to test. It's a legacy code base of C and C++, using a proprietary library that no one left in the company has much experience with. Said proprietary library is considered "legacy" by the company that owns it. I've seen the code. No wonder they consider it "legacy" (it started life in the mid-80s and god does it show).


When making something new where you're not sure of the value yet, I've found that you can get 80% of the benefits of unit tests with around 20% of the tests you'd write to get "full coverage."

My main goal is to at least have the code run in an expected way and produce an expected result. This doesn't catch everything, but it does seem to catch enough problems to be worth it for the time invested.

Edit: I should mention that I also add tests to cover something when I experience a failure, so it at least won't happen again.


The big problem I've noticed when reading unit tests in various projects is that they frequently test the implementation rather than the outcome:

https://softwareengineering.stackexchange.com/a/304910/24932
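
A quick, contrived Python illustration of the difference (all names invented):

    from unittest import mock

    def total_price(items):
        """Function under test: sums (name, price) pairs and applies a 10% discount above 5 items."""
        subtotal = sum(price for _, price in items)
        return subtotal * 0.9 if len(items) > 5 else subtotal

    # Tests the implementation: breaks the moment you refactor away the call to sum(),
    # even though the behaviour is unchanged.
    def test_calls_sum():
        with mock.patch("builtins.sum", wraps=sum) as spy:
            total_price([("apple", 1.0)])
            spy.assert_called_once()

    # Tests the outcome: survives any refactor that preserves the behaviour.
    def test_bulk_discount_applied():
        items = [("widget", 2.0)] * 6
        assert total_price(items) == 12.0 * 0.9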


I still find there's a fair amount of value in just having most of the code be executed regularly (on every commit, for example), even if you're not perfectly testing the output.


That's one reason I really like functional programming: it's easier to test pure functions, vs code with side effects that requires you to look at the internal state afterwards.


On the other hand, pure functions are also much easier to reason about, thus much less likely to cause bugs from "a tiny change" in the first place. It's in the messy imperative stuff things tend to break in incomprehensible ways, and where you really need tests.


I'm not a great programmer, I can't usually write code that does exactly what I expect the first time, so I use unit tests with a debugger in my IDE to run small portions of my new code until it works the way I envisioned it before I started writing it.

This is a lot faster than my old method of writing code that doesn't quite work, and running the whole program over and over with small changes and print statements each time.


Yes, that's pretty much where I've gotten to as well. Might as well put in the effort to move that process into unit tests as then you get to run it repeatedly for free from then on.


> so I use unit tests with a debugger in my IDE to run small portions of my new code until it works the way I envisioned it before I started writing it.

In languages like Lisp, this is what you use REPL for. It's an insanely more efficient way of working than the usual edit->recompile->run the whole app again.

However, a couple of years of working with Common Lisp and (recently) Clojure taught me that, even with a good REPL at your disposal, properly testing code as you write it can get pretty unwieldy - especially if the inputs are complex/large. I found it beneficial to write unit tests and call them from the REPL instead of testing the code in the REPL directly, as it's easier to maintain and develop individual test cases, as well as re-test all of them when I alter the code I'm working on.

(That's of course apart from the fact that unit tests are more permanent, and provide value later on.)

TL;DR: thumbs up for writing unit tests even when working in an interactive programming environment.


REPLs are a great tool for exploring your ideas, and poking your software to test it "right now" but once you're happy with a small piece of code, nothing beats a suite of unit tests to help you refactor with confidence as your implementation gets a bit too hairy and you need to rejigger it in order to progress in a direction you didn't see coming.


Even with Java now it's usually possible to modify a running application while you're debugging. Eclipse handles this pretty well.


You know, it's vastly more important to understand the expectation than it is to get the implementation correct the first time.


Yes to the last! Whenever bughunting, my first step is to reproduce the issue in a failing test. This has the benefit of making sure I know what is failing and how, but it also crystallizes that knowledge and experience into the repo for future engineers (or future me), recording the most common/practical ways things break in the real world. It is also marginally useful for catching regressions, if the test is clever/general enough.


"If you wrote buggy software, why would the software you wrote to check that software be any less buggy?"

I work in chip design, where bugs can be rather costly, e.g. $1mm for a new mask set, not to mention the months it takes to get back new hw. The situation described above is why we try to have one person do the design and a different person do the verification/testing. A lot of the time, the test writer will treat the design as a black box, not even look at the code, and instead verify against an external specification. (There's also white box testing, where you try to target specific areas of the design that might be especially problematic.)

This does protect against issues like misinterpreted requirements, invalid assumptions of valid input and/or operating scenarios. But I think it also ends up as a lot more work, to have two people get familiar with whatever it is you're designing. The best method is probably a combination of both.


When you have a well thought-out external specification, it really gives you lots of good material from which to derive all the right tests while still treating your implementation as a black box.

Unless you are implementing an established RFC or some other well-defined system, such a tight specification rarely exists in a software setting, and is very difficult to effectively design a priori. For this reason, techniques like test-driven development have arisen to try to make the programmer repeatedly and consistently assert their understanding of the specification via unit tests, while evolving those assumptions as needs and understandings change.


When I was doing QA at work, I never used the official tools to generate the data files required for testing the application, but wrote my own. This brought to light a ton of issues and unwritten assumptions (both in my code, and in the application).


I appreciated this article because I so infrequently get to hear from people that hold this view of testing, which I do not share.

To me, the importance of tests is a function of the consequence of failure and the likelihood that failure will happen. It's a question of hazard and risk.

One thing to realize is that code that lasts longer is more likely to fail because the people and libraries that support it are prone to change and assumptions can break. So risk goes up over time.

The other thing to realize is that code that's part of a growing product is going to impact more people if it fails. There might be only 400 users today, but 40k three years from now. So hazard goes up over time.

When I hear the technical debt arguments I try to triage them into "grows with the company" vs "doesn't grow with the company" so I can figure out what to accept as debt and what to pay down. A complex deploy infrastructure doesn't scale with the company: just SSHing into a box and doing the deploy manually is just as fine for 100 users as it is for 1000. But tests do scale. So I write lots of tests, especially for ACL, and I punt on the infrastructure stuff until I absolutely need to.

Note that this leaves aside the entire argument over whether tests slow stuff down. I think that on average they make refactors easier and features slower, so for the projects I'm on it's a wash. But even if I grant the point my conclusion wouldn't change. Tests are good.


I think just SSHing into a box and doing the deploy is not a function of users, as you noted, but a function of how many devs are working on the project. For 1-2 devs a manual deploy might be OK; when the 3rd comes in, it is time to get rid of any magic that might happen when deploying by hand. Because you know someone will do something special and not document it, whereas when he updates the automated deploy he does not have to document it or talk with the other guys - it is documented in the deploy tool.


It's also a function of how many servers you have. If you have to ssh into more than one box to deploy then you should probably have automation, even if you only have an ops team of 1 or 2.


I agree. Function of devs that have deploy power.


Without automated deploys, how do you know that what you deploy is actually what you tested?


Personally I have automated deploys, I was just reaching for an example that was available and relevant for a wide range of projects from inception. I think the thinking is that early on everything is on master so you just do the deploy by hand, run the migrations / tests, and bring the server back online.


Use tags?


This is one of the many reasons why I have an allergy to dependencies.

I know other people's code is probably better than mine. But I understand mine. If something goes wrong (and something always goes wrong) I know how to fix it. When something goes wrong in someone else's code I either have to start working out how their code works, or report it and sit there like a lemon with a broken system until whoever wrote it has the time, energy and inclination to fix it.

Also, testing. I wish I was better at testing. At least I've got into the habit of writing tests to cover whatever bug I just discovered in production.


All I know is that if enough people rely on the same dependency for long enough, the chance of encountering large bugs becomes smaller and smaller. Especially if the dependency has a stable interface.

Good software gets better the more it gets used and abused, so I tend to stay away from small dependencies, they're usually not worth the time.


How could such a tiny change have such an outsized impact on the site? “That same story happened so many different times,” my old boss David told me. “Someone launched a small, relatively innocuous change that did one of the millions of unexpected things it could have done, which then happened to break some part of the site, and then bring it all down—sometimes bring it down to the point where we couldn’t recover it for hours.”

Preventing this is one of the basic motivations of functional programming. If a function is "referentially transparent" or "pure" it can't do anything unexpected. It seems to me this is the only reasonable way we can reason about software at scale.


> If a function is "referentially transparent" or "pure" it can't do anything unexpected.

Sure it can; at a minimum all functions consume compute resources. A tiny change in a "pure" JavaScript function could lead to cascading deoptimizations that destroys performance and compromises service availability. For example, changing 0.0 to 0 could do it.


A slight change in performance characteristics can also lead to cascading failures. Pure functions don't help with that. That aside, even functional programs need to have side effects (other than heating up the CPU). Referential transparency doesn't help when the state of your monad is a little bit weird and that causes some corner of your code to kill the DB.


Anecdotally, this happened to me the other day in my Haskell backend: I had a recursive function (in the IO monad) that every ten seconds would take a connection from my connection pool and (since it was recursive) would not give it back to the pool. Of course it was a stupid bug, but it was hard to figure out because it would take a while before it got to that point, and once it was there every call to the backend would block and then time out. I like functional programming, and yes: it was in the IO monad so not pure, but I think most applications end up in some stateful monad even if your stack is built for functional stuff... This was 'not at scale', but as a Haskell project it is fairly big. Point being: yes, pure apps will not have this predicament, but even with a stack built for pureness I can see it happening quite often. Running software is messy...


This makes another argument for Haskell, because the language allows you to isolate the non-pure code into a thin top layer. That makes debugging much easier than it would be if the complexity of IO extended all the way down through your app's subfunctions.


While I generally agree that functional programming does help immensely to make secure software, in this case, the function wasn't pure. It was meant to return data that didn't exist from a database, but the error was silenced. Also, the reason why the servers were "burning" was because this somehow resulted in a memory leak that consumed all the memory available to the system, which is why they also had trouble pushing fixes. Even in a pure function in Haskell, memory leaks are possible and the same scenario could've resulted.


While I can understand some of the benefits of functional programming within a single sub-system, or within academic research, I struggle to understand how pure functional programming can address the need for the large amounts of state information and abstraction layers required for a large scale system. I work with systems where (large/numerous) sub components require very specialized domain knowledge to understand internally, so multiple levels of black-box abstraction are required to integrate/interact with the sub components. Additionally, both the client systems and the sub components may have huge amounts of persistent state information.

Could you suggest any resources that might help me understand how this can be accomplished in a pure functional paradigm? Something that assumes only a casual understanding of functional programming would be especially helpful.


The blog post "Functional architecture is ports and adapters" by Mark Seemann may be of help to you. "fsharp for fun and profit" by Scott Wlaschin is really approachable. Can't post links since I'm on mobile.


Thanks, I will take look at both of those.


Before you spend some time doing mostly functional work, it's easy to ignore just how much meaningless state you might have floating around. You're right that any non-trivial endeavour will necessitate quite a lot of information being kept around. But in my experience most of the state you produce in languages that don't support functional patterns isn't even related to the domain. You're often forced to put things into variables, objects and all sorts of places that might not make a lot of sense.


The original article wasn't talking about issues caused by storing too much program state in in-memory structures (variables/objects) though. It was talking about issues introduced by interactions with an external stateful system--a database.

Bugs related to in-memory state that gets out of whack are definitely a hassle--especially in multithreaded situations--but are, in my experience, only the tip of the iceberg. Once you have that kind of issue under control (either by choice of platform, discipline, linting, or unit testing) there is a huge category of problems that can arise from the remaining surface, categorized loosely as "as soon as your code leaves its own memory space, it's part of a distributed system, with all the hassles that entails". Proclaiming that FP techniques resolve or even significantly ease that category of issue seems debatable at best, and a false promise at worst.


> Do you have anything that explains this complex topic?

> Something that assumes I don't know anything about the pre-requisite topic would be helpful

Unfortunately the only way to gain a deep understanding of the powerful tools functional languages do give you is to gain a deep understanding of functional paradigms.

Me saying "Monads can help with state abstractions" doesn't help you.

The fact that large non-trivial applications have been written in haskell should be enough evidence that this is possible.

The fact that the majority of people involved in such endeavors have claimed that it made their code safer and easier to refactor and had relatively few bugs should be enough evidence that it's a good idea.

The fact that those who decry it more often than not do not know the subject should give their criticisms no weight.

I don't see why you need any more evidence that you should simply learn functional programming so that you may first-hand answer your own question.


> Unfortunately the only way to gain a deep understanding of the powerful tools functional languages do give you is to gain a deep understanding of functional paradigms.

> I don't see why you need any more evidence that you should simply learn functional programming so that you may first-hand answer your own question.

It is possible to interpret the GP comment as a request for good resources to learn FP. You seem like you could probably point one or two out. Since they've already expressed a desire to understand, that might be more beneficial for them than the equivalent of "just do it".


Thank you. Basically I was looking for the functional-programming equivalent to the OOP factory-equipment analogy, or asking if such an analogy does/can exist (if it can't, the why might also be a valuable explanation). That analogy can be used to explain the main ideas behind encapsulation, interfaces, and internal state, and why they are valuable, before having any understanding of how those are implemented in practice, and even without any programming knowledge at all. So far everything I've found on functional programming starts with implementation details and provides no initial framework for understanding the broad-strokes of organisation and value proposition.


You might like Rich Hickey's talks, especially "The Value of Values", "The Language of the System", and the 2018 Conj keynote. The talk on transducers also has a nice analogy: the conveyor belt.


> Unfortunately the only way to gain a deep understanding of the powerful tools functional languages do give you is to gain a deep understanding of functional paradigms.

If I didn't know what 'functional' was in this context, I'd be entirely convinced you were trying to convert me to your religion with that statement. This attitude from the functional-programming community is the main reason I have avoided learning more. I was asking a legitimate question and instead of providing something useful you decided to assume I don't know anything about the pre-requisite topic and spout verbose dogma at me that does nothing but attempt, and fail, to make you appear intelligent.


By definition, pure functions can't read from or write to disk, can't read from or write to the database, and can't read from or write to a socket.

If you aren't able to do any of those things, there is no "at scale" - you're trapped in a single process on a single computer, spinning away but not able to communicate with anyone else.

Pure functional programming helps within an individual component by letting you quarantine the statefulness at defined areas (e.g., in Haskell, at the program's entry point), but it doesn't let you make it disappear. Sooner or later, you're gonna have to deal with it.
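
A toy sketch of that quarantining idea in Python terms (sometimes described as "functional core, imperative shell"; all the names here are invented):

    # Functional core: pure, trivially testable, knows nothing about sockets or databases.
    def score(a, b):
        shared = len(set(a["interests"]) & set(b["interests"]))
        return shared / max(len(a["interests"]), 1)

    def compute_matches(profile, candidates):
        ranked = sorted(candidates, key=lambda c: score(profile, c), reverse=True)
        return [c for c in ranked if score(profile, c) > 0.5]

    # Imperative shell: all the statefulness lives here, at the edge.
    def handle_request(user_id, db, responder):
        profile = db.load_profile(user_id)        # side effect: database read
        candidates = db.load_candidates(user_id)  # side effect: database read
        responder.send(compute_matches(profile, candidates))   # side effect: network write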

At scale, you've got tens or hundreds or thousands of different processes, and each of them has its own statefulness that it ultimately has to deal with. When you're looking at the big picture like that, I'm not sure it makes sense to worry about whether it's been quarantined, Haskell-style, or not. It's still there, and, in the big picture, it's still distributed more-or-less homogeneously throughout the system. Even a process that doesn't use disk or talk to the DB or anything like that is still not referentially transparent, because it's still subject to side effects by virtue of the network being unreliable, the OOM killer being unpredictable, and all that fun stuff.


> I'm not sure it makes sense to worry about whether it's been quarantined, Haskell-style, or not. It's still there

True, but the quarantine makes the bug much easier to find.


Wow.

1. We don't test.
2. We don't code review (or rather, if we do, we do it so poorly that swallowed exceptions don't raise red flags).

That's an outrageously unprofessional software process.


I washed out of their interview process when I was fresh out of high school for not knowing bit twiddling hacks[1] in the technical phone screen with one of their engineers.

After reading this, I can't help but feel like I dodged a bullet there.

[1] https://graphics.stanford.edu/~seander/bithacks.html


Don't feel bad: I cut the phone screen short about 25 minutes into it. I just got a really bad vibe about all of the front-end complexity they were building into a dating site.


An outrageously unprofessional software process that created a product worth $50M!

https://techcrunch.com/2011/02/02/match-com-acquires-online-...


Created a product worth $50M because of their software process, or despite their software process?


That's not skill, that's luck. If the bug in question had instead leaked every customer's personal data there wouldn't be a $50m company, there'd be a multi-million dollar lawsuit.


I agree with your thinking, but with this particular example: you'd wish that were true. See: every other data leak that happened over the last decade. It seems data leaks are harmless to companies.


I was going to say "not for a dating website", but I see that Ashley Madison still exists despite the massive 2015 data leak, which included data from users who had paid to have their data deleted. So yes, it looks like there's no real fallout from data leaks.


The OP is about technical process in the software world. I'm not sure how a company's valuation comes into the conversation...


I assume the implication is "if OKC can fly by the seat of their pants and still make millions, anyone could!" Which is true, but still not advisable...


I think the implication is that they achieved the goal they had, to grow a company. Note that goal doesn't mention code at all.

It depends what you think software engineering is: a means to an end (e.g., growing a company), or a craft you want to perfect and do for your whole career.

I think there are more craft coders, because most jobs are at companies that have already survived, where there is more to lose from errors than there is to gain from moving fast. The craft attitude may not be ideal when you want to go from zero to something while avoiding going back to zero very quickly. Or, perhaps more importantly, when you want to go from zero -> something wrong -> something wrong again -> something almost correct -> product-market fit.


Given that technical debt is quite hard for non-technical people to quantify, why wouldn't we expect that most startups would tend to accrue technical debt?


Can anyone quantify technical debt? Best case is something resembling a medical diagnosis.


I think that could be enough. Actuaries have a good idea about whether you are likely to die in the next two years: basically, when you are old and your medical costs suddenly double. Doctors use the same sort of inductive reasoning for their prognoses. The industry collectively probably has all the information needed to put together useful actuarial tables for technical debt, but it's all siloed in individual companies.


Interesting. Are there IT consulting firms that specialize in this field of study?


Startups accrue so much financial debt, adding some technical debt on top doesn't really make a difference.


Technical debt doesn't accrue in a linear fashion; different debts compound one another. Up to a certain point, it's a noticeable drag. Past that point, the difficulty changes in kind.

In the article, the combination of some "you're-supposed-to-just-not-do-that" with:

> If (the database throws an error) { do nothing }

resulted in a major mishap.


The point is that engineering decisions need to involve both the business and the technical side of things, and they need to be pragmatic and not tied to any particular dogma from either world.


It's a free dating service. They have a responsibility to respect their users' data privacy, but the rest kind of doesn't matter. Shipping now and stabilizing eventually is more profitable than delaying release until it's stable, unless you're a bank or making self-driving cars.


Yeah, I expect the devs at Ashley Madison told themselves that too.

It's possible to build sandboxes where security or privacy bugs are contained, but the shops that don't care much about software quality are probably not doing that.

It seems a little too easy for people building a social app to tell themselves they're just building a toy that doesn't matter. Their users might disagree after they get hacked.


I'm sure the OkCupid folk are nice people but it was:

1) founded by mathematicians and an academic, and 2) written in C++, with a custom templating language

Even giving them the benefit of the doubt that the core matching algorithm warranted C++, the whole site really shouldn't have been built that way.

I'm really not surprised there's poor technical decision-making there.


In my experience this ‘outrageously unprofessional’ process is very normal. Most companies are not software companies, and they are not going to pay for anything that doesn't add to the bottom line. And even in software companies that of course happens. It is unfortunate, but the shock and awe are a bit over the top; when I talk with partners (CTOs in large corps: banks, insurance, retail), they generally glaze over and mumble about their dreams of doing tests and code reviews, but unfortunately it is not possible ‘at the moment’.


You have to judge it in context. If it's entirely a senior team, then the value of code reviews is greatly diminished. You rarely catch a bug in a code review when you're reviewing a senior software engineer's code; those bugs aren't caught until you do testing.


I've seen plenty of defects in senior developers' code that were caught in reviews. Plus, the purpose of code reviews isn't just to catch defects; it's also to disseminate knowledge of that module across the team in order to reduce the "bus factor".


Outrageously unprofessional as it may be, they probably ended up laughing on their way to the bank when they got purchased by match.com.


> That's an outrageously unprofessional software process.

People who call me unprofessional forget that being a professional just means I'm paid to do it; it's not a pastime.


It worked well enough until they could do something better. Never let perfect be the enemy of good.


I was immediately struck by the same set of thoughts. I don't know whether to be appalled or just ... sad.


Yet what I found even more troubling was that in order to write effective tests, a programmer had to know all of the ways that a piece of software could fail in order to write tests for those cases.

So the programmer needs to think about all the ways the code can fail. Writing the tests concretely documents that thinking in the source code repository and automates re-checking it. A really good programmer then tries to organize the code such that it's easy to think about how any given unit of code can fail. Writing unit tests rewards such organization and penalizes bad organization.

That said, TDD doesn't seem to work out that way in practice, most of the time.


After reading the introduction, I thought that the author was going to use a story of software failure as an example of why you should write unit tests, or at least why you shouldn't deploy untested software.

However, the moral of the story was essentially: "software is so complicated that it is bound to break, so you have to be good at fixing it". While that is certainly true, I think that developers have a responsibility to use whatever tools they can to write high-quality software.


> However, the moral of the story was essentially: "software is so complicated that it is bound to break, so you have to be good at fixing it". While that is certainly true, I think that developers have a responsibility to use whatever tools they can to write high-quality software.

I actually disagree that we have a responsibility to write high-quality software. As engineers our job is to create software that is good enough for the task at hand and that can be improved later.

Sometimes that means that you invest in 100% coverage plus integration and system-level testing because "good enough" means that it has to never fail. Sometimes that means you don't spend much time automating tests and instead do informal development testing before you check something in.

It's entirely up to your judgement based on what your code is going to be used for, and I don't think it was inappropriate for OkCupid's engineering staff to avoid testing in this case. After all, their business was really successful through the period where the error happened.


> As engineers our job is to create software that is good enough for the task at hand

This is the essence of engineering. As the saying goes, "Any fool can build a bridge that won't fall down. It takes an engineer to build a bridge that just barely won't fall down."

In other words an engineer's contribution is not perfection, it is being able to know the difference between "good enough" and "needs more work" and to apply that knowledge to meet time, cost, and material constraints.


Every bridge will fall down, it's just a matter of time.


> responsibility to write high-quality software

Since GDPR, there's now an effective legal minimum of quality; you need to protect user data and keep it accurate.


That moral seems related to a meme floating around that Mean Time To Recovery is more important than Mean Time Between Failures. I think it came from, or through, John Allspaw:

https://www.kitchensoap.com/2010/11/07/mttr-mtbf-for-most-ty...

It's popular with a lot of people I know, because it's one of those ideas that's counterintuitive but believable, and so makes you feel really smart for knowing it.


The reactions to the outage on Twitter are hilarious and make me glad this happened:

“@okcupid how am I supposed to get my daily dose of crushing rejection and emotional humiliation if your site is down????”

“Okcupid stops working over lunch hour, NYC wonders if we're committed to the people in our phones from now on, panic in the streets”

“@okcupid How can I continue to be ignored by the females of the world if they don't know where I am to ignore me?! #panic #freakout”


> ...in order to write effective tests, a programmer had to know all of the ways that a piece of software could fail in order to write tests for those cases

That's impossible. You reason about the problem as best you can and create tests appropriately; then, if there are edge cases or failures in production, you go back and add those to the test suite.
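
That loop is easy to make concrete. A minimal sketch, assuming the hspec test library, with an invented function and an invented incident:

    import Test.Hspec

    -- Hypothetical function under test.
    normalizeAge :: Int -> Maybe Int
    normalizeAge n
      | n < 18 || n > 120 = Nothing
      | otherwise         = Just n

    main :: IO ()
    main = hspec $
      describe "normalizeAge" $ do
        -- Cases written up front, from reasoning about the input domain.
        it "accepts a typical adult age" $
          normalizeAge 34 `shouldBe` Just 34
        it "rejects ages below 18" $
          normalizeAge 17 `shouldBe` Nothing
        -- Regression test added after a (made-up) production incident in
        -- which an earlier version let negative ages through.
        it "rejects negative ages" $
          normalizeAge (-1) `shouldBe` Nothing

The suite never claims to cover every possible failure; it just grows to pin down each one you've actually reasoned about or been burned by.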


> If she forgot the square root of -1 was undefined, she’d never write a test for it.

Sigh. sqrt(-1) is not “undefined”.
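
To put it concretely (a small Haskell illustration, not tied to the article's code): over IEEE doubles, sqrt(-1) is NaN, a perfectly defined value you can and should test for; over the complex numbers it's simply i.

    import Data.Complex (Complex(..))

    main :: IO ()
    main = do
      -- Over the reals (IEEE doubles), sqrt of a negative number is NaN:
      -- a defined, representable value, not "undefined".
      print (sqrt (-1) :: Double)           -- NaN
      print (isNaN (sqrt (-1) :: Double))   -- True

      -- Over the complex numbers it's i.
      print (sqrt ((-1) :+ 0) :: Complex Double)  -- 0.0 :+ 1.0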


Static typing helps.

Higher order programming helps.

These are two reasons I love Haskell. The compiler does a lot of the work of understanding your code for you, makes it easy to query that metadata interactively, and lets you refer to it statically from other parts of the code.

Haskell is of course not the only statically typed language, but its type system encompasses more, and more uniformly, than any other I am aware of. A lot of things that used to have to be part of the comments can now be first-class citizens.
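
A small, invented example of what "comments becoming first-class citizens" can mean: distinctions you would otherwise bury in prose (this Int is a user ID, not a message ID; this lookup can fail) are stated in the types and checked on every compile.

    module Ids where

    -- Hypothetical domain types: the compiler now enforces what a
    -- comment could only suggest ("don't mix these two up").
    newtype UserId    = UserId Int    deriving (Eq, Show)
    newtype MessageId = MessageId Int deriving (Eq, Show)

    -- "This can fail" is visible in the signature (Maybe), not hidden
    -- in documentation.
    lookupDisplayName :: UserId -> [(UserId, String)] -> Maybe String
    lookupDisplayName uid = lookup uid

    -- Won't compile: a MessageId is not a UserId, even though both
    -- wrap an Int.
    -- bad = lookupDisplayName (MessageId 7) []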


I'm reminded of the day that I discovered SQL cross joins. Some decades ago, working in Access on WinXP. So I executed, and nothing much happened. So I went for lunch, and when I got back, the machine was frozen. So I went WTF?, and hard reset. And then did it again, a few times.

And the cool thing was that doing that, and running out of disk space, apparently didn't damage the system. Microsoft must have designed WinXP to tolerate that.


You write tests not only to (hopefully) catch bugs, but so that you can make significant changes to the software reliably in the future and not lose forward momentum.


Yep. I caused a similar problem in the past that brought down a live site. It was a cascading failure in error handling: an avalanche of retrying requests piled up until more and more servers failed under the load. Not fun. Luckily we had well-defined deployment and rollback procedures and were able to roll back the change easily.
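
One common mitigation for exactly that kind of retry avalanche is capped exponential backoff on the client side, so a struggling backend sees less traffic instead of more. A rough Haskell sketch (function name and constants are made up):

    module Retry where

    import Control.Concurrent (threadDelay)
    import Control.Exception (SomeException, try)

    -- Retry an action with capped exponential backoff instead of
    -- retrying immediately in a tight loop. Each failed attempt waits
    -- roughly twice as long as the previous one.
    retryWithBackoff :: Int -> IO a -> IO (Either SomeException a)
    retryWithBackoff maxAttempts action = go 1 100000  -- start at 100 ms
      where
        go attempt delayMicros = do
          result <- try action
          case result of
            Right a -> pure (Right a)
            Left err
              | attempt >= maxAttempts -> pure (Left err)  -- give up, surface the error
              | otherwise -> do
                  threadDelay (min delayMicros 5000000)    -- cap the wait at 5 s
                  go (attempt + 1) (delayMicros * 2)

In practice you'd also add jitter so all the clients don't retry in lockstep, but this shows the basic shape.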


> “Users don’t care.”

This is the counter to every technical bikeshedding post. Either it creates revenue by being what users want, or it's a waste of time. And people seem to love wasting time instead of delivering.


Programming would be trivial if things were so simple.

The reality is that "either it creates revenue by being what users want", or it enables creating more of that later on, or keeps the speed of creating more of that from dropping, or prevents the whole thing from suffering catastrophic errors (like in this article), ...

Those are the trade-offs engineers have to consider daily.


In a surprising turn of events, the users showed they did care.

> “@okcupid how am I supposed to get my daily dose of crushing rejection and emotional humiliation if your site is down????”


Is this really about testing?

> If (the database throws an error) { do nothing }

It's paraphrased; the original was probably the C++ anti-pattern of

> try { /* some code */ } catch (...) {}

Whoever wrote this should be disciplined. This kind of code doesn't do anything; it just makes a bad situation worse. (It's also never tested, because who writes tests that provoke failures?) This kind of idiocy has no technical solution, unless the two-by-four counts.

If you're ever tempted to write something like this... don't. Especially in C++, the correct approach is to write nothing at all: let the exception propagate, and someone further up the stack will handle it. Code that isn't written doesn't cause trouble and doesn't need to be tested either.
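
For contrast, here's the same anti-pattern and the "write nothing" advice side by side, sketched in Haskell rather than C++ (names invented):

    module Swallow where

    import Control.Exception (SomeException, catch)

    -- The anti-pattern: catch everything and pretend nothing happened.
    -- Callers now believe the write succeeded.
    swallowEverything :: IO () -> IO ()
    swallowEverything write = write `catch` ignore
      where
        ignore :: SomeException -> IO ()
        ignore _ = pure ()

    -- The advice above: don't catch at all. The exception propagates to
    -- a caller that is actually in a position to handle or log it.
    justDoTheWrite :: IO () -> IO ()
    justDoTheWrite write = write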


Sometimes, when I want to see how a piece of code works, I can just run the unit test. Otherwise, I may need to start the server, prepare some unrelated data, and send a request to the server.


A very well written piece that sums up the state of play without any sense of panic. That’s pretty rare for this sort of article.


Test your code, or stay the hell away from me. :)


(Off Topic -- Rant regarding Webpage)

Can I take a short moment to bitch about the "sidebar" frame, or whatever the hell this domain is using? The webpage looks like a fluid single frame, but the left side is static. The browser page has a scrollbar, but I need to have my mouse over the correct div before my scroll wheel will move the page; it's a bit aggravating, since there's no visual break between the "sidebar" and the content.


I just turned off CSS in Firefox (View menu -> Page Style -> No Style) and the fixed sidebar was no more and the odd scroll bar behavior was gone.


I used Firefox's Reader Mode.



