Kent Beck: “I get paid for code that works, not for tests” (2013) (istacee.wordpress.com)
550 points by fagnerbrack on Dec 8, 2016 | 354 comments



But we don't write tests to check if our code works. We write tests to be able to change it in the future with a certain degree of confidence that we don't break anything - and, if we do, to know exactly what.

There are other techniques which can give similar confidence, but tests are the easiest one.


I agree with your assessment that the primary reason we write unit tests is to be able to quickly make changes in the future without fear of breaking something: You do some extra work today so that you can save a tiny bit of work (and worry) tomorrow, and over time that accumulated future benefit outweighs the penalty paid today.

However, many in the industry forget that this is the underlying reasoning, as seen just two days ago here on this site. Read through the top-rated comments on this post: https://news.ycombinator.com/item?id=13119138

They describe an emergency situation where a single "3" needed to be changed to "4" ASAP or people would lose their jobs, and everyone's applauding the gatekeepers who insisted on significant refactoring and the creation of additional tests before the change could be approved.

I agree with those who say those improvements should maybe have been demanded immediately after the fire was out, but those who would have delayed the firefighting out of blind allegiance to the rules seemed, to me, to have forgotten that the rules are there to serve the programmers (particularly, their ability to quickly ship working code), and not the other way around.

A rule that's failing to do that should be changed or ignored.


I wonder what would have happened if he had initially told the CEO it would take 6 days for the change?

"We can do that, but it will take us 6 days, otherwise we risk taking the plant down and aggravating the issue."

I wonder if the CEO would have just said ok thanks.

In my experience that's the case. The engineer in that link got himself in a bad spot, because he didn't know what was involved in the change when he communicated his estimate. And most of the back and forth that slowed him down would have been avoided had he known beforehand how to do it properly. Even with everyone's feedback it sounds like a 1-day code change. It seems to me the real reason the change was slow was ramp-up time for him, working on a code base he doesn't normally work on.


I agree that the best possible answer to management would have been, "It'll take six days by the book, or if you approve some emergency rulebreaking and the issuance of technical debt, I can have it done in an hour. And then it'll take six days to clean up."

Then the boss can make an informed decision.


Yes, this. Provide options to management, with risk analysis, so they can make a decision. Management's priorities are not always the same as development's priorities.

The important thing here is to provide the risk-assessed alternative in writing. This covers the asses of both sides! If something blows up and management was not warned about the possibility, they're well within their rights to rake the engineers over the coals for it. But if engineers warn management of the potential consequences and management chooses to take the risk, if it blows up, you have your CYA right there - they were warned in writing.


This works if you trust your manager to take responsibility for the decision, which you often (usually?) can't.

If you can't trust your manager to take responsibility for the decision, then it's better to make a decision you're going to be held accountable for than to let your manager make the decision and then hold you accountable.

I've seen other engineers in the position where they give a risk analysis and warning in writing, and then shit hits the fan and they get fired. Maybe it doesn't happen immediately, maybe the reason they were fired isn't explicit, but the change in the manager's attitude toward the engineer traces back to when they did what the manager said.

There are also managers who won't follow through on the tech debt part because they don't trust their engineers, even if their engineers are trustworthy. When they discover that they can bypass testing by pulling the emergency lever, they'll start pulling it all the time, because they see it as a way to get those lazy engineers to do their jobs faster. And when the tech debt catches up with you, bugs abound, and development slows to a crawl, the engineers get blamed.

Maybe you have a boss you can trust to take responsibility for their risky decisions. Maybe your boss trusts you when you say that paying down tech debt is necessary. But maybe your boss and their boss don't have the same relationship, and the shit rolls down hill.

Yes, I want to work in a trusting environment where my interests are aligned with doing what's best for my company, but an at-will employment capitalist economy doesn't always work that way. It's every man for himself at a fundamental level, and exceptions to that are too rare to make a blanket claim that people should just do what's best for their company.


If you can't trust your boss to not scapegoat you when you provide a risk analysis in writing, you need another job.


Eh, it's not all that big of a problem if you know how to deal with it (protect your own interests, not the company's, by making conservative choices as I described above).

I work 35 hours a week, make $125k/year, have good benefits, like most of my coworkers, and am doing relatively interesting work for a company that isn't completely evil. Sure, my boss can scapegoat me if I let him make technical decisions, but I don't. That's a tradeoff I'm okay with. I'm always on the lookout for something better but I'm happy where I am.

And besides, most people who think their bosses won't scapegoat them if something goes wrong are naive. If it's your job or theirs and they have the power, you're screwed. It's better to avoid the risk and not put yourself in that situation.

To be clear, I'm not saying don't take risks. I'm saying take risks on your own behalf.


[flagged]


I'm not proposing you shouldn't work hard, and I don't work for a large enterprise. I'm unsure how you got that from what I said.


I agree with you. I think more often than not, you need to withhold the inferior alternative, or be very careful in how you present it. If you downplay the risks, or don't emphasize them enough one way or another, it will still end up being your fault, even in writing.

Trade-offs that are business related can and should be delegated to the business to make. But saying something like "if we do A, it could make us vulnerable to technical problems X, Y and Z, but would give you what the business needs 7 days early" can be dangerous, because it assumes the business stakeholders truly understand the dangers of technical issues X, Y and Z, which they almost always do not.

As an engineer, you should advise the business towards the proper way to do things, and you are responsible for making sure your advice is loud and clear. It is sometimes appropriate to suggest the less ideal scenario, but if you do, be ready to assume full responsibility and ownership of it.

You should never give advice like "Yes, we can do it, but it'll cause other problems" and expect those other problems not to be your responsibility to fix and mitigate as they appear. Implicit in every suggestion you make is: "I can make this work." So be sure you can actually make whatever you suggest work.


"Yes, I want to work in a trusting environment where my interests are aligned with doing what's best for my company, but an at-will employment capitalist economy doesn't always work that way."

What economy does work that way?


Employee-owned cooperatives sidestep this issue pretty well by empowering workers. I don't know of any entire economies, though.


It's important not to understate that the two options offered were really a false choice: you can have it the right way with Plan A, or the right way with Plan B. There was no offer of a Plan C where we hack something in and just hope for the best.

Plan C is what the Strategy guys always want to go for, because they can't recognize the repercussions when they happen. They just tell themselves it's those good-for-nothing engineers screwing up again.

But too often there's been one guy in the engineering group who values accolades over stability and will offer to be a hero, only to wander off without finishing anything. Later in life I've realized I should have suffered through more of the issues at companies where the team at least had solidarity. I knew I wanted that in a team, but didn't know that I needed it.


That's why you don't offer a Plan C. There's a fast-but-dangerous Plan B, with a risk analysis. You've warned them what can happen, in writing. Now, if something goes wrong that you didn't warn about, there could be other repercussions, but that's still better than not warning them at all, which easily makes you the "But no one told me this could happen!" scapegoat.

Plan C doesn't differ from Plan B in terms of technical approach. It differs in terms of the consequences of failure - not the technical consequences, but the political consequences.

My first really interesting project that led me down the sort of DevOps/Agile trail was in the mid-1990s, when I helped write a risk management and contingency planning process for the mid-sized company where I worked. One of the features of the process was that management could not reject a risk analysis on any grounds except incompleteness. They couldn't say "I don't want that written down!" They had to sign off on it, in writing.

This turned out to be very popular with both teams and management. Teams felt they were finally getting the opportunity to cover their asses, and management finally felt like they were getting straight answers from the teams about actual risks. In the earlier lack of process, teams were at least passively discouraged from honest, written risk analysis, as naysayers who were resisting business opportunities. Worse, if a warning was given and then the problem happened, the people who warned were blamed for not preventing the problem!

When we beta-tested the analysis process, the first project that tried it actually turned down business because it was too risky. That had never happened before. We probably saved the company many millions.


Manager agrees, rulebreaking then fix later.

Later... never arrived, for tomorrow there is another Very Important Thing that must be done now!

Somehow management continues to forget the critical resource: time to do things right


Yeah, but the problem is that managers are not engineers; they want "results" and do not have sufficient software engineering knowledge to be aware of the effects of making this type of decision over and over again.

Eventually the software engineering department has a code base that is riddled with tech debt, to a point where changing a 3 into a 4 ACTUALLY takes 6 days.

Then management asks "WTF? What did you do? Why does it take 6 days to turn a 3 into a 4?", next comes frustration and SWEs leave the company.

I've seen this happen at 3 different companies in 3 different industries, a huge company (hundreds of devs), a medium company (50 devs), and a small company (2 devs) [1].

Every time it happened, it was because a SWE team was constantly delivering under pressure by management that disregarded the cost of writing code "that works" without ever addressing tech-debt, no matter how much the SWEs warned management.

[1] To be fair the small company only fell into that anti-pattern because neither the engineers nor the managers knew any better.


Perhaps the reason you've seen this phenomenon everywhere is that the companies that don't do it tend to fail, whereas the companies that do tend to succeed and grow and hire people like you.


This was one of the most important lessons I learned at my first job out of university. Beautiful, well structured, clear, simple code is nice, but the customer doesn't give a shit about it. All the customer cares about is that your product solves their problem.

By all means, if you have additional resources, invest them in refactoring and improvements and additional testing and continuous integration and all those things that make our days enjoyable and our product quality high. But your first priority has to be making sure that your product solves your customer's problem.


I disagree with this. Management doesn't need to micro-manage everything; that's certainly not their job role. This was a simple change in an internal system. Yes, maybe this minor, super-isolated change could have broken something in some parallel world, but the stuff it would have broken is not an airplane computer or a heart machine. Bloody hell, sometimes it is important to put things into context. Engineers get paid a hell of a lot of money, and you would expect that someone with that qualification and level of experience would be capable of making smart decisions independently without being micro-managed by a manager all the time. After all, the engineer is the real expert on his/her code and can make a much better assessment of the impact of the change than the manager, so freaking do it! That's what you get paid for! Seriously, there was no excuse for this to take 6 days.


Increasing the backlog by a month is not so urgent that 6 days is a problem. It's useful to have a process in place for more urgent hotfixes in production, but I don't see why that would be needed for this particular case. And rushing changes for a mission critical system can be very dangerous.



More accurately: by the parent of that comment.


This handshake adds yet another day to the process though.


If you read closely, it was only 6 days because the CEO did some rulebreaking at the end. I imagine it would have been several weeks otherwise.

At some point organisations forget that the process is there to serve us, and not us to serve the process. Moving variables to parameter files, renaming legacy variables, these all seem like much more risky things than simply changing the value of the variable.


> I agree with those who say those improvements should maybe have been demanded immediately after the fire was out, but those who would have delayed the firefighting out of blind allegiance to the rules seemed, to me, to have forgotten that the rules are there to serve the programmers (particularly, their ability to quickly ship working code), and not the other way around.

I think it depends on the type of product you are working on. Certain domains require very strict adherence to policy - for very good reason. Just because your shiny new aircraft's deadline is tomorrow, doesn't mean you can skimp on the required testing.


I think the linked thread is a good example. Changing a 3 to a 4 can seem minor, but even with rules in place it is not a burning 'fire'; it should be easily acceptable to run this with the rules enforced in a day or two. Letting people override the rules does more damage in the long run for sure. Maybe they would override them next time, when they shouldn't.


Sorry, but you don't get to make that call; it's the CEO's job, and the CEO said it needed to be done immediately, not in "a day or two".


I wholeheartedly disagree. In fact, I believe that's the difference between an engineer and a programmer.

If you are hired as a programmer, then yes, just do whatever we ask of you.

But if you are hired as an engineer, everything the business asks of you comes with an implicit: "and make sure it's done in a proper way that won't break anything, or slow us down, or cost us too much, or limit our ability to gain a competitive edge."

You don't just change a 3 to a 4 because the CEO wants you to. You have to make sure the change doesn't come with unforeseen impact that would put the company at risk, and you have to make the change in a similarly proper way. That's what the CEO expects as well. If you made the change, and it caused an impact to the business that you had not pointed out, and that the business believes is more harmful than having waited a few more days, you and only you are to blame - and you will be. You can't say "but the CEO told me to": you're the engineer, you're the person they hired to know this stuff and prevent these issues from happening, not the CEO.


The point is that decisions should be made by the people who have the proper information to make them.

If the CEO lacks the understanding of the technical consequences of a change that may blow up the company, the engineer should make the decision. If the engineer lacks understanding of the business consequences of not making the change - like losing an important client, or suffering a wave of negative PR, or facing a lawsuit - then the CEO should make it. Ideally, both sides should be communicating these consequences so that both of them have all the relevant information and would ideally make the same decision. Then the decisions can get made at the lowest level that has all this information, and the CEO doesn't have to get involved.

In practice, there are many cases where the CEO can't communicate all of the relevant business realities, eg. if you're facing a lawsuit if you don't make a change, it's often better not to worry the rest of the staff or make them subject to depositions, and simply to ensure that the change gets made. That's why the CEO is the decider by default in organizations, and also why it's usually expected that employees will obey a direct order from the CEO or be fired.


I'm not sure we're actually in disagreement. If Captain Kirk tells Scotty, "We need warp 9 in five minutes or we lose the ship" and Scotty cuts corners that cause a decompression explosion that kills twenty redshirts, then yes, that would probably be an example of bad engineering management.

But if Scotty delivers on time at the cost of overloading an expensive piece of equipment that, after the battle is won, requires a week in drydock to replace, that's probably a successful execution of exactly the kind of call a senior engineering manager is expected to make.


If Scotty didn't confirm that it's important enough to risk serious damage, he didn't do his job. His job is not to blindly trust the captain's omniscience; his job is to inform the captain of the technical consequences. If warp 9 in 15 minutes is preferable to taking damage, then that's the better option.

In the example with the line of code that took 6 days, there was no dramatic emergency in production that required cutting corners. If it had been an emergency, of course the code refactoring demanded in code review should have been postponed; those changes increased the impact of the change, and therefore the testing requirements.

And if it really is an emergency that requires people to drop what they're doing, then someone with sufficient authority should be directly involved in order to override all the usual procedures.

But you don't just drop all procedures just because somebody claims somebody said something. That would be dangerously irresponsible.


Could be we don't; most disagreement is miscommunication.

In your example, Scotty knew what he was doing though. He didn't say, wow, what Kirk wants me to do could kill twenty redshirts in the process, I'll just take the gamble since he seems to want me to. He knew exactly what the impact would be, and made the call knowing he would easily be able to contain it.

Which is often not the case in software in practice. You have to do the thing to know the impact, because most of the problems we solve are new - not something we've done many times before. If that variable had been changed often, it would be completely different, because he'd have known, just like Scotty, that it's something they can do. In that case you can make the choice to say, let's change it, and handle the tech debt of the less maintainable code later.

Also, in software, it's almost never the case that people can't wait a few more days.


It always amazed me how little Star Trek used robots, even just remote arms, to do things. (Yeah, it's a side-effect of it being made for TV)


Trek wasn't made for logical consistency. They could duplicate Riker with a transporter accident, but they couldn't make another Data that way, they had an entire episode about deconstructing him, then another about him constructing a 'child' robot.


This separation of programmers into mindless programmers and somehow superior engineers is bullshit. You have to follow best practices in whatever you do.


Some people know how to program. Other people dedicated years to the study of computer architectures, first order logic, distributed systems, fault tolerant systems, etc ...

In my opinion, computer science and engineering is not just about writing the code you're told to write; it's about questioning whether that code needs to be written in the first place, and if so, how.


Most software development is not done, or meant to be done as an engineering discipline. It is far more often a craft. That said, it depends on the project, environment, company and legal requirements.


For that, hire a programmer. Need a dependable architecture for mission critical software? You get the point ...


Define mission critical? Can be down for a 15-minute update once a week? Must have transparent updates? Should be up most of the time? Must be up during East Coast business hours?

It still depends.


> If you are hired as a programmer, then yes, just do whatever we ask of you. But if you are hired as an engineer, everything the business asks of you comes with an implicit: "and make sure it's done in a proper way that won't break anything, or slow us down, or cost us too much, or limit our ability to gain a competitive edge."

This is a distinction that is a very thin line and most people with "engineer" in their title would not sign up for.


True, most engineers, and others, will commit crimes, including criminal negligence, when asked firmly enough by a powerful corporation that can fire them. See Flint, Michigan, the great robbery of 2008, etc. But the idea of a profession is that you don't - that you have professional ethics.


Yeah the term engineer is thrown around pretty casually in the software world. You can apprentice and take the PE (Professional Engineer) test, but it's often hard to find software engineers with their PE certificate. It's much more common in other fields like Mechanical, Electrical, and primarily Civil.

In all my years (15) of professional experience, I've only worked under one PE, an Electrical Engineer.

In some states you literally can not have "engineer" in your job title unless you have a PE certificate/accreditation/whatever.


Exactly. As an engineer, no matter what your boss says, you are responsible for the consequences of your actions. If the CEO tells you to do something which you know is unsafe, you must decline (and explain why, preferably).


Having read that article, I do agree - but the requested change, while being "one line", could have unforeseen consequences that could also have a very negative effect on the business. I'm thinking that the CEO wouldn't take the blame for that though.

But if people's jobs were truly on the line, I'm inclined to agree with the "screw it, push it through" approach.


Sorry, the CEO's job is not to talk about nitpicky technical details. He's got no say in what the code or a test should be.


Ok this is the problem with story telling tbh:

David: It's for Philip. It we don't do this right away, we'll have to have a layoff.

and

Judy: OK, then I'll fill out that section myself and put this on the fast track. ----- 2 days later. ----- David: What's the status of 129281?

"It we don't do this right away, we'll have to have a layoff" and "2 days later", making an impression of this is an "a day or two" task, not a "fire/emergency"

I don't think anyone would have been unhappy if this change had been finished in 2 days. But if you took this to production and it somehow failed, everyone would blame QA/testing.


Those rules that were overridden were basically QA bikeshedding about variable names in something that has never changed before, and insisting on tacking on refactoring while trying to put out fires.

The engineer did their due diligence, wrote tests to make sure it had the desired behaviour, and got it done in minimal time. Clearing technical debt in old modules should be done, I agree, but not while trying to put out fires. It adds considerable risk to a change which should not have any impact except for the request.


The ideal solution is neither always enforce the rule nor always permit lapses. Are you saying you don't think human beings are capable of coming close to that ideal solution?


In my experience, no, they cannot exercise judgment and an absolute system has to be in place.

In theory, could informed, intelligent, rational actors without ulterior motive do so, sure. I'd sleep on a couch and eat ramen for the chance to be part of such a team, but I haven't met them.


> everyone's applauding the gatekeepers who insisted on significant refactoring and the creation of additional tests before the change could be approved.

I certainly didn't applaud the gatekeeper because I believe they opened themselves up to a great risk. If you are doing an emergency patch to production you should minimize the amount of changes. All the refactoring of code was an unnecessary, reckless risk. The refactoring should move to the next scheduled release as the top priority since part of it is already in production.


The scenario in the linked post was not a "fire", and it makes sense that it would require going through the regular process. If it were a fire then where was the CEO after step one? Why didn't he (or the 'operations manager') send out an email to everybody informing them to bypass the regular process?

A fire is something like, "the site is down for XX% of customers!" (for a large value of XX) or the "the software is routing product to the wrong place!" No doubt a real fire would have been handled differently.


Everything in moderation.


That's rather extreme.


including moderation.


I prefer a slightly different translation of that Ancient Greek saying: nothing too much


For a moderate amount of everything.


I think most people here have had the experience that the agreement will be made that the fixes will come "later". And then later never comes.


> But we don't write tests to check if our code works.

I write tests to check if my code works. And tests that document how the code is supposed to work currently are usually enough to prevent code from breaking in the future.

Anything related to privacy or security should be fully tested. But for the typical startup, I'd posit that beyond that test coverage should be more closely related to the number of users and level of usage rather than to the amount of code.


The number of times my tests have 1) found bugs in my code, and 2) exposed problems with the API I was about to publish... uncountable.


More generally, testing effort should be distributed according to a risk analysis. Even a one-off statistical report can have serious consequences if far-reaching decisions are made on its basis.


> And tests that document how the code is supposed to work...

For me the most important word here is 'supposed'.

All the time I read documentation that describes how code works step by step (what each 'if' does, just spelled out more verbosely), and tests that only check what a function does by mocking out everything else.

But I don't care about reading what code does. I can see that by looking at the code. I want to know what the developer intended/expected the code to do, so I can validate that against what the code actually does. Most of the time, assumptions are baked into those expectations. And with those expectations you have a much better idea of why a trivial refactor of a piece of logic could unearth a massive 'undocumented feature'.


> Anything related to privacy or security should be fully tested.

How often do you write code that's not related to privacy or security? As soon as you connect something to the Internet it's related to privacy and security.

The only situation where privacy and security don't matter a whole lot is if your code runs on airgapped devices with very limited tasks.


I rarely write code that connects to anything, so almost never? There's a lot of code that's not written by web devs.


> I rarely write code that connects to anything, so almost never?

Sounds hard to believe. What kind of code would that be?

> There's a lot of code that's not written by web devs.

There's also a lot more than the web that has some form of connectivity with the Internet (even if it's not directly connected it may still parse data that comes from untrusted sources).

There is a widespread belief among many that "security is important, but doesn't matter for me". The most extreme example is obviously IoT ("Who would want to hack my coffeemaker?"), but there's a lot more. The unfortunate truth is: There is hardly any code these days that is not security relevant.


I write code that interfaces with hardware (health care imaging systems), processes images, and a little computer vision here and there. Very few service calls in that stack.

I jump up 1000 levels of abstraction from time to time, and when I do, I agree that security is extremely important (FDA class III device and HIPAA compliance is mandatory.) I'm also a lead, so I have to know enough to call BS when I hear it from a team member.


> > I rarely write code that connects to anything, so almost never?

> Sounds hard to believe. What kind of code would that be?

Device drivers, compilers, and some embedded systems come to mind immediately; there are plenty of others out there. I've worked on lots of software where the only inputs were physical and sensor-based, and the only outputs were to the screen. The device didn't even physically have network equipment.


>...device drivers...

I hope you're joking but I suspect you're not. Device drivers have the highest level of privileged access in many operating systems and code quality for drivers is so uniformly lousy (certain large vendors whose names begin with "N", "A", and particularly "Q", I'm looking at you) that attempting to break the drivers would be among the first things I'd consider if I were trying to root a device.


Please see my other reply. I was addressing "unconnected software" not "software without attack surfaces". I agree that security is a concern with all software.


Well, in a sense, a device driver can usually be thought of as very connected to a highly sensitive part of a computer system. If something goes wrong in a driver it will have a negative effect on the whole (monolithic) kernel and the rest of the operating system.


Device drivers are a bad example, they can be extremely security sensitive (particularly so because they usually have kernel privileges).

There are few things where security doesn't matter. But they are extremely rare. The situations where programmers think their code isn't security sensitive are probably vastly more common.


Good point, but I was addressing the perceived lack of "code that connects to anything". I agree with you that security is a valid concern for most if not all code written.


"Sounds hard to believe. What kind of code would that be?"

You do realize there's a bajillion non-networked apps in existence? Word processors, excel, editors/IDE's, system tools (esp monitoring/backup), media players that don't download stuff, compression libraries, MATLAB-style tools for numeric analysis, and so on.


> You do realize there's a bajillion non-networked apps in existence? Word processors, excel, editors/IDE's, system tools (esp monitoring/backup), media players that don't download stuff, compression libraries, MATLAB-style tools for numeric analysis, and so on.

All of them parse potentially untrusted inputs. They don't have to be directly network connected to be a security risk.

Just pick the first example: A word processor. It is not a security risk only if you can guarantee that you'll only ever open documents that you created yourself. If you ever use it to open documents you got from someone else it needs to take security into consideration.


Sadly, almost everything you mention involves opening files, and files are often on networks. The standard modal File|Open dialog on Windows allows http:// access to files on hosted servers, for example. Suddenly, all of these are potentially networked in the hands of users using the code for unintended purposes or in unexpected ways.


But you don't write those dialogs yourself. You write code that calls the system file-open dialog. Then, presumably, you do some sanity checking on the file (is it the right type? does it parse correctly? does the internal size number equal the value the OS is claiming?). If those checks fail, you report an error. Otherwise, you process it. The source doesn't matter at that point. And if there's a glitch in the dialog's handling of HTTP or similar sources, then that's on the system developers to correct, and on you to report if you discover it.
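
For concreteness, here's a minimal Python sketch of that kind of sanity checking, assuming a made-up binary format with a 4-byte magic number and a declared payload size (the format and names are purely illustrative):

    import os
    import struct

    MAGIC = b"WDGT"  # hypothetical 4-byte magic number for our made-up format

    def load_widget_file(path):
        """Validate a file before processing it, no matter where it came from."""
        size_on_disk = os.path.getsize(path)
        with open(path, "rb") as f:
            header = f.read(8)
            if len(header) < 8 or header[:4] != MAGIC:
                raise ValueError("not a widget file")
            # declared payload size, stored as a 32-bit little-endian integer
            declared = struct.unpack("<I", header[4:8])[0]
            if declared != size_on_disk - 8:
                raise ValueError("size mismatch: file may be truncated or tampered with")
            return f.read(declared)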


Exactly. All the concerns of networking, the Internet, and the Web go away. It becomes an input validation problem. From there, one can design a format suitable for correct-by-construction auto-generation of the parser. Or take the common route of building a Turing machine into it to fight with over time. ;)

Still not writing or patching a networked app.


OT: Have you ever read Engineering a Safer World [0] by Nancy Leveson [1]? Seems like something that'd be of interest to you. Started on it this week, my sister is in her class this semester at MIT. I'll be tackling it over the next few weeks, and then her previous text (Safeware) after the new year.

[0] https://mitpress.mit.edu/books/engineering-safer-world

[1] http://sunnyday.mit.edu/


Wow. Been studying high-assurance systems close to 10 years and just hearing about her. That's how scattered that field can be. Let's look at this.

"Started new area of research: software safety." That was Bob Barton in Burroughs B5000, Dijkstra on THE, and Margaret Hamilton on Apollo code. Maybe they mean first dept at MIT or just making status of sub-field more official. Then TCAS II. I recall reading that long ago as an exemplary work in formal specification & safety analysis but project was too heavy for me. Article says them too haha. Props to her for it & others. Article shares my view on scattered groups & methods. At least seen STAMP referenced once but unfamiliar with it. They wrote against N-version programming being re-invented... which I proposed for subversion resistance. Hmmm. I'm sure my variant is the one that works this time. ;) Also did SpecTRM at their company that looks a lot like state machine and modeling schemes I saw elsewhere in high-assurance. Not claiming a copy rather than inspiration or independent invention + convergence of multiple parties. Usually means a good idea.

Very interesting person. Thanks for the tip. Your sister is going to learn some wise things for sure given they've got sane methods and got results before. I especially liked how the article jokes about writing what she knew on high-assurance development then gotten wiser or more confused. I know the feeling where I'm redoing the foundations now with what a decade taught me. More slowly this time given I have more doubt than certainty.

Note: Just got to the last part. Wait, she was the one who wrote the THERAC paper? I just assumed it was some guy (male-dominated field) named Leveson since that name was all I saw in references to the report. Never saw it again. So, she wrote up an investigation we've been citing about software safety for decades, helped spearhead efforts to legitimize it as a field, did huge projects, and I basically never hear about her. Unreal. I'm bookmarking her stuff to go through it later.


Right, I meant that in the context of things that tests are designed to catch. E.g. almost no one writes tests to look for unsafe eval or whatever.

A good example is that I care a lot about making sure our search endpoint doesn't return private user data. But beyond that, I'd rather just know that the endpoint returns a 200 and let someone tell us if it's broken rather than have an extra three hundred lines of code to see if it's returning the correct results. If we get a ton of users then that will probably change, but for now the cost of writing and maintaining those extra tests wouldn't be worth the benefit.
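
To make that concrete, here's a rough pytest sketch of the split I mean, against a hypothetical Flask-style app (the endpoint and field names are made up): the privacy property gets real assertions, everything else is just a smoke test.

    from myapp import app  # hypothetical application module with a Flask-style test client

    def test_search_smoke():
        # All we check here is that the endpoint answers at all.
        resp = app.test_client().get("/search?q=widgets")
        assert resp.status_code == 200

    def test_search_never_leaks_private_data():
        # The property we actually care about: no private user fields in results.
        resp = app.test_client().get("/search?q=widgets")
        for result in resp.get_json()["results"]:
            assert "email" not in result
            assert "password_hash" not in result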


> The only situation where privacy and security don't matter a whole lot is if your code runs on airgapped devices with very limited tasks.

That used to be true for the software in cars, but it no longer is. The problems that result are not the fault of the original authors; that belongs to the people who decided to bridge the airgap without thinking through the consequences.


Most client-side web app code doesn't have much bearing on privacy/security, aside from avoiding XSS and a few other pitfalls.


> We write tests to be able to change it in the future

Which works great as long as your changes are shallow. If your changes aren't shallow then you have to change your tests as well and that defeats the purpose.

Automated testing is good for freezing an interface and its behavior, allowing you to change the implementation while maintaining the same outputs. Lots of technology requires this: networked API endpoints, libraries, etc.

But most of the change I encounter comes from changes in requirements that necessarily require reworking and rearranging code in ways that won't be compatible with the test suite.


Different types of tests. You can have tests of the whole system which should always pass. Then there's unit tests, which may have to be removed/updated (though again, this depends on your scale of units). Some you expect to fail after a change to requirements get incorporated. Others you expect to pass, and you react accordingly.


If the code is designed following the single responsibility principle, and tests are written to test behavior rather than implementation details, then each requirements change should only affect the tests which specifically test for the behavior which changes.
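
As a toy Python illustration (the domain is made up): the test pins the required behavior, so if the tax requirement changes from 7% to 8%, this should be the only test that needs to change.

    def total_with_tax(subtotal, rate=0.07):
        """Current requirement: a 7% tax is added to every order."""
        return round(subtotal * (1 + rate), 2)

    def test_total_includes_seven_percent_tax():
        # Asserts on the observable result, not on how the calculation is wired up.
        assert total_with_tax(100.00) == 107.00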


That's still a very narrow view of changes; the belief that you can isolate every part, and that requirements changes would then only affect those isolated parts, is pure fantasy.

I've had to completely redesign entire subsystems; break down components into different pieces; move code between different layers; etc.

Honestly, as a manager nothing bothers me more than developers who try to patch complex changes into existing systems without rethinking how they affect everything else. I have to refactor a project that's a total mess because code was added but never removed or changed over the course of some very big requirements changes. I'm sure all the tests pass, but it's impossible to follow now.

The problem with unit tests is that they add an extra layer of friction to making changes that benefit the product. You are actively discouraged from changing your design from your initial assumptions! This change friction can be seen as a benefit if you need all your interfaces to be stable (like with a library). But it's a trade-off, and it's not appropriate everywhere.


Yeah, you can have bad architecture with or without tests; having tests does not mean you don't have to think about design! I disagree that tests lock you down, though - it really depends on how they are written.


You can't have unit tests that don't lock you down to a particular structure of units. That's really the definition of unit testing.

I'm not saying anything about bad architecture. You have a good architecture for today's requirements that is a bad architecture for next year's requirements. Tests lock you down to whatever your first architecture is.


I tend to avoid the term unit test since there is some disagreement about what "unit" means. Some think it means testing everything down to the level of individual private methods. This will indeed lock you down. And if it's paired with extensive "mocking", where you only ever test that method A calls method B when invoked, then yes, you have locked yourself down and you won't get a lot of value from the tests.

I prefer automated tests of larger units or subsystems, testing against requirements, protocols and specifications rather than implementation details that are likely to change. Of course anything will change over time, but I believe this kind of test provides higher value and lives longer.


There are a couple of reasons why you should/ could write tests:

- for catching regressions in the future

- to verify that your code works. How else would you know that your backend service reacts properly to some edge condition?

- as documentation

- as a kind of quality mark, for instance to be able to pass a code review or when writing code for an external party

Unfortunately the last reason is also the most useless, and yet it is the one that seems to be the main motivation for many big-enterprise developers.


>> - as documentation

Thank you for including this - it's underappreciated. So many times recently I've needed to write code against some poorly documented API (not always due to lack of effort, some things are hard to document well in prose / JavaDocs, or just due to constant change), but I took one look at the unit tests and it all made sense - and I knew it was up to date for the latest work.


Tests are great for dyslexic colleagues as well. A good clear concise unit test is often easier to grasp than the best of API documentation (e.g., JavaDoc in Java).


Good point - I'd bet it applies to autism and such too.


I think it applies to everybody. I think it was Terence Tao who had a blog post claiming (and showing) that teaching by example first is the most effective.


I know this is not formally an ordered list, but it looks cart-before-the-horse-ish to have catching future regressions before verifying that it works in the first place. (This comment is effectively a reply to the OC's claim that we don't test to verify.)


The quote is not anti-test. You should read the article.


And we write tests to exercise our own assumptions about how the system will be used. Way too often, when I write some tests for my code, I quickly arrive at use cases and failure cases that I hadn't anticipated while writing for the optimal use case.

Reality, on the other hand, is usually suboptimal.


Yes, but if you follow that too strongly you end up having to rework tests _every_ time you change something, which is absurd.

You should be testing input/output and results as opposed to testing how the internal gubbins works. That's the line we have to tread carefully when writing a test. The test shouldn't force the code to behave in exactly the way it expects internally; it should just check that the I/O is correct.
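
Here's a tiny Python sketch of that line (names are illustrative): the first test only checks input and output, so it survives refactoring; the second asserts on the internal gubbins and breaks the moment you inline or rename the helper, even though behaviour is unchanged.

    from unittest import mock

    def _strip_whitespace(s):   # private helper: an implementation detail
        return s.strip()

    def normalize(s):           # the behaviour callers actually rely on
        return _strip_whitespace(s).lower()

    def test_io_only():
        # Survives any internal refactor: only input and output are checked.
        assert normalize("  Hello World ") == "hello world"

    def test_internal_gubbins():
        # Brittle: pins the call to a private helper rather than the result.
        with mock.patch(__name__ + "._strip_whitespace", return_value="Hello World") as helper:
            normalize("  Hello World ")
            helper.assert_called_once()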


this is essentially the argument for functional tests over unit tests and while I generally agree, I think a mixture of the two is important.

Unit tests should be used for extremely small and isolated mission critical objects while functional tests should generally cover the entirety of the I/O chain. That's how I do it at least and it works extremely well for a fraction of the cost!


I've never seen any value in unit tests. They never fail. What's the point?


They're useful if the code changes, or if the code doesn't change but the runtime/compiler under it does and breaks it. This can definitely happen in large projects.

If your project breaks because of local changes I think regression tests with real data and bisecting is better and less work though.


There are a few scenarios:

1. Your unit tests fail basically every time anything changes. This is the scenario where your unit test is something like "the command line arguments are -abcd" and every time you add one you need to change the test. This makes the unit test worse than useless: it's actually a source of extra work every time you change something.

2. Your unit test never fails. It just doesn't fail ever, at all, under any circumstance. It's so obvious that it should work, but someone wrote that test anyway. It's a waste to run it every time.

3. Your unit test fails when you refactor because it tested some internal functionality. You need to throw away your unit test every time you refactor. It's a waste to write one every time.

The only tests that ever show that a refactor broke something are integration tests. The 200+ unit tests in my project NEVER fail - except for that one you have to keep changing every time.


How do you test error conditions? In my experience, you typically mock hardly anything (ideally nothing) in functional tests, and some error cases require mocking. I find unit tests helpful in this arena.


I mock things all the time with functional tests. It makes it easier to reliably test your code's response to unusual conditions (e.g. error conditions) and it eliminates a source of brittleness in the test (you can write functional tests that hit the real paypal API and run them every day but every 2nd Friday they will fail because paypal is shit at keeping their servers up).

This is in no way a benefit unique to unit tests.
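
For what it's worth, here's a minimal Python/pytest sketch of that kind of test, with a made-up payment gateway client rather than any real PayPal SDK:

    from unittest import mock
    import pytest

    class PaymentError(Exception):
        pass

    def charge_customer(gateway, customer_id, amount_cents):
        """Code under test: translate gateway failures into our own error type."""
        resp = gateway.charge(customer_id, amount_cents)
        if resp.get("status") != "ok":
            raise PaymentError(resp.get("reason", "unknown gateway failure"))
        return resp["transaction_id"]

    def test_gateway_outage_is_reported_cleanly():
        # Simulate the gateway being down instead of hitting the real service,
        # so the test stays reliable even when the third party isn't.
        gateway = mock.Mock()
        gateway.charge.return_value = {"status": "error", "reason": "service unavailable"}
        with pytest.raises(PaymentError, match="service unavailable"):
            charge_customer(gateway, customer_id=42, amount_cents=1999)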


> yes but if you follow that too strongly you end up having to rework tests _every_ time you change something which is absurd.

It depends on how brittle your tests are. You can write tests that make sure internal stuff works at a unit level without them being that brittle.

To your point, testing I/O or behavior is the way to accomplish this, but it can still be done at the level of internal functions/methods.


This is why I generally prefer demos and saved REPL sessions to tests. I really care about avoiding regression, not about whether or not a certain function returns or errors on certain values. It's all about not getting lost in the weeds.


'Code that works' doesn't necessarily mean 'code that only works today'. I'm pretty sure maintainability is implied in his statement.


If maintainability is part of "works" and tests are a necessity for maintainability (they are), then the quote becomes pretty weird. Substituting [code and tests] for [code that works] gives

> I get paid for [code and tests] not for tests.


The discussion was around what the right level of testing is.

> I get paid for [code that works and is maintainable], not [more tests than are strictly necessary to achieve that goal]


Yeah, that's the reasonable interpretation (which is also backed up by the full article). As usual the quote is taken out of context and/or put into a headline that's deliberately much more inflammatory than the actual article.


> I'm pretty sure maintainability is implied in his statement.

I don't know this guy, so I can't speak for him in particular. But, in general, I wouldn't be surprised if the opposite was true: too many so-called software developers give "shipping" too much importance, leaving none for any other aspect of the job. Shipping is a feature, not the feature.


Kent Beck brought developer testing to prominence in the early 2000's with his books on XP and particularly TDD.

Back in those days, there was a backlash against "big design up front", and very little respect (in general) for testing as a practice. Unit testing in its modern form was reasonably rare.

After this Agile/TDD stuff caught on, many people ended up over-testing things. This is a pretty typical thing to do when you're learning about how much testing is sufficient. I've definitely done a good amount of this myself.

It can take a good deal of experience to know where to draw the testing lines in particular contexts. I think this blog post points at this specifically - that we should write high-value tests, and just enough of them. We also use feedback over the long-term to have heuristics of where we tend to have recurrent issues, so we can test a bit more in those areas.

Far from being focussed on "only shipping", he's underlining the fact that "just enough well-written tests" support working software - and that should be our focus, instead of thinking our job is to "write more tests" (or focus on test coverage, etc).


If by "I don't know this guy" you mean Kent Beck then you should google him[1].

This is not some random guy, he is like the "father" of TDD. And that is why the guy that wrote the article thought it noteworthy to mention his quote.

If it came from someone else it would not be that important to make such a fuss about it. But when it comes from Kent Beck it is worthy of at least some discussion.

[1] https://en.wikipedia.org/wiki/Kent_Beck


> I don't know this guy, so I can't speak for him in particular.

Literally the second sentence in the article.

> Kent Beck, respected authority, creator of Extreme Programming, TDD and writter of several great reference books, mainly at the great Addison-Wesley edition


Removing implicit side-effects and global state from code and having strong, static types can go very far in allowing this, without much overhead.

Unfortunately, there isn't a really great language that enforces working like this, while being simple enough to push onto a big team :/ (if there is, please let me know)
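
To illustrate the first point with a toy Python example (type hints standing in, loosely, for the strong static types): the "before" version depends on and mutates hidden global state; the "after" version declares everything it needs and produces in its signature, which is what makes it safe to change and trivial to test.

    # Before: implicit global state plus a hidden side effect on every call.
    _exchange_rate = 1.08

    def convert(amount):
        global _exchange_rate
        _exchange_rate *= 1.0001   # surprise: calling this quietly mutates shared state
        return amount * _exchange_rate

    # After: every dependency and result is explicit in the signature.
    def convert_pure(amount: float, exchange_rate: float) -> float:
        return amount * exchange_rate

    # Callers (and tests) now see and control every input explicitly.
    assert abs(convert_pure(100.0, 1.08) - 108.0) < 1e-9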


F# is probably a pretty decent candidate for this (it's next on my list of languages to really take a critical look/attempt at).

I've done a lot less statically-typed code than dynamic (mostly Python) in the last few years, but I was playing with Unity3d recently and had a chance to write a good deal of C# with Visual Studio.

I hit a hairy problem at some point and had to do a big refactor to support a new feature. I deleted a single line of code, followed an error trail for 10 minutes, and suddenly everything was just done.

It was a pretty interesting moment for me. I realized that statically-typed languages can really have the potential to be as or more productive than dynamically typed languages paired with a good enough IDE. (And Visual Studio with C# is about the best pairing you can get).

I've had to reconsider my thoughts on these things a bit. Sure, there are things that statically typed languages can make much harder to test or work with. (Want to stub an external provider? Okay, you're going to need an extra interface, then you'll need to create a new stub version that implements that. Want to read/write pretty-arbitrary JSON? Good luck). But there are other places where you get huge wins by bugs just disappearing by the boatload.

I'm still not on board with heavy OO/inheritance, and love the pattern of simpler struct-style constructs with just functions in functional programming, but the static typing can give a lot of wins.

I think something that gives an inherent advantage to OO languages in IDEs is that SomeThing.<tab autocomplete> makes a lot of sense and is easy to compute! I can take an object and know what I can do with it at a glance. I haven't seen a functional language with enough structure to support that simple feature yet (though maybe I'm just not looking hard enough). This is really where statically typed languages can make the most of an IDE. For some reason, that's the big thing I think of when I'm thinking about the downsides of functional programs I'm working with. The editors just seem to help a lot less (though I haven't written any FP professionally-speaking, so have less experience in general with tooling).

F# using Records with member methods looks like it might be able to get that sort of benefit though, I'll need to try that. It looks like they're just pure functions declared on immutable structs, which I think is the perfect middle ground.

A lot of object-oriented languages have taken tips from functional languages lately (map/reduce/filter is the new hottie), but I think there's a lot of benefit still to get in the opposite direction.


> I think something that gives an inherent advantage to OO languages in IDEs is that SomeThing.<tab autocomplete> makes a lot of sense and is easy to compute!

F# (as well as OCaml) offers something similar in that you'll use a lot of functions that are within modules with the same name as the type you're working with. So you can write "List." and get a list of functions (map, reduce, etc.). I'd prefer something like Idris which will disambiguate functions based on the relevant types, but at least it makes IDE support easier.

> A lot of object-oriented languages have taken tips from functional languages lately (map/reduce/filter is the new hottie), but I think there's a lot of benefit still to get in the opposite direction.

Something in particular I wish F# would add is general non-linearity of definitions. All files and definitions in F# must be strictly ordered (either type A can reference type B or vice versa, but not both) except for specific, contiguous blocks. It presents a challenge for type inference, but I think just punting it back to you for the tricky cases would be fine (and it often has to do this anyway).


I like having the linearity of definitions and files. It makes reading unfamiliar code bases much easier, as it means the code has a "beginning" and "end." To me it fits well with the F# theme of sensible defaults.

If you do need to get around it though, you can have mutually referential types in F# if you use the "and" keyword (although the definitions of the types have to be right next to one another). And in the next update to F# you'll be able to have mutually referential types and modules within the same file which is often good enough for most other things you might need that sort of thing for.[1]

[1] https://blogs.msdn.microsoft.com/dotnet/2016/07/25/a-peek-in...


I think the linearity can force you to streamline things, but personally I often have to resist the urge to bike-shed the order in which I write functions in the same file, let alone what order I want to put my files in. I end up being torn between different orders I would want in different situations, and would love a language or IDE that would let me view definitions in different orders depending on what I'm looking for.


I suspected as much, thanks for confirming! F# is definitely the plan for this weekend, then :)


I don't find dynamically-typed languages productive at all once the program gets past about 400 lines and/or you figure out what you need to do.

The break-and-follow-the-errors approach is very powerful. It's usually not hard to find the exact break that will show you all the things you need to change, and then just work through them. My record is 5 days without buildable code, working in C++; once I'd worked through all of the errors, the program worked, and without any non-obvious problems.

I miss this a lot when working in a dynamically typed language.

(Thing by Jonathan Blow that touches on this: https://web.archive.org/web/20140929232443/http://lerp.org/n...)


> Want to read/write pretty-arbitrary JSON? Good luck

Parsing JSON in Java was one of my worst programming experiences, so I have to agree with you here.

But in Rust, using the `rustc-serialize` library (and Serde, but I haven't tried that yet), parsing and writing JSON is really pretty painless. The really nice part is that you can declare the structure of your JSON data as a completely normal Rust struct (just with a derive annotation that makes it Encodable and/or Decodable), and with a single function call turn an instance of that struct into a JSON string. And in reverse, you can just parse() a string and it will return either an instance of your struct or an informative error if the JSON is malformed or doesn't match your structure. Makes JSON really easy to work with.


Jackson (in Java) gives you pretty much the same experience, albeit I'd recommend a few annotations on your object. And you can read/write arbitrary JSON into JsonNode or Map<> objects.


I think 'arbitrary' was actually the key word. I agree that serializing/deserializing structs is pretty painless, the issue is when you're not sure what you're getting.

In golang recently I had to take some json (that I only knew part of the structure of), and modify just that small subpart of it without touching the rest. It was a really painful thing to develop, and the code ended up very messy.

There were a few golang libs for reading arbitrary json, but none supported writing to it that I could find.


> Want to read/write pretty-arbitrary JSON? Good luck)

I'm surprised this myth persists.

Reading arbitrary anything is trivial in a statically typed language: use a hash map.

There. You're merely emulating what a dynamically typed language gives you, of course, but it's trivial. And at least, statically typed languages give you the choice: you can be dynamic or static. You don't have such a choice when you don't have types.
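For instance, a quick sketch in Rust with serde_json (the document contents here are invented); the same idea works in any language that gives you a map-of-variants JSON value:

    extern crate serde_json;

    use serde_json::Value;

    fn main() {
        let raw = r#"{"known": {"retries": 3}, "extra": [1, "two", null]}"#;

        // Parse into a dynamic tree: objects become maps, arrays become vectors.
        let mut doc: Value = serde_json::from_str(raw).unwrap();

        // Read whatever happens to be there, checking the type as you go.
        println!("retries = {:?}", doc["known"]["retries"].as_u64());

        // Modify just one sub-part without touching the rest of the document.
        doc["known"]["retries"] = Value::from(4);

        // Everything you never looked at ("extra") round-trips untouched.
        println!("{}", serde_json::to_string(&doc).unwrap());
    }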


> I realized that statically-typed languages can really have the potential to be as or more productive than dynamically typed languages paired with a good enough IDE.

Indeed, and C# isn't even a particularly safe typed language. It still has pervasive null, for instance. When you get into F#/OCaml/Haskell/Rust-type languages, it's a real eye opener.

> Want to read/write pretty-arbitrary JSON? Good luck

Not sure I see the problem. Just deserialize JSON into a JsonValue which provides dictionary semantics like JavaScript.


I agree the property reference tab autocomplete is a pretty powerful feature. It's like autocomplete in a shell, but contextual, since the system already knows what you want the first argument to be. And I've seen a video with one of the big functional guys (maybe Brian Beckman?) pining for that feature.

I'd point out, though, that it's still only part of the picture. You're holding on to a value and you want to know what you can do with it:

    widget.<tab>
will show you everything of the form

    widget.someData
    widget.doSomething()
but you're still missing out on other structures like

    freeFunction(widget)
    handler.handleWidget(widget)
    hammer + widget
    widget[part]

    // returns a widget, does this one count?
    widgetFactory.buildWidget()
In nearly any language the first two will be common, and where available the others are critical usages, too. I want to be able to tab complete them!

It's hard to continue hewing to the tab as activation with these other structures, which may be why IDEs and REPLs don't really try.


That sounds like something that would be an awesome demo, but not really useful.

I actually dislike tab complete in its usual forms most of the time. Auto import is nice, but I feel that autocomplete is a form of searching the code base, and when I'm coding seems like a poor time to be searching for the answer.

When debugging, however, jump-to-symbol and quickly listing alternative methods help. And sometimes I am just searching. So, good feature. Just not something I want to rely on.


In large code bases it can be really hard to keep everything in your head, and that's when a little tab complete really comes in handy. Even just for recalling a specific method name when you know there's one you want to use, it can save a lot of time.


It can help. Often does. But I treat it like spell check. If I need it, I'm probably using the wrong word. And I don't know how to use it to effectively search for words.


In terms of web languages, Elm is perfect for this purpose. It's a statically typed, pure functional programming language. It's impossible for functions to generate side effects. Side effects must be captured by the type system as external "Commands". Also, the type system is inspired by Haskell but far more basic. To hardcore functional programmers this is a negative, but Elm is really easy to teach to JavaScript programmers as a result.


That's one reason to write tests.

Another is that some code reviewer asks for more tests. A third reason to spend time on tests is that they're required to maintain the same standard as the shipped code, even though they're only run in the presence of the developers, and their breaking only affects the developers, not any customers.

People forget the ultimate reason for our work oh so often.


I don't have that kind of hubris. I value reliability, repeatability, and consistency. And so do my customers, judging by the issue reports we get. They're most upset when something goes wrong.

I write tests, many tests, when I'm working in a dynamically typed programming language. I write tests even when I'm working in a soundly typed language. The only difference is that in soundly typed languages the type system guarantees many properties for me so I don't test for those.

Personally I like to write tests first but I don't believe that gives me any productivity benefits. It's just the way I think.

> People forget the ultimate reason for our work oh so often.

Tests are important because reliable code is important. The customers are important but so is the business. It costs quite a lot of money to support error-prone, poorly designed software. Tests aren't a silver bullet but they are a tool to alleviate the problem.


I've heard words much like those many, many times, but always in general or specific to a context where it didn't apply.

For example when I changed some code for which tests didn't exist, so I tested what I changed and wrote some extra tests while I was at it, and it was blocked in code review because my extra tests didn't report failure in any detail. What I did said "x failed" if a test failed, no details. The reviewer said much the same as you did now to justify that additional reporting was absolutely required.

It's a fine sentiment when it actually applies, and I wish it weren't applied quite so often to justify YAGNI and other rubbish.


Can you say specifically what you object to in his words above? I'm failing to connect what he said with your complaint.

The advice you received with regards to defect locality sounds reasonable - tests that don't give much in the way of isolation can cost a lot of developer time to home in on the actual problem. It's hard to say without knowing the exact details, however.

I also find it hard to reconcile the idea that "tests are an important tool for writing good software" with "justifying YAGNI". How would an "openness for developer testing" justify an attitude of "not writing things you don't need"? Those two concepts sound almost entirely orthogonal.


Unit tests are important to that goal. But that doesn't make every aspect of every unit test important as such.

Specifically, if a test passes right away, then its error reporting isn't important today. It probably comes in useful if the test ever breaks, but will the test ever break? Therefore, spending significant time on the error reporting today is YAGNI, even if a minimal version of the test is useful.

My complaint is that even if unit tests are useful to a degree, people trot out the reasons for usefulness primarily when those reasons do NOT apply.


I cannot disagree more. If you are bothering to write a test in the first place, do it right. There is nothing worse as a developer than receiving a ticket or a user report saying "It broke; fix it" with no other information.

Not to mention it will literally take you 2 minutes to add the better reporting.


Ok, I see what you're getting at there.

So if it takes you significant time to get proper defect locality, I think you should see if there's a better way to approach the problem. This should be essentially the default for typical/modern unit tests. Perhaps you're writing tests that are more like system level tests?

I'd also say that if you're essentially certain a test will never break, then (other than for documentation purposes) why are you writing it? To paraphrase Kent Beck - we should only test things that could possibly break.

You might be overgeneralizing what "people" say about unit tests - I'm not sure what your specific scenario is, but there are a whole spectrum of opinions on the subject. Perhaps this is just an organizational code-smell of the place you're mentioning.

Dogmatism/cargo culting in general can be annoying however.


>What I did said "x failed" if a test failed, no details.

Yuck. Imagine if you got a bug report that just said "X failed" with no details.


The first part of your comment sounds really odd to me. I definitely write tests to check if my code works, among other things.

Why do you think that is the wrong approach?


It's not that it's wrong. Having confidence your code works is ultimately the goal. That said, if I just need to know the code works, manual rather than automated testing works just as well, and requires less effort.

Automated testing only shows its real value when you go back to change code that was working before. With the manual approach you'd have to retest everything to have any real confidence. With the automated approach you just run a command.

I'm a big fan of automated testing, but if I didn't expect to have to ever change code, I wouldn't bother with it.


"Automated testing only shows its real value when you go back to change code that was working before"

end to end tests, integration tests, regression tests, etc, yeah.

Unit tests though...usually no. Often if you're making any kind of significant change, the entire code paths may get refactored away or change too much and the test will get nuked anyway.

And it's that kind of test that usually confuses people, so it's worth understanding.

A unit test's goals are many:

It proves at authoring time that the code works. It saves you the time, right away, of having to go through the UI or spin up a server just to validate that a function is working. It proves that you thought about specific edge cases (and if you do testing consistently, the lack of a test is your evidence of unconsidered edge cases). It's documentation of all of the things you considered when writing the code. It is an example of how to use the code with all its use cases.

And when you nuke a piece of code away, the failing tests are now a guide of all the cases you have to make sure are truly no longer necessary.

If, in the future, you do a refactor of an implementation detail (so the existing tests are still valid), then that's a bonus, as you get green/red validation. But in practice, that is less common than all of the other reasons for testing. That's why the "I don't expect this to change" thing isn't really a reason for or against writing tests.


>Unit tests though...usually no. Often if you're making any kind of significant change, the entire code paths may get refactored away or change too much and the test will get nuked anyway.

That pretty accurately describes my experience with unit tests. I've been part of several projects where we had pretty comprehensive unit tests (I'd say small to medium sized projects) and I never managed to get as much use out of unit tests as I liked. After one or two big refactors most of the tests needed to be, as you said, nuked anyway. While seeing all the pretty green lights is reassuring, they are rarely working when you most need them - during large refactors that blow away big sections of code.

I picture unit tests as a row of black boxes sitting on a table in certain positions. Unit tests are great when you don't move the black boxes but do change the mysterious processes running inside them. But refactoring is rarely that isolated in programming, since you tend to move some of the black boxes around, remove some entirely, add some new ones, and change the contents. To then expect the unit tests to give you back useful information on what's broken is rarely possible.

I've had more success with e2e testing using things like Selenium, but it's still frustrating as a developer to read articles about how great unit testing is (like this root comment) and never be able to actually get a decent working version of it in your projects (because of the reasons mentioned by this parent comment).


The biggest benefit I've been seeing from unit tests is in writing more of them inside a code base sorely lacking them. In that case there's a big side-effect of making the underlying code unit-testable (that is, if you don't totally cheat every time with something like PowerMock). Some tests may get blown away in refactoring, but testing the new code has the same benefit. In many cases this makes the implementation uglier but doesn't hurt its clarity much (oftentimes it can improve it), especially with IDE assistance.

Building up small oases of dumber/more verbose code that has unit tests seems like the best way to wrangle legacy code into something that anyone else on the team can understand and not mess up 6 months from now when it's their turn to have to touch it for the first time. Of course no one wants to touch that 200-line, 10-levels-of-indentation monster method, but bringing a little bit more of it under unit test over time will help a lot. Every other benefit of unit testing that you listed besides the time saving aspect (e.g. why launch a big e2e test if you can test the same thing in an xunit-like context? even if it's not strictly a "unit" test) pales in comparison to the benefit of making crap code nicer to work with. A corollary is that if your code is already nice to work with, and you have a system to keep it that way, unit tests won't be very valuable. (Though other tests, which may or may not be in an xunit-like context, may still be quite valuable.)


Just being the devil's advocate....

> With the manual approach you'd have to retest everything to have any real confidence.

This kind of implies that software development before "~tdd" was a complete disorganized gong show of quality, especially where refactoring was concerned, but in fact that was not the case. There are ways of coding that are more conducive to quality than others.

> With the automated approach you just run a command.

Once you've written all that code, yes.

> if I didn't expect to have to ever change code, I wouldn't bother with it.

Usually you don't, in which case the extra effort on testing is wasted, usually.


> This kind of implies that software development before "~tdd" was a complete disorganized gong show of quality, especially where refactoring was concerned, but in fact that was not the case.

The question isn't "do we need to test." Testing can just take the form of running the code manually and making sure the output makes sense, but you do need to test.

The question is "do we test automatically or manually." It's the same question regardless of how good your practices are. Note that this is totally distinct from the question of whether or not to use TDD.

> Once you've written all that code, yes.

I've found that writing tests often doesn't take much longer than testing manually, and rarely takes longer than testing manually twice. Sure, if you obsessively try to test every possible case and input, you'll waste time, but well targeted testing doesn't have to be slow.

> Usually you don't, in which case the extra effort on testing is wasted, usually.

For non-trivial projects I have an average number of revisions per line of code much closer to 2 than to 1. Sure, some of the code only gets written once and never touched, but other code gets revised multiple times. If you're good at writing tests, the tests will focus on those often revised lines of code.

And, again, you have to compare the effort against the effort of manual testing, not against the effort of writing code you've never run and shipping it.


You usually don't have to make changes to code once you've written it? The only time I ever find that true is when the code is throwaway (ad hoc data crunching for some purpose, scripts to automate some one-time tedious process, practice projects intended just to learn a system, etc). I don't think I've ever written anything that was useful for more than, say, a week that didn't get changed later by myself or someone else.

The typical claim is that code maintenance takes at least 10x as long as the initial write.


To me there are two automated testing philosophies:

1. Write tests to prove your code works, which is sometimes referred to as "test-driven development"

2. Write tests to catch any side effects or regressions when altering code

Funny thing is that if you write good tests, then the results are pretty much the same regardless of your motivation.


You have to:

1. Have an overall idea of what your software will do before writing the first line of code.

2. Challenge and change any touched assumption from #1 during development when you refine that idea.

3. Test that the program satisfies your refined idea after it's written.

4. Create some assurance you'll keep #3 correct while you write any further code later.

However you fulfill those needs, if you got them all, you are good.


I spent some time working for one of the biggest names in trading technology and one of the things I learned is that if your code doesn't work on more or less the first try, your design is probably wrong. The implications of that lesson are profound.

When something doesn't work as expected, I now check my design and not the code.


> if your code doesn't work on more or less the first try, your design is probably wrong

Or you're tired. Or you're just not as focused as you could be. Or you have a deadline. Or you're trying something new. Or you're not fluent in the language yet. Or you're fluent in the language but not the framework. Or you're fluent in the language and framework but not the design pattern (if you use those).

Or a whole host of other things.

I'm not saying spot-checking the design isn't a good idea, but saying that it's the design more often than the code just doesn't match up with my experience.


The entire idea is to increase the probability that you are correct. So if you're coding tired, you're decreasing that probability. If you're using a framework you're not familiar with, you're decreasing that probability. This is why I mentioned that it was in trading technology: they only get paid to be right.


How do you define "the first try"? Surely you're bound to have various trivial issues?


Something along the lines of after fixing obvious stupid things like those trivial issues.


I remember once my code worked on the first try. I found it difficult to believe.


One time in uni, I had no access to a computer and wrote an entire b-tree implementation on paper for a course, typed it in and it worked.

Based on a true story


In principle TDD is the opposite of this.

Out of curiosity, did you use any debugging on paper for your code?


I can't recall, honestly


I've found it to happen very often the past few years


Did this comment really deserve a downvote? Upvoting accordingly!


Maybe not, but this one certainly does.


> Why do you think that is the wrong approach?

"Because you should write code that works in the first place!" -- your boss


Absolutely. The article itself (it's a short one) talks about doing just enough tests to have confidence in a product, and I think that's fantastic for early phase development.

But when changes can come from anywhere (lots of developers potentially making changes) and the need for changes can be urgent, there's a great deal of value in complete code coverage.

Not every developer thinks the same or has the same tendency for errors. I would caution people against thinking about tests as personal verification. With any luck, you'll be handing off that code base to someone else in time.


When I start thinking of some of my code as safety equipment, and some like shop tools, a lot of my decision processes become more straightforward.

My tests and practices are, occasionally, like a five point harness in a race car. Without it anchoring the driver in front of the wheel, they could never ever drive that fast with any safety at all. This one comes out as a counter to the 'straightjacket' argument some people like when the tools won't let them just write shit code that the rest of us have to babysit while they run off somewhere else to write more shit code. Which we are supposed to be grateful for... I digress.

But most of the time my tests are more like smoke detectors. The ones that go off for no reason get replaced or removed. The ones that go off after I can see fire and smell smoke? Why am I paying for upkeep on those exactly? What's the value?


I have seen codebases where the developers diligently ensured that every getter and setter method had a test case. They had great test coverage, but not particularly reliable outputs.

OTOH, projects with excellent end-to-end tests tend to have lower coverage but better regression management.

I think this is the point of the article?


I think that this is one of the points.

I think that full test coverage is neither sufficient nor necessary for working software. What matters is sufficient coverage of the edge cases.

I think that an additional point to keep in mind is that Kent Beck was most probably speaking from the point of view of statically typed languages.


That's not a bad perspective, but the counterpoint I have seen several times is that the tests tend to take on a split quality: some people will insist that nothing can be changed if a test has to be changed along with it, and then the other camp steps in and doesn't want to put in the effort for new tests, so the existing ones get removed.

Testing makes a lot of sense if you're doing data munging, but for front end or other kinds of code that are almost defined by their ability to create side effects, it's usually a waste of time. In my humble opinion.


Technically, we write tests to document what the code does. We could use any language, including english, to do that. However, using a programming language carries the side benefit of being able to allow automatic verification that the documentation is at least in sync with the code (although not necessarily accurate with respect to the project goals).


And then who or what documents the tests? Test code is code. That's important and all too often that fact is sort of elided.

"We" write comments and use "plain" language documentation, or formatted but otherwise plain language for documentation generators to parse to document our code, and that includes "what the code does."

I'd argue that if your test code is what you're relying on to "document what the code does" then you are probably (almost certainly) over-testing, testing the wrong things, or some combination. Oh, and also using test code incorrectly (as documentation).


This is entirely subjective as it comes down to Kent's sense of what needs to be tested, which is arguably a better 'sense' than that of many other programmers.

Do you allow your juniors to use their sense? No, you make them hit 100% coverage until they start to learn.

As Picasso supposedly said: "Learn the rules like a pro, so you can break them like an artist."


"But we don't write tests to check if our code works."

If the input data is complex enough to drive a non-trivial branching logic in our code - yes we do. Well, at least I do. The bonus is that it provides the fixed point for further changes at the same time.


We write tests to be able to change it in the future with certain degree of confidence that we don't break anything and - if so - what exactly

You don't want tests for that - you want a type system and a static analyzer.


A type system and static analysis certainly allow you to stop writing a certain category of test, as you no longer need to assert that a method being passed an object of the wrong type copes with that. It doesn't completely relieve you of the need to test, though, as a type system won't assert that the code you're writing actually fulfils the business's requirement that nothing can be ordered for same-day shipping after 11am.


You absolutely can do that with the type system, use a constrained subtype.
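Roughly, the idea looks like this (a sketch in Rust; the type name is invented and the 11am cutoff is just the example from the comment above): make the invalid state impossible to construct, so the rule only has to be checked, and tested, in one place.

    // Hypothetical: an order time that can only exist if it satisfies the
    // "no same-day shipping after 11am" rule.
    pub struct SameDayOrderTime {
        hour: u8, // 0-23, already validated by the constructor
    }

    impl SameDayOrderTime {
        pub fn new(hour: u8) -> Result<SameDayOrderTime, String> {
            if hour < 11 {
                Ok(SameDayOrderTime { hour: hour })
            } else {
                Err(format!("same-day shipping closes at 11:00 (got {}:00)", hour))
            }
        }
    }

    // Anything accepting a SameDayOrderTime can assume the rule holds.
    fn schedule_same_day(order: SameDayOrderTime) {
        println!("shipping today, ordered at {}:00", order.hour);
    }

    fn main() {
        match SameDayOrderTime::new(9) {
            Ok(t) => schedule_same_day(t),
            Err(e) => println!("rejected: {}", e),
        }
    }

You still want one small test that the constructor encodes the cutoff correctly, but nothing downstream needs to re-check or re-test it.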


You've got my interest piqued now. How does that work in practice? And do you not still need to test that it complies with the actual business rules at some point?


How does that confirm to you that, say, your ratelimiter code is correctly processing the first three requests in a given minute, rejecting the fourth, and approving a fifth once another minute has passed?
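(For concreteness, the kind of test being asked about might look like the sketch below; the RateLimiter here is invented, and time is passed in explicitly so the test never has to sleep.)

    // A minimal fixed-window limiter, just enough to make the test runnable.
    pub struct RateLimiter {
        max_per_minute: u32,
        window_start: u64,
        count: u32,
    }

    impl RateLimiter {
        pub fn new(max_per_minute: u32) -> RateLimiter {
            RateLimiter { max_per_minute: max_per_minute, window_start: 0, count: 0 }
        }

        // The caller supplies the current time in seconds, so tests can fake the clock.
        pub fn allow(&mut self, now_secs: u64) -> bool {
            if now_secs >= self.window_start + 60 {
                self.window_start = now_secs - (now_secs % 60);
                self.count = 0;
            }
            if self.count < self.max_per_minute {
                self.count += 1;
                true
            } else {
                false
            }
        }
    }

    #[cfg(test)]
    mod tests {
        use super::RateLimiter;

        #[test]
        fn three_per_minute_then_reject_then_recover() {
            let mut limiter = RateLimiter::new(3);

            // The first three requests in the same minute are accepted.
            assert!(limiter.allow(0));
            assert!(limiter.allow(10));
            assert!(limiter.allow(20));

            // The fourth, still inside that minute, is rejected.
            assert!(!limiter.allow(30));

            // Once another minute has passed, requests are accepted again.
            assert!(limiter.allow(95));
        }
    }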


But only tests will tell you you are calculating something correctly.


But thats anathema to agile, yo. /s


So then why don't you just write the tests BEFORE you are about to refactor something or change some interface? Why write them at "launch time" of the feature?


Because if you haven't written them, odds are the code is not particularly well suited to be testable. Side effects out the wazoo, tight coupling, etc.


That only works if you run the tests often. Writing them is less important than running bits of your program often enough to discover how it really works.


Tests can be seen as a form of executable documentation, or an executable specification, a sort of self-enforcing contract.


N00b question, what's another technique? I mean if tests are assertions on assumptions is there something better?


This is the right answer. I will also point out that as a result of this property collaboration becomes easier.


> But we don't write tests to check if our code works.

So what methods do you use for that purpose?


I'll put the question to the other readers:

How often do you find, despite having written tests, that there is some bug in your software? And how many of those times did you think that you should have considered it beforehand, rather than that it would be impossible to foresee?

In my experience the most useful tests are the ones that came from some unforeseen bug, which was then fixed and a test case built around it, so that it wouldn't get "unfixed".

The least useful tests are the ones for cases you know not to invoke, because they are obvious. Like how, when you divide by a variable, you know it can't be zero. So you make sure it can't be zero, making the test case a bit moot.


Perhaps I'm just an outlier, but I think every time I've gone in to write tests for my code (usually after the fact rather than red/green testing), it's rather quickly invalidated some assumptions I made when writing the code.

"Oh yeah, that's going to need to be thread safe."

"Oh yeah, that might be nil. Rather frequently."

In my experience, testing has been much more beneficial as an exercise of my mental model of the code and its interaction than for refactoring. But to that end, I can think of quite a few times that I've been very grateful for a unit test suite while I did a large refactoring.


I've had a very similar experience.

It's worth its weight in gold when you can flesh out edge cases and logic issues before going all-in on an architecture, and being able to refactor without fear is a great bonus, but it's not always applicable.

Unless the majority of the tests are "external" (i.e. testing an API by calling its endpoints and checking the return), a major refactor is going to require updated tests, and it's all too common that I see people take the "easy" way out and fix the tests to conform to the way it's working after the refactor rather than the way it should work.


>Perhaps I'm just an outlier

If you are, then I am as well, because this is where a huge part of the value of tests come from - validating assumptions early and quickly and cheaply.


I agree; tread safety and race conditions hit me a lot. Other than that, test coverage provided a lot, I guess.


  > tread safety and race conditions
I feel like this comment wouldn't be out of place in an F1 thread.


I've found that this depends heavily on the software in question.

For most of what I do (basic line-of-business type webapps & servers) the vast majority of the code works on the first try and is obvious enough that I never break it; during development there tends to be a few sections that break repeatedly while I'm changing other things, and those are the sections I write tests for. (Generally this is a few percent of the entire codebase.)

On the other hand, more 'computer-sciencey' projects tend to need a correspondingly larger proportion of test cases. The most dramatic example of this was when I built a Lisp interpreter as a side project: this was the only software I've ever written fully test-first, and I don't think I would have been able to get it working at all without a full test suite that hit every line of code at least twice.


As a follow-up question, did you (or could you have) develop that lisp interpreter on the purest TDD approach of:

- Write a few tests that fail

- Write code until your tests stop failing

- Repeat until the program is done?

In my experience, problems that require writing the tests first normally require writing all the tests first. If you start solving only a subset of them, you will have to rewrite most of the code once you start looking for the other tests.


> did you (or could you have) develop that lisp interpreter on the purest TDD approach

Yes, that's what I meant by "fully test-first". I was following along with Paul Graham's The Roots of Lisp[1], so I had a convenient set of pre-made test cases. I would start by copying his examples into a test case:

    it('exec atom on a symbol is true', () => {
        var symbol = Symbols.get_symbol_named('foo')
        expect(exec(empty_scope, [natives.atom, [natives.quote, symbol]])).to.equal(true)
    })
    it('exec atom on a non-empty list is false', () => {
        expect(exec(empty_scope, [natives.atom, [natives.quote, ['a']]])).to.equal(false)
    })
    it('exec atom on an empty list is true', () => {
        expect(exec(empty_scope, [natives.atom, [natives.quote, []]])).to.equal(true)
    })
And then write my code:

    natives.atom = new Native(function (item) {
        var i = exec(this, item)
        return (i instanceof Array && i.length == 0) || Symbols.is_symbol(i)
    })
Once I got through the paper that way, I had a complete Lisp interpreter working, and wrote a few small programs in it with no trouble at all.

> In my experience, problems that require writing the tests first normally require writing all the tests first. If you start solving only a subset of them, you will have to rewrite most of the code once you start looking for the other tests.

This has been my experience as well. As I say, this is the only program I've ever written fully test-first; I think it was only possible because I already knew exactly what the inputs and outputs were going to be (having them well-specified by virtue of being a Lisp) and which ones I needed to implement to be able to get later ones working (with the help of the paper).

Generally I find it's not possible to work this way; my usual approach for solving problems where the code isn't immediately obvious is to write (and generally rewrite) in parallel:

- A set of usage examples and/or documentation, to clarify what exactly I'm trying to do and make sure the interface makes sense.

- The actual implementation (or, in early stages, some pseudocode).

- The tests (if any), to check that what I've written actually works. Often these are the same as the usage examples.

Frequently, questions raised while writing one of these will influence the others, often significantly; often I'll only think of an edge case while I'm writing the code and then have to go back and add it to the tests, or realize that the code could be simplified by changing the way the API works.

[1]: http://www.paulgraham.com/rootsoflisp.html


I'd wait until the Lisp interpreter works significantly and then write reasonable code in Lisp like:

  (it (atom 'foo) t)
If it fails, it just says something like:

  test failed: (atom 'foo) returned nil; expected t.
So no need to have a string there. Save the "blub-level" testing for things that are not testable through Lisp. (Or, really, things you are justified in not wanting to expose such that they are testable.)

(Sure, it could be the case that both the interpreter and the atom function are wrong in such a way that the test passes. That level of breakage isn't likely to get past much of a significantly detailed test suite.)


If I'd waited for the interpreter to work before I wrote tests, I'd never have gotten a working interpreter. For atom the implementation is obvious, but the more complex functions were riddled with bugs on the first implementation, and the lexical scope handling took about three complete rewrites before I got it right.

If this had gone any further than the three days of off-hours time I spent, I certainly would have rewritten the test suite in Lisp; however, it was primarily an academic exercise to really 'get' how Lisps worked and push my comfort level; writing all of my tests in the language I was trying to write would have pushed things a bit too far.

In fact, I never got around to writing a parser, so even if I'd written them in Lisp it would have looked like this:

    var Symbols = require('../Symbols')
    var natives = require('../natives')

    var foo = Symbols.get_symbol_named('foo')
    module.exports = [
        natives.tests,
        [natives.it, [natives.atom, [natives.quote, foo]], true],
        [natives.it, [natives.atom, [natives.quote, []]], true],
    ]


I did this with brainfuck.


There should be a law banning people from asserting that constants have the expected value in tests.


I was taught that there should be tests against things like class constants so that you have a failing build in the event that someone changes a const that's part of the specification.

This was from some TDD course that my employer brought in where they emphasized that consts that are part of a specification or requirement should be tested against to prevent them from being accidentally changed or changed without full consideration of side effects/review against the program spec.

Is this what you're referring to?


I guess so but I disagree with that teaching. All it does is make me change it in two places.


That's sorta the point - to make you think before you change. A constant without a test is part of the implementation, and you can refactor as necessary to make the implementation perform better. A constant with a test is part of the interface, and if you change it, client behavior has changed and the tests should change with it.
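A contrived sketch of what that looks like (the constant and its "published" value are made up):

    // Part of the public contract: clients are told retries happen at most 3 times.
    pub const MAX_RETRIES: u32 = 3;

    #[cfg(test)]
    mod tests {
        use super::MAX_RETRIES;

        // The test exists to force a deliberate decision: changing the constant
        // changes documented behavior, so this test (and the docs) must change with it.
        #[test]
        fn max_retries_matches_the_published_spec() {
            assert_eq!(MAX_RETRIES, 3);
        }
    }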


This is true only for a very particular kind of constants, like the ones defining API error codes (eg. define PAGE_NOT_FOUND 404). And those should be covered by integration tests, not unit tests.


Yes, OK, but what I'm telling you is I think it's just as mindless. If anything, I look less at the tests when I know I'm going to break fifty of them with any real change at all. "Yeah, yeah, whatever" and copying and pasting the new values is the more likely behavior in this scenario.


Do you mean test explicitly and fail gracefully? So even test over invalid inputs, but check that the code fails with the right exit?


I mean if you have a method whose definition is literally just "return 'abc'" and then you have a test assert that that method returns 'abc', knock it off.


I call those types of test, "testing that the compiler works." Not really something you should be testing! It also applies to testing code that is just a wrapper around third-party libraries.


True ;-)


I see bugs in my code very rarely nowadays, and they're always things I should have seen beforehand. In my experience if you're seeing edge cases at all they're a sign that your model is wrong - the business processes we're replacing with software are always simple, there's really no reason for code to ever be complex. The problems we get are those we introduce for ourselves by overcomplicating things in code.

I do think of myself as using TDD, but I use the tests to drive the model and often end up not needing the test - a bit like https://spin.atomicobject.com/2014/12/09/typed-language-tdd-... . With experience, more and more I'll skip past the writing a test and deleting it after, and go straight to encoding the properties that I actually need in my types.


Testing for me is never primarily about getting it right the first time. Although writing tests in parallel with coverage can definitely help to validate your assumptions, particularly with negative testing that you might not test manually.

The greater value of the tests is later on, when you need to make a change, and want to know whether you broke some existing functionality. Or, once you have a framework of tests around something, being able to quickly write a test that fails due to a discovered bug, and knowing that you fixed it by passing the test.


From the receiving end (long history of "ops" work), I've seen that a lot of the code itself tends to come out really well and the developers who write the code are a lot happier.

That said, many of the feature requests from the ops side end up taking an extremely long time to produce, if they're produced at all. Very basic requests that would save the team dozens of hours per week drag on far longer than they should. In these sorts of shops, the ops side ends up pretty miserable and builds up a lot of frustration toward the development team, which drives the two groups apart.

More recently, the ops team is being asked to write tests for ops' prod bugfix automation. Categorically speaking, we're not developers, so it tends to take us longer to write tests than the devs who write them daily. Issues pile up and span months rather than days. The reasoning behind the tests is understandable, since it helps build assurance that our scripts won't break the world. Still, the ops team is battle-hardened enough to know that if we break it, we have to fix it - it's in our best interest to write code that's as reliable as possible. I don't think any of us would be in our roles if we didn't understand that. Frustrating stuff.


Yes, and a special case of this, which is rarely correct in "I don't get paid for tests" code, is tests that make sure errors are reported or handled correctly. It's rarely a good idea to ship a NullPointerException just to see if you correctly turn it into an error code or something.


I TDD a fair bit, and both kinds of tests are useful. I do really love the tests that I write to replicate a regression bug. The other types of tests are more like living documentation of my internal APIs for future developers, and are still useful.


you're describing two different types of tests.

one kind is the acceptance test, where you write a test that specifies correct behavior under normal circumstances (including expected error scenarios).

the other kind is a regression test, where you reproduce a discovered bug in a test environment, then fix it, and make sure that the test covers the fixed code to confirm that the bug is no longer occurring.

both types of tests are important and for different reasons. they aren't mutually exclusive and they aren't addressing the same problem.
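a toy illustration of the difference (the slug() function, the shipped bug, and its fix are all invented):

    // toy function under test: turn a title into a url slug.
    pub fn slug(title: &str) -> String {
        title.to_lowercase()
            .split_whitespace()
            .collect::<Vec<_>>()
            .join("-")
    }

    #[cfg(test)]
    mod tests {
        use super::slug;

        // acceptance-style: specifies correct behavior under normal circumstances.
        #[test]
        fn words_become_dash_separated_lowercase() {
            assert_eq!(slug("Hello World"), "hello-world");
        }

        // regression-style: reproduces a bug that once shipped (say, leading
        // whitespace produced a leading dash) so it can never quietly come back.
        #[test]
        fn leading_whitespace_does_not_produce_a_leading_dash() {
            assert_eq!(slug("  Hello World"), "hello-world");
        }
    }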


A similar question can be: which kinds of tests fail most frequently? That's a good insight into what kinds of tests your team should focus on.


>>The least useful tests are the ones for cases you know not to invoke, because they are obvious.

Maybe you know not to invoke them because you wrote those features, but what about the rest of the team? Or the poor schmuck who takes over your codebase after you leave?

Test cases help future-proof your code against the unknowns and the unexpected. I think of them as guard rails on a path. When the path is on straight terrain, they aren't necessary. When it is crossing the side of a cliff though, you're glad they're there.


People get too hung up on the question "to test or not to test" instead of asking the question "where and when should I test".

I started my career writing iOS clients, and the obsession with TDD was baffling. 80% of my code was usually either UI or simple Core Data manipulations, while the last 20% was mostly API parsing and a touch of business logic. I wrote a few tests for parsing corner cases or business logic, but they never really gave me any confidence or helped with refactoring, instead taking up time and adding overhead whenever I made changes. I supposed I didn't have enough coverage to get the benefits, but what tests would I write for my UI? What tests would I write for simple Core Data queries (which is assuredly unit tested already)? What tests would I write for my parsing libraries (which are already unit tested)?

Then I started working on the (Python + Flask) API backend, and tests were self-evidently necessary. Python is dynamically typed, which can result in lots of corner cases when doing simple data manipulations. Python is interpreted, which means the compiler/IDE won't warn you about syntax issues without running the code, and you can't catch even the simplest logic errors without running the function. Most importantly, the API's entire job is translating data, inputs are in the API parameters or database, and output is the JSON. It's a perfect function, and tests were obvious. I wrote something like 600 in a week, then used them to make some major refactors with confidence.

What I learned from those juxtapositions was that unit tests and automation are invaluable in certain circumstances. Specifically _any system that creates machine-readable output_ like JSON, populating a database, or even a non-trivial object factory should be unit tested like crazy. Any system that creates human-readable output, like views or changes in an unreachable database (something like an external API or a bank account) needs to be human-tested, there's just no way around it.


Mostly anything a human can test, a machine can test even if it requires end-to-end UI-automation integration testing. If there's a test that's part of the suite that is worth running every time something changes, it's simply a matter of cost: How much time will automation take, how long will the automation last, what is the cost of maintenance, is that less than the cost of a person doing it manually?

People tend to be bad at forecasting these kinds of costs; it's easy for prejudice for or against test automation to cloud one's ability to be impartial in the forecast.


As someone who worked as a dev (4 yrs), later moved to testing (4 yrs), and finally returned to dev again:

Here are my personal thoughts/experiences:

- The testing job is underestimated.

- In general, developers are considered superior to testers.

- What makes the tester position difficult is repetitive tasks. Yes, you can automate tasks, but you still need to do some tasks that can't be automated. These are manual, repeated tasks, often boring.

- Some developers are so lazy. For ex: while testing we found a 'python syntax issue'!

- Management thinks testers can be replaced once they've automated everything. Obviously they push for this.

- I know for sure there are projects with passionate developers where no one really takes care of the testing side.

- Devs underestimate/avoid unit tests and rely on the testing team to find basic issues.


Don't blame the devs completely. Not many bosses give out raises or bonuses for good testing. Or even for not shipping bugs. Being a hero and fixing the bug you shipped gets much more visibility and accolades.


I'm not blaming all devs, just most devs :) But I guess there is a widespread belief among managers/bosses that testing is secondary to development.

I've seen testers who find bugs and also fix them, but they don't get the credit they deserve.


There is a quote out there about how people who put out their own controlled burns get more visibility than those that are sweeping up dead leaves and paper from the factory floor. I can't remember where I saw it or who said it.

That said, I do remember this one: "an ounce of prevention is worth a pound of cure." - Ben Franklin.


I've noticed that, too. Shipping bug-free software almost hurts your career.


At the last company I worked for, we used people completely unrelated to our field.

They were cheap, and since they didn't know a thing about the software, they would mindlessly click through the test plans and find many bugs.

The problem with them was that they had a "half-life": once they learned too much, their bug finds would decrease, because they would get more careful when using the software.

Often we would simply get 2-3 students from a non-technical field and replace them with new ones after 1-2 semesters.


It depends on the context, but of course developers are considered superior to the tester. That's like saying doctors are considered superior to nurses. If you just look in terms of knowledge required, the amount you need to know to be a developer is significantly higher.


IMHO, that's inaccurate. The generalization here is that a tester can only do user-end (black-box) testing. There are jobs which require the tester to do white-box testing. That involves knowledge as well as repetitive work.

What I found was that most developers are unwilling to do even basic testing of their own modules. Even if you find bugs with logs and reproducible steps, they push their dev tasks back to the tester, like "can you perform 'git bisect' to find out which patch caused this bug?" or "can you run 'tcpdump' from your reproducible steps?", etc.

The worst case is when software "design flaws" (for example, an arithmetic program missing the division operation) are blamed on the testing team for finding them late.


I take issue with that analogy. In a modern hospital, doctors and nurses work together - one can't do their job without the other and they have different training and approaches to patient care. The doctor doesn't simply have a superset of knowledge that encompasses everything a nurse knows or is trained to do. So although there might be some strict hierarchy in terms of who is in charge of patients care, it's simplistic and disrespectful to say doctors are superior to nurses.

Source: have worked in hospitals, raised by MDs, have asserted similar things in the past and have been corrected by trusted mentors.


  > for ex: while testing we found 'python syntax issue!'
Do you not at least have CI to run unit tests?!


He claimed he ran the unit tests, but later made a small change to the code and thought it was unnecessary to re-run them.

The exact conversation was something like: 'I tested the code & after that made a small 1-line change & looks like I missed something, sorry about that, pls don't tell the bosses :P'


I would strongly suggest you persuade the powers that be to set up some sort of automatic testing. It's not difficult to set up and it will save you from so much heartache down the road.


Yes, that's right. This happened a few years back while I was working as a tester. While I was in QA, we focused more on automated testing. I've since switched companies and later moved from QA to the dev team.


Testers to programmers is like cashiers to cooks at Mickie Dees. I think you are resentful that your job can be automated and thus, the value of what you do is quite low.


Automated testing isn't really a substitute for actually having a person use the application.


Are you suggesting a Mcdick's "cook" position cannot be automated?


I believe the implication is that most of the cashiers will be completely replaced before it is worthwhile economically to try to reduce the number of cooks.

Automats have been around since 1897.


Testers are like McDonalds cashiers. They will eventually assimilate with the self checkout. Now, they may be able to get a job servicing the soda machine if they count their blessings.


And that's a fantastic observation.

When you get out a bit of the IT world, you'll find that people who demand software want to receive something: software. They bought an app, they want the app. Simple as that.

If you are good enough to have your code working without tests, good. If you don't need documentation, good. If you paint your walls with use cases, good. All that doesn't matter, if you deliver the app you were hired for.

And if your app doesn't work... well, everything you've done doesn't matter either. Because you were hired to deliver an app.

Of course tests are good, documentation is good, self-documenting code is good. But only for IT. The non-IT person who's contracting you (it can be your own company, too) just wants the app. The software. Working.


Sure, but they also expect that the 'non-software' bits work OK/that you've thought of not only that the software works now, but at some point in the future as well.

If you tell people "You told me you wanted software, not MAINTAINABLE software :-)" they'll say "Aren't you a professional? Shouldn't you just be doing this stuff? Isn't it just implied?"

So yes, they're paying for 'the software', but the tests are part of it, maintainability, as well as things like security and scalability are things that should be considered as well as just whether the software 'works'.


Yes, I do agree that it's expected. But I also bet that very few non-IT people have the slightest idea of what it requires. So, let's say 6 months after the contract ended, they need some update.

They'll go to the market and ask how much it costs for someone to add a new piece of functionality. They'll get $100, $90... and your bid. You know you have all those tests and documentation properly done, so you can charge just $10 and win. Or you can charge $70 and also win. It's up to you, because you know how hard or easy it'll be.

And if the software was made by someone else, who charged less in the beginning? You don't know how much work it'll be, so you have to charge a bit higher, like that $90.

So, from the client's perspective, the difference between well-tested software and barely-working software is just the initial price. After all the contracts are over, they (remember, non-IT) can't know if the new functionality has a fair price or not. And whoever didn't develop it can't be sure about the maintainability, either.

Having tests (and everything else) was good for you, for your future. But, in general, the client didn't know all this. They just paid for the software...


That only works for "app as a product" deals, in which someone pays you to deliver them something. For many (most?) cases, the development is continuously ongoing, you have a salary and work as part of a team. Clear code with tests makes development more efficient, and is a better use of your employer's dollar.


> "If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it."

But for those of us who work on a team, it's far more complicated than that, and you have no idea who might be touching your code in the future.


"When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong."

I think that addresses the point quite nicely?


Sort of, but imagine this scenario:

I very rarely have issues in problem space A, but I don't really test for A. Maybe a few here or there but it's a small minority of my tests and the coverage is probably just incidental. I test heavily in problem space B for whatever reason.

I get hit by a bus, and you're my replacement.

You have no experience with problem space A, so there are few if any tests around that space. All of a sudden the coverage is misleading because we might have 100% coverage or close to it, but the thoroughness isn't there and suddenly releases are less reliable and support ticket volumes increase.


Not necessarily. In many firms the turnover rate is significant enough that you have no idea who will be working on your code in six months or so. Unless you are on the sort of team that never changes, you can't really use your past experience as a guide for a future team's strengths/weaknesses.


If your goal is to test every possible scenario now or future I think you'll eventually find it's not realistic.


Sorry, my thoughts stopped short. It does in part.

But the collective mistakes of a team aren't a predictor of potential future members of the team. For example, we started hiring "junior" programmers, which increases the skill gap.


It does if the team and their strengths/weaknesses stay the same for the life of the codebase. If they change, then that property goes away to some degree, with missing coverage against someone's weaknesses. Better to have QA cover correctness against whatever would undermine it.


Also: how long do you maintain it.

Even if I'm alone on my team, when I refactor code after, say, 10 or 15 years, I might as well be another person. I don't remember. I don't have the same skill set.

And the thing is: when you first write the code you usually don't know how many years you will maintain it. That might be a reason to postpone some test writing initially (if the code is scrapped in 2 months, what was the point of the effort for maintainability?).

However there are upfront benefits with writing at least some tests for everything, in that it (usually) helps design a less coupled system.


I aim to test for the version of myself that's having a bad day (tired, rushed, wrong caffeine level). That's a better target market than the idealized version of myself that makes only expert-level errors.

That seems to cover the team element, too. Even if I'm not having a bad day, someone else messing around in my code might be.


> That's a better target market than the idealized version of myself that makes only expert-level errors.

Not only that, but you have to be able to test against yourself a month from now or a year from now, when you've completely forgotten some aspects of the design and might overlook something doing a modification. I write a lot of report generators at my job, so there's quite often a bug or a new feature in a report I haven't touched for months at a time.


To be fair, the person quoted by the author does reference this in the article. He says he changes his frame for how much he tests based on his team's historical challenges.


That was my favorite line in the article, and one of the most resonant thoughts about programming I've seen in a while. It's a big part of the reason I choose to mostly work alone, except when I really need the money. There's a mode of thinking and working that's both joyful and effective, and a lot of it is about which particular things are actually concerns/strengths/weaknesses for me. Every time I work on a team, I have to contend with the union of everyone's concerns: the juniors, the seniors, the managers, their managers.... That mode is stress and misery.


Quite.

There does seem to be a lot of resistance to the idea of handing problems to solo programmers right now though. Any hints as to how you've made this work?


Literally addressed two sentences later.


I asked my teammates one time why we had tests. They just didn't answer. For them it was just a dogmatic approach.

That doesn't mean you should not have them. But you should at least be able to answer that question, to be able to evaluate the value they bring and how many tests you need and where.


I like Beck's vision for the future, and I agree that we should keep experimenting in order to learn which tests tend to work and why. But we don't need to do it all manually -- we can use computers to automate and speed up such experiments. To that end, I've started a project to automatically generate unit tests from C++ source: https://github.com/dekimir/RamFuzz

Right now the generated tests are pretty superficial and silly, but the key is that they are randomized. Because of this, we can run millions of variations, some of which will prove to be good tests. Right now I'm hacking an AI that will pick those good instances from many random test runs. If it works, we'll be able to simply take source code and produce (after burning enough CPU cycles) good unit tests for it. This will be a huge help in letting the human programmer only do "enough" test writing -- the AI will take care of the rest. Additionally, the solution can be unleashed on cruft code that no-one dares touch because of a lack of tests and understanding.

(Yes, there will be a business built around this, but that's for next year. :)


It doesn’t make sense to write no tests at all but I understand this sentiment based on problems with testing that I have seen before:

- 1. Test infrastructure is too complex. If I have to create a bunch of config files, obey a questionable directory structure, etc. before I can even write my test case, there is a problem. There should be very little magic between you and your test front-end.

- 2. Test infrastructure is too lacking. It is also a problem to have too little support. There should be at least enough consistency between tests that you can take a look at another test and emulate it. There should be clearly-identified tools for common operations such as pattern-excluding "diff", a "diff" that ignores small numerical differences, etc. depending on the purpose.

- 3. Existing tests should not be overly-brittle. Do NOT just "diff" a giant log file (or worse, several files), and call it a day; that means damn near any code change will cause the test to “fail” and create more work. Similarly, make absolutely certain when you develop the test that it can fail properly: temporarily force your failure condition so you know your error-detection logic is sound.

- 4. Tests should not be overly-large. Do not just take some entire product and throw it at your test, creating a half hour of run time and 40,000 possible failure points just because it happens to cover your function under test. It is vital to have a small, focused example.

If your test environment has problems like these, I fully understand the desire to balance time constraints against the hell of dealing with new or existing test cases, and wanting to avoid it completely.

And if you’re in charge of such an environment, you owe it to yourself to devote serious time to fixing the test infrastructure itself.
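
To make point 3 above concrete, here's a rough sketch (TypeScript, Jest-style runner assumed; generateReport and its fields are made up purely for illustration):

  // Hypothetical function under test.
  declare function generateReport(items: { amount: number }[]): { total: number };

  // Brittle alternative: diffing an entire report means any cosmetic change "fails" the test.
  // Focused version: assert only on the facts this test actually cares about.
  test("report totals the line items", () => {
    const report = generateReport([{ amount: 3 }, { amount: 4 }]);
    expect(report.total).toBe(7);
    // Sanity-check the test itself once: temporarily change 7 to 8 and
    // confirm it fails before trusting the green result.
  });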


I think the biggest problem with TDD is that there are two types of code, trivial and non-trivial.

I think testing trivial code is a waste of time and does nothing but improve coverage numbers.

When you think about a non-trivial problem to write tests, you don't always know what the final code will look like. Maybe you forget an edge case or some small detail in the requirements that will cause you to restructure the code and approach the problem in a different way. In which case, you now need to re-write your tests. You might as well just write tests around the final version.


What happens where trivial code later needs to grow? What happens when trivial code is invoked by non-trivial code and you need to make a change?

If it's truly trivial code, you should be able to test it trivially, so I'm a little unsure of why this becomes a make-or-break issue for some people. Pretty sure more time and energy is wasted determining if code is trivial and needs to be tested versus just testing the damned thing ;-).


Trivial code does not grow, it gets replaced. If the replacement is non-trivial, test it. If it's trivial, don't. Either way, you've saved some useless tests that would need to be rewritten on the first change.

Trivial tests may be trivial, but they are numerous: their need grows exponentially with code size. And they generate almost all the false positives you will get.


Tests are code and code is a liability.

The DirectX12 api is non-trivial code; their testing has little in common with TDD.


Depends what you use to categorise something as trivial - some functionality may be trivial technically but have an incredible impact on end users. Is that trivial or non-trivial?


Maybe you forget an edge case or some small detail in the requirements that will cause you to restructure the code and approach the problem in a different way. In which case, you now need to re-write your tests. You might as well just write tests around the final version.

Maybe, but I'd wager on the whole that you probably don't. Your surface API should be the same, and you shouldn't be testing internal implementation anyway. If you write clear and consistent interfaces, then forgetting a requirement will mean adding a test – not rewriting them.


you shouldn't be testing internal implementation anyway

Why not? When my big black box stops giving me the expected answers I want to know exactly which cog inside that box that broke.


My view would be that if you are struggling to get useful information out of tests to that extent, then your black boxes are too big!

I agree that there's nothing worse than a test that says "your stuff broke" and provides no additional information why. But there's a balance between that, and testing the internals of an implementation.


Because tests should not be more verbose than the code. Refactoring the sumArray method to use map instead of a for loop shouldn't break tests. It is terrible, terrible practice to take internal implementations and then stub their callees just to make sure that the code that was written got executed. I saw this a lot at Amazon. It's a fundamentally moronic type of testing and is absolutely useless. All it serves to do is make any kind of refactoring or change twice as burdensome.
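
To make the sumArray example concrete, here's a minimal TypeScript sketch (Jest-style runner assumed): the first test checks observable behaviour and survives a for-loop-to-reduce refactor, while the second is coupled to the implementation and breaks for no real reason.

  function sumArray(xs: number[]): number {
    // Could be a for loop, reduce, or anything else; callers shouldn't care.
    return xs.reduce((acc, x) => acc + x, 0);
  }

  // Good: asserts on observable behaviour only.
  test("sums the elements", () => {
    expect(sumArray([1, 2, 3])).toBe(6);
    expect(sumArray([])).toBe(0);
  });

  // Bad: spies on *how* the sum is computed, so an internal refactor "breaks" it.
  test("uses reduce internally", () => {
    const spy = jest.spyOn(Array.prototype, "reduce");
    sumArray([1, 2, 3]);
    expect(spy).toHaveBeenCalled();
    spy.mockRestore();
  });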


And yet it's extremely valuable when you want to test code that deals with externalities (e.g., I don't need to actually insert a row into a DB over the wire to make sure my biz logic works, and in some cases, like error-handling testing, it'd be way harder to actually create the situation than to simulate it), and it tends to give you a baseline set of reassurance that can run very quickly. So IDK about just hauling off and labeling it as "moronic".


Checking for an insert into the database in the most general fashion is okay. Checking for the exact dynamic data?? You are basically copying your code.


I'm just saying I can cut test time dramatically by not testing that X DB actually can do its job in expected ways, and those tests can run sans networking, so unexpected situations won't cause false negatives. Both things are fine and they go very well together, but each lets you do different things.


Exact tests on implementation in the "db was updated properly" category are usually classified as integration tests. There's nothing wrong with those. But one shouldn't test the specifics of internal method code for small units.


> Refactoring the sumArray method to use map instead of a for loop shouldn't break tests.

What kind of test could you possibly write that would break because of this? Something that measures time or memory usage, or uses reflection?

If you have clearly defined interfaces between significant internal components (which you should), why not test that they do what you ask? Public vs. internal is often more about what is exactly the client of the code, not the code's function or structure itself.


A stub on the iterator interface would do it... but like I said, a terrible use of testing.


What do you mean, you should not be testing internal implementations anyway? Does that also mean you don't do unit tests, only integration or end-to-end ones, because unit tests are tests of the internal implementation of your software?


Not quite – most projects have internal interfaces which are relatively stable, and tests should be against those, rather than the internal implementation of those modules.

Unit tests are great, but they should still be testing at the boundary of code modules – if changing the internal implementation of that module breaks a test, it probably isn't testing at the correct level.


He means you test pure functions and mutations. You don't stub the data sources that your implementations use just to see whether they were called. Tests need to be about enforcing correct interface exposure, not exact implementations. It's one of the most facepalm types of useless testing that I've seen littered around Amazon's Seattle teams, driven by managers wanting to make sure the data sources called in every method actually got called and that there is 100% coverage. It serves no purpose except to basically make two copies of the code: the test as a doppelganger of the actual code, but using stubs.


I'm glad to read that. In my opinion, the problem starts when tests become a religion, e.g. forcing unit tests everywhere, whether it makes sense or not: just put in tests, so that if anything goes wrong, you can use the excuse of "it fails, but at least it is test-covered".

In some cases unit testing is necessary, e.g. for ensuring that a hash function works exactly as defined. However, there are other cases where unit testing is absurd, and black-box API tests or automated tests could do a better job on error coverage. As an example, imagine the Linux kernel filled with unit tests everywhere: plenty of unit-testing religion fun, but no guarantee of getting anything better, and a risk of new bugs because of the changes and increased code complexity.


You don't understand what tests are for. Tests don't give a shit whether your code works today - you write them to freeze today's state of the code (it doesn't matter whether your code is correct or not).


I do. My point is that not all code is of the same kind. E.g. for the case you point out, I do extensive unit testing for synchronous code with inputs and output(s) that does significant stuff, in order to avoid breaking past things with new changes. You can check that I try to honor what I say, here: https://github.com/faragon/libsrt/blob/master/examples/stest...

However, there are cases where unit testing is not suitable, or it is not a guarantee, or it is an additional risk, e.g. event-driven or low level stuff, multithreaded code, etc.


Multithreaded and async code needs tests even more! It's much more difficult to test, but it doesn't mean such code shouldn't be tested. Where code is complicated, chances to create unintended changes are higher.


Sure. I was arguing against "religion" ("put unit tests everywhere, for every single thing"), not being "anti-test". There are many kinds of tests.


"I get paid for code that works, not for maintainable code."

Ah, I get it. That explains the piece of s--- I'm looking at right now.

That said, the title might be sensationalist, but I agree with the holistic sentiment of the text.



Ha, good point.


There are so many unneeded tests being written I can't even begin to point them out. Here's an example: http://entulho.fiatjaf.alhur.es/notes/the-unit-test-bubble/

I've seen dozens of GitHub repos with a "tests/" directory that only contains tests for the constructor and ignores all the parts that should be tested. You don't need to test a constructor; this is stupid. If your constructor is not working, none of the other tests will pass -- BUT HEY, your constructor is working, it is not hard to see it.


One benefit of testing is that it can highlight whether your abstractions make sense. If you need to pull in the world to test a small module then probably your dependencies are not right and what you thought was a unit turns out to be more than that.

When I am writing a module/function, I tend to continuously think of how this can be tested, which helps me design better abstractions.

For example, if you're writing a class that uses socket read/write, when testing you probably need to mock them. If you weren't planning on writing tests, you'd probably have ended up embedding the methods in the class itself as read/write/close, when those methods don't belong to the class and should live in another module, say Socket, that implements a Socket interface. Now that you have a Socket interface, it becomes easier to test your class by passing in a mock Socket.
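
A minimal TypeScript sketch of that idea (the Socket and GreetingClient names are made up for illustration; the mock assumes a Jest-style jest.fn()):

  interface Socket {
    write(data: string): void;
    read(): string;
    close(): void;
  }

  class GreetingClient {
    // The socket is injected, so tests can pass a fake instead of a real connection.
    constructor(private socket: Socket) {}

    greet(name: string): string {
      this.socket.write(`HELLO ${name}`);
      return this.socket.read();
    }
  }

  test("greet sends the name and returns the reply", () => {
    const fake: Socket = {
      write: jest.fn(),
      read: () => "HI THERE",
      close: jest.fn(),
    };
    expect(new GreetingClient(fake).greet("Ada")).toBe("HI THERE");
    expect(fake.write).toHaveBeenCalledWith("HELLO Ada");
  });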


The title quote is a bit out of context...

> I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it. I do tend to make sense of test errors, so I’m extra careful when I have logic with complicated conditionals. When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong.


Mixing up business and tech.

On the business side, you don't get paid for code at all. You get paid to make something people want. The fact that you're using programming to do that is inconsequential.

On the tech side, you're not delivering anything unless somebody, somewhere can test it, even if only one time.

So yes, you are getting paid for tests. In fact, that's the only thing you are getting paid for. The nub of the question is what the tests look like and how many you should have.


It's like a carpenter saying "I don't get paid for load testing table tops, I get paid for durable furniture."


Absolutely. There's some subtle wordplay going on here -- frankly it's probably done on purpose to draw out a lot of public discussion.


> You get paid to make something people want.

Depends. As a coder for hire sure. When I am a company employee though, I tend to think one level higher as a person who solves problems. There have been times that the business has thrown a problem on my desk thinking they need software, when all the business really needed was a process change.


The point is that there is a limit to testing. And some people go way overboard with it. You'll never get 100% coverage. It's simply not possible.

Now, that doesn't mean you should not test. It means you should understand the limits of unit testing and test what is important as best as you can. Most every software engineering class at universities will cover this in-depth.


Of course you can get 100% coverage if that's your goal. SQLite, for example, is a big project with 100% coverage:

https://www.sqlite.org/testing.html

If you want rock solid software you need to spend the time to properly test it.



Great, you've proved that 100% test coverage doesn't mean 0% bugs. The actual question is whether there is a significant difference in reliability in software with 80% test coverage compared to software with 100% coverage.


Agreed. 100% line coverage doesn't mean that all functionality is tested; you don't test the full range of values every variable can take on, all possible throw-catch pairs, etc. So the question is whether the difference between 80% and 100% line coverage is that big. In fact, I think it can be deceiving to use the term '100%'. Arguably the reliability of your program can be increased from 100% line coverage by adding an extra non-tested value check, effectively lowering the coverage.


Yup and also, how about missing code? Even 100% isn't enough :)


at which point we first need to talk about the type of coverage ... tree coverage or line coverage (or something else?)


In the case of SQLite, it's 100% branch coverage and 100% MC/DC coverage: https://www.sqlite.org/th3.html


Indeed, there are diminishing returns as you increase coverage. I personally shoot for over 80%. You may try to hit 90%, but it is going to significantly increase the development cost. You end up writing a whole unit test just to cover one trivial line.


More precisely, there's a very strong power law in the returns from testing; therefore the frequency of testing is also very important. You don't want to test just to create QC theatre; that's just narcissism. But most of us err on the side of laziness. The empirical data doesn't say Agile works, it says frequent testing works.


For me, the takeaway from that article is this:

> Different people will have different testing strategies based on this philosophy, but that seems reasonable to me given the immature state of understanding of how tests can best fit into the inner loop of coding. Ten or twenty years from now we’ll likely have a more universal theory of which tests to write, which tests not to write, and how to tell the difference. In the meantime, experimentation seems in order.

Indeed, we still "don't know" how to test—more generally, and given the abundance of methodologies and their tendency to go through a hype and dump cycle, I would say we still "don't know" how to write code in the first place.

We'll get there eventually, but for now I would take whichever approach, methodology, tools, and language that I use as having a "best before" date, and invest in it accordingly.


I don't write tests for all my code. But if I'm not writing an automated test (frequently when I'm writing a single-use script, which is something I do a lot, as I'm merely a hobbyist), I still "test" my code, function by function, at a REPL.

I've long since learned the hard way that if you don't test the functions as you write them, the bugs get buried in the system, and become very hard to find. When you test your code as you write it and modify it (formally on larger projects, informally on smaller ones) this doesn't happen.

That's the advantage of tests, so I can get Beck's point: if the function is so painfully simple that you already know it's right just by looking (say, an accessor), then it's not worth writing a test for it.


Seems pretty simple, honestly. I've written enough unit tests to throw in my 2 bits.

Put a sane, normal value test. This will pass unless shit's broken.

Then test edge cases. Test min, max.

Then test some impossible values. If they correctly fail, you pass.
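
For instance, a minimal sketch of that progression in TypeScript (Jest-style runner assumed; clampPercent is a made-up function that clamps a number to 0-100):

  function clampPercent(n: number): number {
    if (Number.isNaN(n)) throw new Error("not a number");
    return Math.min(100, Math.max(0, n));
  }

  // Sane, normal value: passes unless something is broken.
  test("normal value", () => expect(clampPercent(42)).toBe(42));

  // Edge cases: min, max, and just past each boundary.
  test("edge cases", () => {
    expect(clampPercent(0)).toBe(0);
    expect(clampPercent(100)).toBe(100);
    expect(clampPercent(-1)).toBe(0);
    expect(clampPercent(101)).toBe(100);
  });

  // Impossible value: should fail loudly rather than produce garbage.
  test("impossible value is rejected", () => {
    expect(() => clampPercent(NaN)).toThrow();
  });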


Or, you know, write property-based tests instead, so you only need to worry about the logic and not the test values.

I've always found that if you let the author of a piece of code decide on what value that code should be tested with, he'll test for the edge cases that he's thought of (and dealt with), not the ones that actually crash production.


I think both are important. Use property-based tests to fuzz for mistakes you might have made. Use hardcoded value tests to ensure that common future breakages of edge cases are caught, and to validate that your property-based tests are even doing anything.

Usually, property-based tests are more complex, which means that the chance of a logic error in the test increases, and having something easy to verify hardcoded brings peace of mind.


On the other hand, if you ask the author to think of the edge cases first, they're more likely to list them and then write code that handles them. Still no guarantee, but better than writing tests for code you just "finished".


But still, you end up with tests for what the author feels his code should handle, not necessarily what the real world is actually like.

Don't get me wrong, that's still valuable - if only for the non-regression aspect - but I feel property-based testing is a superior approach.

Write your code ("this is a function that sorts lists of ints"), write a property ("when given a list of ints, after sorting it, elements should be sorted and the list should have the same size"), let the framework generate test cases for you. Whenever something breaks ("lists of one element come back empty"), extract the test case and put it in a unit test to make sure the same issue doesn't crop up again in the future.
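
A sketch of that workflow in TypeScript, assuming the fast-check library with a Jest-style runner (mySort stands in for whatever sort you're testing):

  import fc from "fast-check";

  declare function mySort(xs: number[]): number[]; // hypothetical function under test

  // Property: sorting preserves length and yields non-decreasing elements.
  test("mySort sorts and preserves size", () => {
    fc.assert(
      fc.property(fc.array(fc.integer()), (xs) => {
        const out = mySort(xs);
        expect(out.length).toBe(xs.length);
        for (let i = 1; i < out.length; i++) {
          expect(out[i - 1]).toBeLessThanOrEqual(out[i]);
        }
      })
    );
  });

  // When the framework finds a failing case, pin it down as a plain unit test:
  test("single-element list survives sorting", () => {
    expect(mySort([7])).toEqual([7]);
  });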


The key to success with that attitude:

>>I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris).

Is the humility in the parenthetical. There is a difference between arrogance and confidence.


Sounds like he'd be a team player and a great all-around guy to work with...

In reality, there's a considerable amount of respect given to people who are great at what they do, but are also humble and without the bloated ego.


In other words "be smart, don't be stupid". Do you really need to write a test for that single expression setter?

But then again, it may be easier to just set a single round goal like 100% for test coverage. Writing that test for the single expression setter won't cost you a lot.


I can't remember the talk, but one of the guys who really pushed agility in programming has even said that 100% test coverage should not be a goal. The talk he gave was more about testing the things that need testing, and he was a bit flabbergasted at the people who ran with the 100% coverage concept as part of it.


I've always viewed the code coverage metric as "what percentage of things that are supposed to be tested are actually tested?" From this perspective, anything less than 100% is a serious failing--it means that you've forgotten to test something you think is supposed to be tested.

Now, the correct response may well be to decide that that particular bit of code shouldn't be tested after all. Sometimes this means adding a /coverage skip if: an outgrabed momerath is hard to reproduce on demand/ for a particularly tricky corner case, but more often it means that you haven't correctly separated your concerns and that particular piece of code actually belongs in a file that's not touched by the test suite.


> Do you really need to write a test for that single expression setter?

I doubt anyone does that. Having weird requirements like "all public methods should have at least one test" is insane. For many code bases that would be an order of magnitude more tests than a requirement of "100% test coverage".


> Writing that test for the single expression setter won't cost you a lot.

Yeah, usually it won't cost anything to the guy who writes the test, but he is actually getting paid. What it costs the business/customer/etc., and whether the value added is worth the cost, is a totally different matter.

I think there is a clear incentive for certain developers to write unnecessary tests - it is non-risky general work that is difficult to fuck up anyway. If you are able to sell test-writing hours, what's the downside?


Difficult to even imagine a project that could fail business-wise because some developer wrote too many easy unit tests. Do you know of such cases?


I know cases where too much focus on tests might be the reason for a startup failure. E.g. instead of trying to find product-market fit, resources are spent building tests, and then funding runs out. Of course it is very difficult to argue whether test writing was the issue or something else, and people have different incentives related to that.


Probably won't fail business-wise, but each test adds incremental cost to the business.


Significantly?


Depends on the business case. I don't think this answer can be generalized either way.

You can lose a business because of heavy costs that add little value; however, it is not easy to argue that one specific cost was the deal-breaker.


Boy it would be great if writing code that added features could also be that safe and difficult to screw up.


When a junior dev tries to refactor said class and accidentally screws around with the behaviour of the setter function, having the tests blow up is handy.


If someone posted that question on SO now it would be insta-downvoted and then removed for being vague.


There are enough people in the industry who are actually paid for writing the tests and discovering the potential failures of mission-critical code, where the tests are fundamentally important.

I've had a small team nicely paid for months only to prove and document that the product my company was to deliver wouldn't fail in some specific scenarios specified by the contract.

Those who don't produce mission-critical code (or believe what they produce is not on the critical path) unsurprisingly see the investment in tests as questionable. Of course, there is always a real danger of doing something "just because it is done" even if there's no real need.


I think possibly you are not quite understanding what Kent meant. Those people you mention are not getting paid for tests, they are being paid to give confidence that some software system works, they use tests to do this, just like Kent does. His point is about delivering working code with a certain level of confidence.

Meaning if tests don't increase your confidence, but are simply put in to tick a box for having a test, you aren't getting paid for that (or if you are getting paid for that, someone's lost sight of what they are trying to achieve).


In my specific case, the one I've mentioned, the tests surely increased confidence, as they were part of the whole process: the results from the tests were used to modify the product in question until it was able to pass all of them. The tests allowed us to proactively "solve the bugs" that would otherwise have produced many more problems if the product had been used in production without them.


I happen to work in a team. So I get paid to write code that works, that other developers can make sense of and hack on as well. That's why I write lots of tests.

If Kent Beck is coding in a private bubble, he can do whatever he wants that makes code work.


This is like saying "I'm paid for code that works, not proper syntax."

But proper syntax is what gets you code that works. You aren't paid directly for it, though. Tests are not a direct path to code that works, but they can be a big help.


The full quote is a bit more nuanced and captures this, but the key to this mantra is properly defining what "works" means.

Code that "works" doesn't just mean it runs/compiles/passes CI/etc. It has to continuously add value. It can do this by running properly and efficiently across a wide variety of likely or infrequent conditions, as well as some exceptional scenarios. It can do this by being written clearly and not adding technical debt. It can do this by being as simple and/or as replaceable as possible. And ultimately, it can even add a final gasp of value by being easily deletable.


And yet programmers want to be held in the same regard as engineers.

https://www.youtube.com/watch?v=0WMWUP5ZHSY


I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it.

Which is, unfortunately, the complete opposite of how TDD was interpreted, especially in its glory days (and in some corners, up until the present day).


This is why I advocate for end to end testing of a whole system's expected behavior to augment a small set of unit tests (the unit tests are for edge cases).

Nobody needs division-by-zero tests if there are already guards in place so that can't happen, but it's quite helpful to have an "A goes in, B should come out" view from a client/user perspective. As long as behavior appears correct to the client and is not exploitable, you're good to go.
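
For example, a black-box sketch in TypeScript (Jest-style runner; the Checkout API here is invented purely for illustration):

  interface Checkout {
    addItem(item: { sku: string; quantity: number; unitPrice: number }): void;
    finalize(): { total: number };
  }

  // Hypothetical factory exposed by the system under test.
  declare function createCheckout(): Checkout;

  // Exercise the system only through its public entry point, the way a client would.
  test("a valid order goes in, a priced invoice comes out", () => {
    const checkout = createCheckout();
    checkout.addItem({ sku: "BOOK", quantity: 2, unitPrice: 10 });
    // Assert only on what a client can observe, not on internal guards.
    expect(checkout.finalize().total).toBe(20);
  });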


I contend that tests need to cover functional and non-functional requirements. Everything else is to some degree optional.

Of course given no formal requirements, all that is possible are tests of the technical implementation. Regression errors will still be inevitable for customers and stakeholders, despite the "programmer" being able to claim his or her tests passed.

We need to find a way to stop kidding ourselves and find a way to test the right thing.


I'm not a big fan of this mentality. Writing enough tests to be bug free just isn't enough. Sure it's bug free today and that's great but will it be bug free tomorrow after a junior dev modifies it?

I'm not advocating testing getters/setters, but not testing because "I don't write those kinds of bugs" can burn the next junior dev who might.

Testing is as much about finding bugs early as it is about making you more agile in the future.


I submit that testing a bunch of trivial things isn't actually going to make it easier to catch bugs. I recommend this article: http://rbcs-us.com/documents/Why-Most-Unit-Testing-is-Waste....


You get paid to make the best technical decision for the company/project/whoever it is that's paying you. You get paid to communicate and to understand when tests are needed and when they are not. Startups will probably have a lot fewer tests than bigger companies; unless you're a security startup or something that needs a solid foundation you can trust.

It's all a trade-off that needs to be communicated to whoever is paying you.


"Different people will have different testing strategies based on this philosophy, but that seems reasonable to me given the immature state of understanding of how tests can best fit into the inner loop of coding. "

The problem with comments like this is that they're too ambiguous. Someone who doesn't want to spend the time to write unit tests will use this ambiguity as a mechanism for justifying their laziness.


You write tests to automate tests you would otherwise have to perform manually. That is the only reason tests exist: to automate the boring task of testing.

That's one problem with the TDD mindset. If you start by looking for things to test, you might come up with unlikely scenarios or cases that don't matter much for your user.


Or even worse, you start writing code for "testability" and it becomes a bloated mess of one- or few-liner functions that are only called in one location by some other function.


"You get paid to write code, not tests" -My boss after telling me I should quit writing tests

I agree with Kent Beck mostly. I would add that tests can also be used to maintain invariants for future changes to increase maintainability. I just hope this quote isn't taken out of context.


It is a stupid, broad statement without proper context.

It really depends on what your project is, what your goals for maintainability are and what programming language you use.

Two things about testing:

- test to confirm your spec

- if you have trouble writing tests, your design is probably flawed


I first write good code. I then write unit tests to protect my code from:

a) Bad team mates.

b) Future developers.

I've been burned one too many times with junior devs making cavalier changes in code they don't understand. Unit tests were THE solution for catching these changes.


Just an fyi about some nuances of TDD that are overlooked based on the 60+ comments I see so far.

Most comments seem to equate:

  "regression tests"=="TDD"
... but it's really...

  "regression tests" is subset of "TDD"
I'm not a practitioner of TDD, but my understanding of its components is:

1) the ergonomics & design of the API you're building by way of writing the tests first. In this sense, the buzzword acronym could have been EDD (Ergonomics Driven Development). Writing the usage of the API first to see how the interface feels to subsequent programmers. Arguably, a lot of incoherent/inconsistent APIs out there could have benefitted from a little TDD (e.g. func1(src, dst) doesn't match func2(dst, src))

2) a sort of specification of behavior by usage examples ... again by writing the tests first. Consider the case of programmers trying to figure out how an unfamiliar function actually works. Let's say a newbie Javascript programmer wants to know how to use .indexOf()[1]. What do many programmers do? They skip all the intro paragraphs and just hit PageDown repeatedly until they get to the section subtitled "EXAMPLES". With TDD, instead of examples being relegated to code comments ("sqrt(64) // should print 8"), it formally encodes the "should print 8" into real syntax that's understood by the automated test tools. (Unit test frameworks typically use a keyword like "Expect()" for this; a short sketch follows at the end of this comment.)

3) an IDE that's "TDD aware" because it creates a quick visual feedback loop (the code that's "red" turns to "green") during initial editing. The TDD "artifacts" can also act as a "dashboard" for subsequent automated builds alerting you that something broke.

So TDD is a "workflow" and from that, you address 3 areas: (1) design (2) documentation (3) quality assurance via regression tests. With that background, the original Stackoverflow question makes more sense: how many "test cases" do I write because it looks like I can get bogged down in the test case phase?!?

[1]https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
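
To make point 2 concrete, here's the sqrt example written as an executable specification (TypeScript, Jest-style syntax; other frameworks spell it Expect() or assert):

  // Instead of a comment saying "sqrt(64) // should print 8", the expectation
  // is encoded as syntax the test runner checks on every build.
  test("sqrt of a perfect square", () => {
    expect(Math.sqrt(64)).toBe(8);
  });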


O: my manager is the person who asked the StackOverflow question that prompted this.


Funny thing is, the most frequent positive result of tests for me hasn't even been mentioned, I don't believe. The biggest benefit I got was an ongoing education about how the program I was writing ACTUALLY functioned - enabling me to correct my assumptions before the shit hit the fan. This isn't quite the same as catching errors, since often you still want the algorithm you wrote as you wrote it, but knowing more about what's really going on gives you a heads-up to avoid future problems, conflicts, etc. Of course, you may program differently. I was always big on asserts back in the day, and two-thirds of my debugging (by instance, not hours) was spent fixing asserts, and thereby learning that some assumption I was making about the program was wrong at least some of the time. Always good to know.
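
A tiny TypeScript sketch of that assert style, using Node's built-in assert module (the discount invariant here is just an illustrative assumption):

  import assert from "node:assert";

  function applyDiscount(price: number, rate: number): number {
    // Encode the assumption explicitly; when it fires during development or testing,
    // you learn how the program actually behaves before the shit hits the fan.
    assert(rate >= 0 && rate <= 1, `discount rate out of range: ${rate}`);
    return price * (1 - rate);
  }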


Tests might help you write better code. To the extent they do, you should use them.

That's like saying "I get paid for functioning software, not writing code."

Yes, you get paid for the output of the act, not the act itself.


Beck elaborates on this point of view in the current issue of Java Magazine: http://bit.ly/2g6YEo2 (loads slowly)


>Indeed, since this answer, 5years ago, some big improvements have been made, but it’s still a great view from a inspiring person

Such as?


Writing tests is like investing. You have to pick the tests that return the most reward. Simply firing shots is wasteful.


I've referenced this stack post a few times too. I feel like it might be easy to take this out of context.


You will know when and why to write tests... the same bug keeps coming up, you spend most of your time manually testing, you are not sure whether this change will break anything, or you are too scared to touch the code.


Interesting comment although I don't feel like the article adds much to it.


That's why you need to pay another guy to find code that breaks :)


And then, because the tester gets paid to write tests, not to understand the product, you will operate with huge blind spots.

[Edit] I erased a cheap shot I took at Kent Beck. I still think the title of the article is stupid, but what the guy actually said is much more nuanced than that.


Why not get paid for both?


The start of the comment is really out of context here. What he wrote about the team case (what most of us are paid for) is this:

When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong.


It is a question-begging comment, and it's not clear what is being said beyond "create some automated tests until you feel good".

Allow me to explain:

Production use of code IS testing (manual, etc). Because it is an observation of the system state.

All of the world is testing. Every system is inherently a quantum mechanical one (ie: the observer is constantly testing the state of various systems to ascertain some level of confidence)

If you are going to test anything... then you should test the Use Cases (i.e., Interactor objects). Don't have Use Case/Interactor objects that encapsulate intent? Well, you'd better understand it, since the world, and therefore software, is all about intent.



