I have tried to evangelize unit testing at every company I've worked at, and most engineers struggle with two things.
The first is getting over the hurdle of trusting that a unit test is good enough; a lot of them only trust end-to-end tests, which are usually very brittle.
The second is, I think, that a lot of them don't know how to systematically break a test down into pieces to validate, e.g. I'll do a test for null, then a separate test for something else _assuming_ not null because I've already written a test for that.
The best way I've been able to get buy-in for unit testing is giving a crash course on a structure that uses one test suite per function under test. This keeps the line count per test much lower and makes each test much easier to understand.
When they're ready I'll give tips on how to get the most out of their tests: boundary value analysis, better mocking, IoC for things like date/time, etc.
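To make that structure concrete, here is a minimal NUnit sketch of what I mean, assuming a made-up `InvoiceCalculator.LateFee` function: one fixture per function under test, each test a few lines, and the current time injected (the IoC-for-date/time part) instead of read from `DateTime.UtcNow`:

```csharp
using System;
using NUnit.Framework;

// Hypothetical production code: takes the current time via a delegate
// instead of calling DateTime.UtcNow directly (IoC for date/time).
public class InvoiceCalculator
{
    private readonly Func<DateTime> _utcNow;
    public InvoiceCalculator(Func<DateTime> utcNow) => _utcNow = utcNow;

    public decimal LateFee(DateTime dueDateUtc, decimal amount)
    {
        if (amount < 0) throw new ArgumentOutOfRangeException(nameof(amount));
        return _utcNow() > dueDateUtc ? amount * 0.05m : 0m;
    }
}

// One test suite (fixture) per function under test.
[TestFixture]
public class InvoiceCalculator_LateFee
{
    private static readonly DateTime Now = new DateTime(2024, 1, 15, 0, 0, 0, DateTimeKind.Utc);
    private readonly InvoiceCalculator _sut = new InvoiceCalculator(() => Now);

    [Test]
    public void Charges_five_percent_when_past_due() =>
        Assert.That(_sut.LateFee(Now.AddDays(-1), 100m), Is.EqualTo(5m));

    [Test]
    public void Charges_nothing_when_not_yet_due() =>
        Assert.That(_sut.LateFee(Now.AddDays(1), 100m), Is.EqualTo(0m));

    [Test]
    public void Rejects_negative_amounts() =>
        Assert.That(() => _sut.LateFee(Now, -1m), Throws.TypeOf<ArgumentOutOfRangeException>());
}
```

Because the clock is a constructor argument, the overdue/not-yet-due boundary can be pinned exactly, which is where boundary value analysis comes in.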
I've evangelized against unit testing at most companies I've worked at, except in one specific circumstance: complex logic in stateless code behind a stable API, where unit testing is fine. I find this usually represents between 5% and 30% of most code bases.
The idea that unit testing should be the default go-to test is one I find horrifying.
I find that unit test believers struggle with the following:
1) The idea that test realism might actually matter more than test speed.
2) The idea that if the code is "hard to unit test", it is not necessarily better for the code to adapt to the unit test. In general it's less risky to adapt the test to the code than it is to adapt the code to the test (i.e. by introducing DI). It seems to be tied up with some sort of idea that unit testability/DI just makes code inherently better.
3) The idea that integration tests are naturally flaky. They're not. Flakiness is caused by inadequate control over the environment and/or non-deterministic code. Both are fixable if you have the engineering chops.
4) The idea that test distributions should conform to arbitrary shapes for reasons that boil down to "Google considered integration tests to be naturally flaky".
5) Dogma (e.g. Uncle Bob's or Rainsberger's advice) vs. the idea that tests are an investment that should pay dividends, and should be designed according to the projected payoff rather than to fit some kind of "ideal".
> The idea that unit testing should be the default go-to test is one I find horrifying.
Kent Beck, who invented the term unit test, was quite clear that a unit test is a test that exists independent of other tests. In practice, this means that a unit test won't break other tests.
I am not sure why you would want anything other than unit tests? Surely everyone agrees that one test being able to break another test is a bad practice that will turn your life into a nightmare?
I expect we find all of these nonsensical definitions for unit testing appearing these days because nobody is writing anything other than unit tests anymore, and therefore the term has lost all meaning. Maybe it's simply time to just drop it from our lexicon instead of desperately grasping at straws to redefine it?
> It seems to be tied up with some sort of idea that unit testability/DI just makes code inherently better.
DI does not make testing or code better if used without purpose (and will probably make it worse), but in my experience, when a test will genuinely benefit from DI, so too will the actual code down the line as requirements change. Testing can be a pretty good place to discover where DI is likely to benefit your codebase.
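A hypothetical illustration of that (all names invented): a test that wants to observe what was sent forces a notification seam into existence, and that same seam is what you reach for later when email becomes SMS or a message queue.

```csharp
using System.Collections.Generic;
using NUnit.Framework;

// The seam the test asked for.
public interface INotifier
{
    void Notify(string recipient, string message);
}

public class OrderService
{
    private readonly INotifier _notifier;
    public OrderService(INotifier notifier) => _notifier = notifier;

    public void Ship(string customerEmail, string orderId) =>
        _notifier.Notify(customerEmail, $"Order {orderId} has shipped.");
}

[TestFixture]
public class OrderService_Ship
{
    // Trivial hand-rolled fake that records what was sent.
    private class RecordingNotifier : INotifier
    {
        public List<(string Recipient, string Message)> Sent { get; } = new();
        public void Notify(string recipient, string message) => Sent.Add((recipient, message));
    }

    [Test]
    public void Notifies_the_customer()
    {
        var notifier = new RecordingNotifier();
        new OrderService(notifier).Ship("a@example.com", "PO-7");

        Assert.That(notifier.Sent, Has.Count.EqualTo(1));
        Assert.That(notifier.Sent[0].Recipient, Is.EqualTo("a@example.com"));
    }
}
```

The same `INotifier` seam is the one the production code needs when the delivery channel changes, which is the "pays off down the line" part.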
> The idea that test realism might actually matter more than test speed.
Beck has also been abundantly clear that unit tests should not resort to mocking, or similar, to the greatest extent that is reasonable (testing for hardware failure might be a place to simulate the failure condition rather than actually damaging your hardware). "Realism" is inherent in unit tests. Whatever it is you are talking about, it is certainly not unit testing.
It seems it isn't anything... other than yet another contrived attempt to try and find new life for the term that really should just go out to pasture. It served its purpose of rallying developers around the idea of individual tests being independent of each other – something that wasn't always a given. But I think we're all on the same page now.
> Kent Beck, who invented the term unit test, was quite clear that a unit test is a test that exists independent of other tests
Kent Beck didn't invent the term "unit test"; it's been in use since at least the 1970s.
> I am not sure why you would want anything other than unit tests?
The reason is to produce higher quality code than if you rely on unit tests only. Generally, unit tests catch a minority of bugs, other tests like end to end testing help catch the remainder.
> other tests like end to end testing help catch the remainder.
End-to-end tests are unit tests, generally speaking. Something end-to-end can be captured within a unit. The divide you are trying to invent doesn't exist, and, frankly, is nonsensical.
> End-to-end tests are unit tests, generally speaking.
Generally, in the software industry, those terms are not considered the same thing; they are at opposite ends of a spectrum. Unit tests cover more isolated/individual functionality, while an end-to-end test covers an entire business flow.
Here's an example of one end to end test (with validations happening at each step):
1-System A sends Inventory availability to system B
2-The purchasing dept enters a PO into system B
3-System B sends the PO to system A
4-System A assigns the PO to a Distribution Center for fulfillment
5-System A fulfills the order
6-System A sends the ASN and Invoice to system B
7-System B users process the PO receipt
8-System B users perform three way match on PO, Receipt and Invoice documents
Bad example, perhaps, but that's also a unit test[1]. Step 8 is dependent on the state of step 1, and everything else in between, so it cannot be reduced any further (at least not without doing stupid things). That is your minimum viable unit; the individual, isolated functionality.
[1] At least so long as you don't do something that couples it with other tests, like modifying a shared database in a way that will leave another test in an unpredictable state. But I think we have all come to agree that you should never do that – going back to the reality that the term unit test serves no purpose anymore. For all intents and purposes, all tests now written are unit tests.
Every step updates shared databases (frequently plural). In the case of the fulfillment step, the following systems+databases were involved: ERP, WMS, Shipping.
Typically, in end to end testing, tests are run within the same shared QA system and are semi-isolated based on choice of specific data (e.g. customers, products, orders, vendors, etc.). If this test causes a different test to fail, or vice-versa, then you have found a bug.
If we call that entire sequence of steps a "unit" test, would you start with testing the entire sequence of steps, or would you recommend testing the individual steps first?
And if we did test the individual steps first, we would give that testing a different name? Like maybe "sub-unit" testing?
> Every step updates shared databases (frequently plural).
That's fine. It all happens within a single unit. A unit should mutate shared state within the unit. Testing would be pretty much useless without it.
> If we call that entire sequence of steps a "unit" test, would you start with testing the entire sequence of steps, or would you recommend testing the individual steps first?
For all intents and purposes, you can't test the individual steps. All subsequent steps are dependent on the change in inventory state in step 1. And the product of step one is undoubtedly internal state, so there is no way for the test to observe the state change in isolation (unless you do something stupid). You have to carry out the subsequent steps to be able to infer that the inventory was, in fact, updated appropriately.
After all, the whole reason you are testing those steps together is because you recognize that they represent a single instance of functionality. You don't really get to choose (unless you choose to do something stupid, I suppose).
> And if we did test the individual steps first, we would give that testing a different name?
If the individual steps can be tested individually (ignoring a case of you doing something stupid), it's not actually an end-to-end process, so your example would make no sense. Granted, we have already questioned if it is a bad example.
> For all intents and purposes, you can't test the individual steps.
Sure you can, and we did. That is a real example of an end-to-end test from a recent project, which also included testing the individual steps in isolation, preceded by testing the individual sub-steps/components of each step (the portion that is typically considered unit testing).
For example, step 1 is broken down into the following sub-steps which are all tested in isolation before testing the combined group together:
1.1-Calculate the current on hand inventory from all locations for all products
1.2-Calculate the current in transit inventory for all locations for all products
1.3-Calculate the current open inventory reservations by business partner and products
1.4-Calculate the current in process fulfillments by business partner and product
1.5-Resolve the configurable inventory feed rules for each business partner and product (or product group)
1.6-Using the data in 1.1 through 1.5, resolve the final available qty for each business partner and product
1.7-Construct system specific messages for each system and/or business partner (in some cases it's a one to one between business partner and system, but in other cases one system manages many business partners).
1.7.1-Send to system B
1.7.2-Send to system C
1.7.3-Send to system D
1.7.N-etc.
> And the product of step one is undoubtedly internal state, so there is no way for the test to observe the state change in isolation
The result of step 1 is that over in software system B (an entirely different application from system A) the inventory availability for each product from system A is properly represented in the system. Meaning queries, inquiries, reports, application functions (e.g. Inventory Availability by Partner), etc. all present the proper quantities.
To validate this step, it can be handled one of two ways:
1-Some sort of automated query that extracts data from system B and compares to the intended state from step 1 (probably by saving that data at the end of that step).
or
2-A user manually logs in to system B and compares to the expected values from step 1 (again saved or exposed in some way). This method works when the number of products is purposefully kept to a small number for testing purposes.
> If the individual steps can be tested individually (ignoring a case of you doing something stupid), it's not actually an end-to-end process, so your example would make no sense. Granted, we have already questioned if it is a bad example.
Yes, the individual steps can be tested individually. Yes, it is an end-to-end test.
> Granted, we have already questioned if it is a bad example.
It's a real example from a real project and it aligns with the general notion of an end to end test used in the industry.
More importantly, combined with the unit tests, functional tests, integration tests, performance tests, other end to end tests and finally user acceptance tests, it contributed to a successful go-live with very few bugs or design issues.
I don't know many people who would describe a test that uses Playwright and hits a database as a unit test just because it is self-contained. If Kent Beck does, then he has a highly personalized definition of the term that conflicts with its common usage.
The most common usage is, I think, an xUnit-style test that interacts with an app's code API and mocks out, at a minimum, interactions with systems external to the app under test (e.g. database, API calls).
He may have coined the term but that does not mean he owns it. If I were him, I'd pick a different name for his idiosyncratic meaning than "unit test", one that isn't already overburdened with too much baggage.
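For what it's worth, a minimal sketch of a test in that common usage, with a hand-rolled fake standing in for the database boundary (the `CustomerService` and `ICustomerRepository` types are invented for the example):

```csharp
using System.Collections.Generic;
using NUnit.Framework;

public interface ICustomerRepository
{
    string? FindEmail(int customerId);   // would hit the database in production
}

public class CustomerService
{
    private readonly ICustomerRepository _repo;
    public CustomerService(ICustomerRepository repo) => _repo = repo;

    public string EmailOrDefault(int customerId) =>
        _repo.FindEmail(customerId) ?? "unknown@example.com";
}

[TestFixture]
public class CustomerService_EmailOrDefault
{
    // In-memory fake replacing the external system (the database).
    private class FakeRepository : ICustomerRepository
    {
        public Dictionary<int, string> Rows { get; } = new();
        public string? FindEmail(int id) => Rows.TryGetValue(id, out var email) ? email : null;
    }

    [Test]
    public void Falls_back_when_the_customer_is_missing()
    {
        var service = new CustomerService(new FakeRepository());
        Assert.That(service.EmailOrDefault(42), Is.EqualTo("unknown@example.com"));
    }
}
```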
> He may have coined the term but that does not mean he owns it.
Certainly not, but there is no redefinition that is anything more than gobbledygook. Look at the very definition you gave: That's not a unique or different way to write tests. It's not even a testing pattern in concept. That's just programming in general. It is not, for example, unusual for you to use an alternative database implementation (e.g. an in-memory database) during development where it is a suitable technical solution to a technical problem, even outside of an automated test environment. To frame it as some special unique kind of test is nonsensical.
If we can find a useful definition, by all means, but otherwise what's the point? There is no reason to desperately try to save it with meaningless words just because it is catchy.
The definition I gave is the one people use. Hate it or love it, you're not going to change it to encompass end-to-end tests, and neither will Kent Beck. It's too embedded.
I might. I once called attention to the then-prevailing definition of "microservices" also not saying anything. At the time I was treated like I had two heads, but sure enough, I now see a sizeable portion (not all, yet...) of developers using the updated definition I suggested, one that actually communicates something. Word gets around.
Granted, in that case there was a better definition for people to latch onto. In this case, I see no use for the term 'unit test' at all. Practically speaking, all tests people write today are unit tests. 'Unit' adds no additional information that isn't already implied in 'test' alone and I cannot find anything within the realm of testing that needs additional differentiation not already captured by another term.
If nothing changes, so what? I couldn't care less about what someone else thinks. Calling attention to people parroting terms that are meaningless is entirely for my own amusement, not some bizarre effort to try and change someone else. That would be plain weird.
Well, I don't regard unit tests as the one true way. I don't force people on my team to do it my way. When I get compliments on my work, I tend to elaborate and spread my approach. That's what I mean by evangelize, not necessarily advocating for specific criteria to be met.
I find that integration tests are usually flaky; that's my personal experience. In fact, at my company, we just decided to turn them off completely because they fail for many reasons and the usual fix is to adjust the test. If you have had a lot of success with them, great. Just for the record, I am not anti-integration or anti-end-to-end test. I think they have a place, and just as unit tests shouldn't be the default, neither should they.
Here are the two most common scenarios where I find integration tests (usually end-to-end tests that get called integration tests) become flaky:
1) DateTime: some part of the business logic relies on the current date or time and it wasn't accounted for.
2) Data: it changed, got deleted, expired, etc., and the test did not first create everything it needed before running (see the sketch below).
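For that second scenario, this is the shape of the fix I aim for: the test creates (and removes) its own uniquely named data through a seam over the QA environment. The `IQaEnvironment` interface and its in-memory stand-in are invented so the snippet compiles and runs; a real implementation would call the actual systems.

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

// Invented seam over the shared QA environment. The in-memory stand-in exists
// only so this sketch runs; a real implementation would call the systems' APIs.
public interface IQaEnvironment
{
    string CreateVendor(string name);
    string EnterPurchaseOrder(string vendorId, string sku, int qty);
    string GetStatus(string purchaseOrderId);
    void DeleteVendorCascade(string vendorId);
}

public class InMemoryQaEnvironment : IQaEnvironment
{
    private readonly Dictionary<string, string> _poStatus = new();

    public string CreateVendor(string name) => name;
    public string EnterPurchaseOrder(string vendorId, string sku, int qty)
    {
        var id = Guid.NewGuid().ToString("N");
        _poStatus[id] = "Fulfilled";   // stand-in: pretend fulfillment succeeded
        return id;
    }
    public string GetStatus(string purchaseOrderId) => _poStatus[purchaseOrderId];
    public void DeleteVendorCascade(string vendorId) { /* nothing shared to clean up here */ }
}

[TestFixture]
public class PurchaseOrderFlowTests
{
    private IQaEnvironment _env = null!;
    private string _vendorId = null!;

    [SetUp]
    public void CreateOwnData()
    {
        _env = new InMemoryQaEnvironment();
        // Create everything the test needs, with unique names, instead of
        // relying on whatever data happens to exist in the shared QA system.
        _vendorId = _env.CreateVendor($"e2e-{Guid.NewGuid():N}");
    }

    [Test]
    public void Purchase_order_is_fulfilled()
    {
        var poId = _env.EnterPurchaseOrder(_vendorId, sku: "e2e-widget", qty: 2);
        Assert.That(_env.GetStatus(poId), Is.EqualTo("Fulfilled"));
    }

    [TearDown]
    public void RemoveOwnData() => _env.DeleteVendorCascade(_vendorId);
}
```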
Regarding your points,
1) "realism" that is what I referred to as trusting that a unit test is good enough. If it didn't go all the way to the database and back did it test your system? In my personal work, I find that pulling the data from a database and supplying it with a mock are the same thing. So it's not only real enough for me, but better because I can simulate all kinds of scenarios that wouldn't be possible in true end-to-end tests.
2) These days the only code that's hard to test comes from people who are strictly enforcing OOP. Like any approach in programming, it has its pros and cons. I rarely go down that route, so testing isn't usually difficult for me.
3) It's just been my personal experience. Like I said, I'm not anti-integration tests, but I don't write very many of them.
4) I didn't refer to Google, just my personal industry experience.
5) Enforcing an ideal is a waste of time in programming. People only care about what they see when it ships. I just ship better-quality code when I unit test my business logic. Some engineers benefit from it, some harm themselves in confusion; not much I can do about it.
Most of this is my personal experience; no knock against anyone, and I don't force my ideals on anybody. I happily share what works for me and why. I gradually introduce my own learning over time as I am asked questions, and don't seek to enforce anything.
> I'll do a test for null, then a separate test for something else _assuming_ not null because I've already written a test for that.
Honestly, this pedantry around "unit tests must only test one thing" is counter-productive. Just test as many things as you can at once; it's fine. Most tests should not be failing. Yes, it's slightly less annoying to get 2 failed tests instead of 1 fail that you fix and then another fail from that same test. But it's way more annoying to have to duplicate entire test setups to have one that checks null and another that checks even numbers and another that checks odd numbers and another that checks near-overflow numbers, etc. The latter will result in people resisting writing unit tests at all, which is exactly what you've found.
If people are resisting writing unit tests, make writing unit tests easier. Those silly rules do the opposite.
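One way to get both: parameterized cases share a single setup, so null, even, odd, and near-overflow inputs don't each need their own copy-pasted fixture. A minimal NUnit sketch (the `Parity.Classify` function is made up for the example):

```csharp
using NUnit.Framework;

[TestFixture]
public class ParityTests
{
    // One shared setup, many inputs: no duplicated fixture per scenario.
    [TestCase(null, ExpectedResult = "unknown")]
    [TestCase(2, ExpectedResult = "even")]
    [TestCase(3, ExpectedResult = "odd")]
    [TestCase(int.MaxValue, ExpectedResult = "odd")]
    public string Classifies_numbers(int? value) => Parity.Classify(value);
}

// Hypothetical function under test.
public static class Parity
{
    public static string Classify(int? value) =>
        value is null ? "unknown" : value % 2 == 0 ? "even" : "odd";
}
```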
Just to clarify, I am not advocating for tests to only test one thing; rather, once you have tested one scenario, you don't need to rehash it again in another test.
Breaking a test down helps clarify what you're testing and helps prevent 80-line unit tests. When I test multiple things, I look for the equivalent of NUnit's Assert.Multiple in whatever language I'm working in.
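For example, in NUnit it looks like this (the `OrderMapper` type is invented for illustration); every assertion inside the block gets reported, rather than the test stopping at the first failure:

```csharp
using System.Globalization;
using NUnit.Framework;

public record Order(int Id, string Customer, decimal Total);

// Hypothetical mapper under test.
public static class OrderMapper
{
    public static Order FromCsv(string line)
    {
        var parts = line.Split(',');
        return new Order(
            int.Parse(parts[0], CultureInfo.InvariantCulture),
            parts[1],
            decimal.Parse(parts[2], CultureInfo.InvariantCulture));
    }
}

[TestFixture]
public class OrderMapper_FromCsv
{
    [Test]
    public void Maps_all_fields()
    {
        var order = OrderMapper.FromCsv("42,alice,19.99");

        // Related checks grouped in one test, all reported together.
        Assert.Multiple(() =>
        {
            Assert.That(order.Id, Is.EqualTo(42));
            Assert.That(order.Customer, Is.EqualTo("alice"));
            Assert.That(order.Total, Is.EqualTo(19.99m));
        });
    }
}
```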
The approach I advocate for typically simplifies testing multiple scenarios with clear objectives, and tends to make it easier when it comes time to refactor, fix, or just delete a no-longer-needed unit test. The difference, I find, is that now you know why, versus having to figure out why.