Software testing practices are still pretty controversial.
I spent about a week redesigning the interfaces of a security-sensitive system to make it testable and developing a test harness; the actual implementation was about a day of work. I felt it was a great use of time.
I have not found an answer for testing React components that seems like a good use of resources. I've tried numerous test libraries, from Enzyme to react-testing-library, and found them all to be terribly flawed. (Sorry, if you think "I shouldn't have to modify my code to write tests" and then tell me to "use data-testids", that's the end of the conversation. I have an hour to write a test; I don't have two weeks to diagnose how a third-party library I use interacts with the Rube Goldberg machine react-testing-library uses to emulate ARIA support.)
Code that is basically "functional" and doesn't interact with a large mysterious system is straightforward to test; a lot of UI code has the issue that it not only calls a mysterious large and complex system but also gets called by a mysterious large and complex system. I'd really love to have a system that would give me a heads up that some CSS I changed has a spooky action-at-a-distance effect on some other part of the UI because of the complexity of selectors, but that's asking a lot.
https://understandlegacycode.com/blog/key-points-of-working-...
makes the strong claim that tests have to be really fast because you should have hundreds or thousands of them that run on every build. Feathers makes the best intellectual case for testing when testing is difficult that I've ever seen, but some of the tests I'm proudest of could never run quickly: I built a "Super Hammer" test that ran a race-condition-prone system with 1000 threads for about 40 seconds, and it wouldn't have been effective if it was dramatically shorter. You might do 10 builds a day, so you could easily do 2500 builds a year, and that adds almost 28 hours of waiting a year. That's just one test, but it costs $2800 a year for a fully loaded FTE who costs $100 an hour. It probably costs more than that, because the long build sets idle hands in motion doing the devil's work and probably wastes several times more time than the wait itself.
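For what it's worth, the arithmetic holds up. A quick sketch of it, using only the figures above (the function and field names are mine, purely illustrative):

```typescript
// Yearly cost of waiting on one slow test.
// Names here are hypothetical; the inputs are the figures quoted in the comment.
interface SlowTestCost {
  readonly hoursPerYear: number;
  readonly dollarsPerYear: number;
}

function slowTestCost(
  testSeconds: number,
  buildsPerYear: number,
  hourlyRate: number,
): SlowTestCost {
  // Total seconds spent waiting, converted to hours.
  const hoursPerYear = (testSeconds * buildsPerYear) / 3600;
  return { hoursPerYear, dollarsPerYear: hoursPerYear * hourlyRate };
}

// 40-second test, 2500 builds a year, $100/hour fully loaded FTE.
const cost = slowTestCost(40, 2500, 100);
// cost.hoursPerYear is roughly 27.8 and cost.dollarsPerYear roughly 2778,
// which matches "almost 28 hours" and "$2800" after rounding.
```

This only counts the direct wait for one person per build; it says nothing about the context-switching cost, which is the part that is hard to put a number on.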
Wow. Ignoring the proximate cause of leaving that code in there, imagine being the poor schmuck that forgets to deploy a code update to 1 of 8 servers, causing $440 million in damages, basically destroying a company overnight. It's so far outside of comprehension at that scale.
The reason is that most potential customers will come back later, and that's assuming that the system that went down even had any real money attached to it.
I see a lot of multi-region, multi-zone architectures deployed for government websites that get maybe a couple of unique users per day. These could be down for many hours and chances are that no human would notice.
> I have not found an answer for testing React components that seems like a good use of resources
> …
> I'd really love to have a system that would give me a heads up that some CSS I changed has a spooky effect on a distance on part of the UI because of the complexity of selectors but that's asking a lot
I would’ve agreed until recently. I always found basically all other forms of testing valuable (unit tests of almost everything on the BE, unit tests of FE business logic, BE integration tests, E2E tests), but not testing of the visual elements of the FE.
But about 6 months ago the company I work at gave this product a try, and honestly it’s pretty incredible: https://www.meticulous.ai/
They basically record sessions of real usage in our staging environment, and then replay them against our branches, repeating all the same interactions and mocking the responses to all network calls. It records tonnes of these sessions and is very smart about which ones it uses for a given change. It flags any visual differences, and you can OK them (or not). There’s a bit of work to initially integrate, but then you don’t write any tests, and you get pretty amazing coverage. It has the odd false positive, but not many, and they’re easy to review/approve in their web UI. They’re also a small startup willing to work super closely with you (we share a Slack chat with them, and they’re very open to feedback and iterating quickly on it).
I swear I’m not a paid shill or affiliated with them in any way, just a user who really loves the product. I was skeptical it’d work well at first, but it’s honestly been great: it has caught many potential regressions, and I feel we’re getting much better coverage than we would with handwritten UI tests. It’s very worth a look IMO if you’re not satisfied with your visual tests. It’s not an E2E testing tool, because the network requests are recorded/replayed (so it can’t catch BE changes that break the FE), but it’s amazing for testing so many elements of the FE.
Hm. Just thinking out loud. TLDR is that I think the core of the solution to good testing of React components would look like treating them the way the Web Component model treats components?
So one thing that I keep coming back to as kind of a "baseline" is called Functional Core, Imperative Shell. It's a little hard to explain in a short space like this, and the presentations available on YouTube and blog articles are a bit confusing, but basically it asks for your application to be broken into three parts: (1) modules in the "functional core" define immutable data structures and purely deterministic transformations between them; (2) modules in imperative libraries each "do one thing and do it well"; they might call the SaveUser API or something like that; and (3) these are held together by a thin shell of glue code that needs to have no real logic (that's for the functional core) and needs to not do any real operations directly (that's for the libraries). And the reason to break the app apart like this is that it's test-centric: modules in (1) are unit-tested without mocks, by creating immutable data structures, feeding them to the transforms, and checking the output; modules in (2) will init a real connection to a dev server upstream and are integration-tested by sending the real request to the real server; and the shell in (3) is basically so simple that it requires one end-to-end test that just makes sure that everything compiled together all right and can initialize at runtime. So this is what software looks like if you elevate Testing to be the One Core Pillar of the application, and demand "no mocks!" as part of that.
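To make the three layers a bit more concrete, here's a minimal TypeScript sketch. Everything in it is made up for illustration: the `User` shape, `addToGroup`, and `grantGroup` are hypothetical, and the `SaveUser` type is just my guess at what a function wrapping "the SaveUser API" might look like.

```typescript
// (1) Functional core: immutable data plus pure, deterministic transforms.
interface User {
  readonly name: string;
  readonly groups: readonly string[];
}

function addToGroup(user: User, group: string): User {
  if (user.groups.includes(group)) return user; // already a member: no change
  return { ...user, groups: [...user.groups, group] };
}

// (2) Imperative library: does one thing and contains no logic.
// Assumed signature; the real implementation would send a request to a
// dev server and be integration-tested against it.
type SaveUser = (user: User) => Promise<void>;

// (3) Shell: glue only. It decides nothing and performs no I/O itself;
// it just wires the core to the library.
async function grantGroup(
  saveUser: SaveUser,
  user: User,
  group: string,
): Promise<User> {
  const updated = addToGroup(user, group);
  await saveUser(updated);
  return updated;
}

// Unit-testing the core needs no mocks: build data, transform, inspect.
const alice: User = { name: "alice", groups: ["readers"] };
const promoted = addToGroup(alice, "editors");
// promoted.groups is ["readers", "editors"]; alice is left untouched.
```

The point of the split is that the only code with branches in it (`addToGroup`) is also the only code that can be tested by plain input/output comparison.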
Q1, how do we apply this to a React app? Well, if you think in terms of UI, you can kind of think of an entire view as being kind of (3) as long as you aggressively oversimplify the behaviors: so you click on some part of some view and it dispatches some ButtonClicked data structure into some view-wide event queue, but it does no logic of its own. Reminds me of Redux, also reminds me a lot of Lit and web components where they just kind of emit CustomEvents but aren't supposed to do anything themselves.
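A rough sketch of what "dispatches a ButtonClicked data structure and does no logic of its own" could look like. Only the `ButtonClicked` name comes from the paragraph above; `EventQueue`, `TabSelected`, `ViewState`, and `reduceView` are hypothetical names I picked for the example.

```typescript
// Events are plain immutable data; the view only ever produces these.
type ViewEvent =
  | { readonly kind: "ButtonClicked"; readonly buttonId: string }
  | { readonly kind: "TabSelected"; readonly index: number };

// A view-wide queue. The view pushes events in; something else drains them.
class EventQueue {
  private readonly events: ViewEvent[] = [];

  dispatch(event: ViewEvent): void {
    this.events.push(event);
  }

  // Returns all pending events and empties the queue.
  drain(): ViewEvent[] {
    return this.events.splice(0, this.events.length);
  }
}

// What the view attaches to a button: wrap the click in data, nothing more.
function makeClickHandler(queue: EventQueue, buttonId: string): () => void {
  return () => queue.dispatch({ kind: "ButtonClicked", buttonId });
}

// The actual behavior lives in a pure function over events.
interface ViewState {
  readonly clicks: number;
  readonly activeTab: number;
}

function reduceView(state: ViewState, event: ViewEvent): ViewState {
  switch (event.kind) {
    case "ButtonClicked":
      return { ...state, clicks: state.clicks + 1 };
    case "TabSelected":
      return { ...state, activeTab: event.index };
  }
}
```

In a React component the handler would presumably be passed as something like `onClick={makeClickHandler(queue, "save")}`, which leaves the component itself with nothing worth unit-testing; the tests go on `reduceView`.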
Q2, how do components fit in? We have to be a little more careful there. You're talking about a modular architecture though, real thick components. Componentizing takes us a step back from that ideal, right? It says "I don't want this to look like a single unified whole that is all tested at once, I want this to look like a composition of subsystems that are reusable and tested independently."
A simple example might be a tabbed view or accordion control, I want to coax the viewer through these N different steps, the previous step needs to be complete and then you can go to the next one. And I want the components to be each of these views. (The actual tab view or accordion view is of course another component, but it's a "thin component" in the above sense, it doesn't actually have any imperative library and the logic is relatively trivial, it doesn't generate the sorts of questions you're asking about.)
So just to roll up a random mental example, one of these tabs is some PermissionsEditor component, once you initialize it, it has everything it needs inside the component to fetch permissions from the API, fetch the current user, see what permissions the current user is allowed to grant to other users (or themselves?)... but the other tabs need to be dynamically responsive, once you add yourself to the group that can edit Flotsam and Jetsam, going to the Flotsam tab the "Edit" button should no longer be grayed out etc.
Then I think the proper way to view these thicker components, is as being inserted at level (2) into the main application? So the main application just treats them as imperative libraries, "I will give you a div and call your init_permissions_editor function with that div and you render into there. I will give you a channel to communicate events to me on. You will provide me the defs of the immutable data structures you'll send down that channel, I will provide deterministic transformations of those events into other events that I need to do."
With some caveats, yeah, I'd basically say this is the web-component model. Your external application just integration-tests that init_permissions_editor will render _something_ into the blank <div> it is given. Your PermissionsEditor component is responsible for integration-testing that it can create a permission, add a user to a group, all of that, and is responsible for testing that it emits certain events when these things happen.
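Here's roughly the contract I have in mind, as a TypeScript sketch. `init_permissions_editor` is the name from above; everything else (`PermissionsEvent`, `Channel`, `RenderTarget`, the returned handle, `toAppEvents`) is an assumption made up for the example. `RenderTarget` is a structural stand-in for the div so the sketch runs without a DOM; a real `HTMLDivElement` would satisfy it.

```typescript
// Immutable event definitions the component publishes to its host.
type PermissionsEvent =
  | { readonly kind: "UserAddedToGroup"; readonly user: string; readonly group: string }
  | { readonly kind: "PermissionCreated"; readonly permission: string };

// The channel the host hands to the component.
type Channel<E> = (event: E) => void;

// Minimal stand-in for the <div> the host provides (hypothetical).
interface RenderTarget {
  innerHTML: string;
}

// Handle standing in for user interaction with the rendered UI (hypothetical).
interface PermissionsEditorHandle {
  addUserToGroup(user: string, group: string): void;
}

function init_permissions_editor(
  target: RenderTarget,
  channel: Channel<PermissionsEvent>,
): PermissionsEditorHandle {
  target.innerHTML = '<section class="permissions-editor"></section>';
  return {
    addUserToGroup(user: string, group: string): void {
      // A real component would call its own API library here first.
      channel({ kind: "UserAddedToGroup", user, group });
    },
  };
}

// Host side: a deterministic transformation of component events
// into the events the rest of the app cares about.
type AppEvent = { readonly kind: "EditEnabled"; readonly group: string };

function toAppEvents(event: PermissionsEvent, currentUser: string): AppEvent[] {
  if (event.kind === "UserAddedToGroup" && event.user === currentUser) {
    return [{ kind: "EditEnabled", group: event.group }];
  }
  return [];
}
```

The host's integration test then only checks that `target.innerHTML` is non-empty after init, and `toAppEvents` gets ordinary unit tests. Whether the component emits the right events is the component's own problem to test.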