Test coverage is near worthless as a metric. I'm trying to write programs with countless possible states consisting of thousands of attributes, and gain high confidence my program performs correctly in every single one. Coverage says I should focus on just one of them: the program counter. I don't know about you, but there's a lot more to my program than the address of the instruction currently executing. If I keep a list of the streets I've driven on, that might give me a list of new places to visit, but it's not going to tell me how well my car's working.
So, if not coverage, what should we use? I like mutation analysis. How do you know if your image recognition algorithm works? You run it on new images. How do you tell if your tests are catching bugs? You add bugs and see if it catches them. It's simpler than coverage in some ways -- you need no instrumentation.
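Here's a minimal, hand-rolled sketch of the idea (the function names and the seeded bug are my own invention; real tools like mutmut or PIT generate and run mutants automatically). The tests fully cover the original function, yet the mutant survives, which is exactly what coverage can't tell you:

```python
def is_adult(age):
    return age >= 18

# One "mutant": the same code with a single bug seeded (>= became >).
def is_adult_mutant(age):
    return age > 18

def suite_passes(f):
    """Run the test suite against implementation f."""
    try:
        assert f(30) is True
        assert f(5) is False
        return True
    except AssertionError:
        return False

assert suite_passes(is_adult)        # real code: suite passes, 100% coverage
if suite_passes(is_adult_mutant):    # mutant: suite *also* passes
    print("mutant survived: no test pins down the boundary at age 18")
```

A surviving mutant points at a concrete missing test (here, the boundary case `is_adult(18)`), which is far more actionable than a coverage percentage.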
And yet somehow, every test infrastructure can measure coverage, with mutation analysis nowhere to be found. We have a huge literature on testing (mutation analysis is over 40 (!) years old), and yet developers simply choose to ignore it.