I'm not sure. Yes, we can (to some degree) independently test parts, but like I said, each part requires a significant portion of the environment to be up (or simulated). And "unit" testing (that is, testing an individual routine or module) doesn't really make sense given how the code is written: receive a message via SS7 (the network stack I mentioned) and convert it to an IP-based message. To test the portion that talks to the telephony network requires a telephony network (very hard to mock out, lord knows I would love to) plus another major unit we wrote (which is itself another part I test) just to be testable at all.
And to test that other part? Well, it requires that I mock out the previous unit (or run it), plus three other parts (one involving a cell phone, which is really a simple script at this point). And again, it doesn't really make sense to test individual routines, because this part takes the translated IP packets from the SS7 module and makes several queries to other IP-based services. So a lot of what's going on is just simple translations (in a multithreaded/multiprocessor environment, for extra fun!).
I went to an Agile class where the lecturer compared unit tests to double-entry bookkeeping. An accountant doesn't say "oh, I don't need to add up both columns here, I know it's just trivial addition".
Once I got in the habit of writing tests for even the simplest transformations, the code complexity and my test complexity grew at the same rate, so it's much harder to end up with a giant untestable mass.
I once spent over a month tracking down a bug (in a different project than the one I mentioned above) that I have a hard time seeing how unit testing would have caught. The program was a simple process (no threads, no multiprocessing) that, depending on which system it ran on, would crash with a segfault. The resulting core files were useless, as each crash was in a different location.
It turned out I was calling a non-re-entrant function (indirectly) in a signal handler (so technically it was multithreaded) and the crash really depended on one function being interrupted at just the right location by the right signal. That's why it took a month of staring at the code before I found the issue. Individually, every function worked exactly as designed. Together, they all worked except in one odd-ball edge case that varied from system to system (on my development system, the program could run for days before a crash; on the production system it would crash after a few hours). The fix was straightforward once the bug was found, but finding it was a pain.
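For anyone who hasn't been bitten by this class of bug, here's a minimal sketch of the pattern (invented names, not the actual code; any non-async-signal-safe call will do):

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Looks harmless, but malloc/free and stdio are not
       async-signal-safe: if the signal arrives while main()
       is inside malloc, the heap can be corrupted, and the
       eventual crash lands somewhere else entirely. */
    static void on_alarm(int sig)
    {
        (void)sig;
        char *msg = malloc(64);      /* undefined behavior here */
        if (msg) {
            snprintf(msg, 64, "tick");
            free(msg);
        }
        alarm(1);                    /* re-arm (this one IS safe) */
    }

    int main(void)
    {
        signal(SIGALRM, on_alarm);
        alarm(1);
        for (;;) {
            void *p = malloc(128);   /* interrupt here -> boom, eventually */
            free(p);
        }
    }

Every function here "works as designed" in isolation; the crash only exists in the interleaving, which is exactly what makes it invisible to a per-function test.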
So please, I would love to know how unit tests would have helped find that bug. Yes, it is possible to write code that hopefully triggers the situation (run the process while another process continuously sends it the signals it handles), but how long do I run the test for? How do I know it passed?
No, unit testing doesn't tell you whether your constructs are safely composable. So it will pretty much never find a threading bug, a concurrency bug, a reentrancy bug, etc.
I only know three ways to detect this sort of bug, and they all suck: (1) get smart people to stare at all of your code; (2) brute-force as many combinations as possible; (3) move the problem into the type system of your language so you can do static analysis of the code.
I don't think even the most hardcore TDD zealots would come anywhere close to claiming that testing is a silver bullet. There will always be cases where you didn't think of a particular edge case, or where some environment-based issue makes covering something in a test impossible. That doesn't negate its benefits in preventing the 99% of bugs that aren't an insanely rare edge case.
I don't think you should expect every bug to be caught by unit testing. But where it helps with a problem like that is in eliminating a lot of other possible causes. Debugging something like this is often a needle-in-a-haystack problem, and it's nice if you can rule out most of the hay from the beginning.
In this case, once I discovered the cause of the bug I would have written a unit test that exposed it, probably a very focused one. Then I would have gone hunting for other missed opportunities to test for this, and I imagine my team would have come up with some sort of general rule for testing signal handlers.
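That general rule is usually some variant of the classic one: the handler itself does nothing except set a flag, and all the real work moves into ordinary code that a unit test can drive directly. A rough sketch (hypothetical names):

    #include <signal.h>

    /* Writes to a volatile sig_atomic_t are one of the few
       things guaranteed safe inside a handler. */
    static volatile sig_atomic_t got_alarm = 0;

    static void on_alarm(int sig)
    {
        (void)sig;
        got_alarm = 1;   /* nothing else allowed in here */
    }

    /* All the former handler logic lives here, called from the
       main loop. It's plain single-threaded code now, so a test
       can set the flag by hand and assert on the results. */
    void handle_pending_signals(void)
    {
        if (got_alarm) {
            got_alarm = 0;
            /* ...the logging/cleanup that used to live in the
               handler goes here... */
        }
    }

The focused test then isn't trying to reproduce the race at all; it pins down the invariant (the handler touches nothing but the flag) that makes the race impossible.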
Heh, I wonder if we have the same SS7 stack. Your description sounds disturbingly similar to our experience. Does your stack also wait several minutes before reporting that startup went OK? (To the logfile, of course. It is too good to actually report that to the console, or to have service scripts that can be trusted.) That wait is very popular with our testers.
Oh well, at least in our case the signals originate in IP and we only have to check against an HLR, which does have a semi-decent mockup.
From what I understand, there are only two commercially available SS7 stacks, and the one we use is the better of the two (which I find a frightening thought). So there's a 50/50 chance. I don't know the stack well enough to start (or restart) it, so I can't say for sure whether ours behaves that way.
I don't agree. I obviously don't know the specifics of your project, and I certainly don't always unit test code either (even though I know better; I do unit test actually important code, just not my own experimental or prototype code), but your comment sounds to me like you're trying to rationalize not testing your code (or you are frustrated by the amount of third-party code that's making it hard to test). Maybe it would be too expensive to test...
> receive a message via SS7 and convert it to an IP based message. To test the portion that talks to the telephony network requires a telephony network
I worked on an SMS anti spam/fraud system for a few years and we unit tested and simulated everything.
For unit testing we mocked all the network/hardware stuff so that each part of our code could be tested in isolation. I firmly believe that there is no code which cannot be unit tested[1], though obviously some code is easier to unit test than other code.
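In C, that mocking usually comes down to making the code under test talk to the stack only through a small interface that tests can swap out. A sketch of the idea (all names invented, not our actual API):

    #include <stddef.h>
    #include <string.h>

    /* The application code only ever sees this interface. */
    struct ss7_link {
        int  (*send)(void *ctx, const unsigned char *buf, size_t len);
        int  (*recv)(void *ctx, unsigned char *buf, size_t len);
        void *ctx;
    };

    /* Test double: "sending" just records the message so the
       test can assert on it; "receiving" replays a canned one. */
    struct fake_ctx {
        unsigned char last_sent[512];
        size_t        last_len;
        const unsigned char *canned;
        size_t        canned_len;
    };

    static int fake_send(void *ctx, const unsigned char *buf, size_t len)
    {
        struct fake_ctx *f = ctx;
        if (len > sizeof f->last_sent) return -1;
        memcpy(f->last_sent, buf, len);
        f->last_len = len;
        return 0;
    }

    static int fake_recv(void *ctx, unsigned char *buf, size_t len)
    {
        struct fake_ctx *f = ctx;
        size_t n = f->canned_len < len ? f->canned_len : len;
        memcpy(buf, f->canned, n);
        return (int)n;
    }

Production code gets a struct ss7_link wired to the real stack; tests get one wired to fake_send/fake_recv and never touch the network.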
For more end-to-end simulation, we wrote a test suite that would simulate the SS7 network and let us exercise our system under all kinds of message flows: testing not just that the system handled each variant of the flows correctly, but also stress testing and performance testing it. It worked with raw SS7 messages received from a number of commercial gateways and also with SIGTRAN messages (which are almost the same thing anyway). This worked pretty well for us.
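The core of a flow-driven simulator like that can be surprisingly small; something in the spirit of the following (the message framing and the send_to_sut/recv_from_sut wrappers are invented for illustration):

    #include <stdio.h>
    #include <string.h>

    /* One step of a simulated flow: inject a message toward the
       system under test, then check what it emits in response. */
    struct step {
        const char *inject;
        const char *expect;   /* NULL = no response expected */
    };

    /* These would wrap the actual SS7/SIGTRAN connection to the
       system under test; hypothetical here. */
    extern void        send_to_sut(const char *msg);
    extern const char *recv_from_sut(void);

    static int run_flow(const struct step *flow, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            send_to_sut(flow[i].inject);
            if (flow[i].expect) {
                const char *got = recv_from_sut();
                if (strcmp(got, flow[i].expect) != 0) {
                    fprintf(stderr, "step %zu: got %s, want %s\n",
                            i, got, flow[i].expect);
                    return 1;
                }
            }
        }
        return 0;
    }

Each message-flow variant is then just another table of steps, and the same tables can be replayed in a tight loop for the stress and performance runs.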
> just simple translations
That should be the easiest type of code to test! Pure functional translation is ideal for testing: if I put in X, I expect to get Y back (for a bunch of X/Y pairs).
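That means the bulk of it can be covered with a plain table-driven test. Something like this (translate() and the message names are stand-ins for whatever the real module exposes):

    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* The pure function under test: decoded SS7 message in,
       equivalent IP-side message out. Hypothetical signature. */
    extern void translate(const char *in, char *out, size_t outlen);

    static const struct {
        const char *in;
        const char *expected;
    } cases[] = {
        { "MAP_SEND_ROUTING_INFO", "SIP_INVITE" },
        { "MAP_SRI_ACK",           "SIP_200_OK" },
        /* ...one row per known X/Y pair, including the ugly
           edge cases as you discover them... */
    };

    int main(void)
    {
        for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
            char out[256];
            translate(cases[i].in, out, sizeof out);
            assert(strcmp(out, cases[i].expected) == 0);
        }
        return 0;
    }
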
You mention multiple machines and multithreading. Obviously this makes testing pretty damn hard (though unit testing should generally not be too affected), but it arguably also makes testing more critical, since multiprocessing is hard to get right anyway. Anyway, like I said, I don't know your system.
> most of the "units" being tested require almost as much set up as the entire "program"
It sounds to me like the design isn't modular enough (by design or by evolution), or the units are much, much too large. Each unit should be fairly simple and reasonably self-contained.
[1] Nowadays I do some embedded systems stuff, which at first I considered really hard to unit test, but I changed my mind after reading this book: http://pragprog.com/book/jgade/test-driven-development-for-e... If you can abstract away microcontrollers and other hardware for the purpose of testing in an embedded scenario, you can abstract pretty much anything away.
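The trick there (as I understand the book, heavily simplified) is the same seam idea: the application calls a tiny hardware API, and you link a different implementation depending on whether you're building for the target or for host-side tests. The register address and names below are made up:

    /* gpio.h -- the only thing application code includes */
    void gpio_write(int pin, int level);

    /* gpio_target.c -- linked into the target build; pokes a
       memory-mapped register (address is illustrative only) */
    #define GPIO_OUT (*(volatile unsigned int *)0x40020014u)

    void gpio_write(int pin, int level)
    {
        if (level) GPIO_OUT |=  (1u << (unsigned)pin);
        else       GPIO_OUT &= ~(1u << (unsigned)pin);
    }

    /* gpio_fake.c -- linked into the host test build; records
       the call so a unit test can assert on it */
    static int last_pin, last_level;

    void gpio_write(int pin, int level)
    {
        last_pin   = pin;
        last_level = level;
    }
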