Where TDD Fails (precog.com)
45 points by philippelh on March 5, 2013 | 49 comments



I've seen a lot of silver bullet fads come & go. I'll be glad to see the TDD fad go. Testing sure is a useful tool in the toolbox, but that's it.

It's funny this article quotes the asymptotic sort behavior, because that's almost identical to a real-world example I have used to debunk over-reliance on unit tests (AKA "false sense of security"). Back in the days when we still wrote our own sort functions, we were working on various QuickSort implementations. A popular optimization is to end with an insertion sort after a certain point. So your QuickSort can be completely broken, and the insertion sort at the end will still clean up the mess. Unit tests will typically just validate that the list is sorted correctly and completely miss the fact that the QuickSort itself is broken internally. Nobody has seen fit to comment on this example, so far.
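
To make the masking concrete, here's a minimal Haskell sketch (hypothetical code, not the sort we actually shipped): the partition comparisons are deliberately flipped, yet the final insertion-sort pass still produces sorted output, so a correctness-only unit test passes.

    import Data.List (insert)

    -- Cleanup pass: a plain insertion sort.
    insertionSort :: Ord a => [a] -> [a]
    insertionSort = foldr insert []

    -- "QuickSort" phase with a small-list cutoff. The partition comparisons
    -- are flipped (a bug), so this phase does no useful work at all.
    brokenQuickSortPhase :: Ord a => [a] -> [a]
    brokenQuickSortPhase xs
      | length xs <= 16 = xs        -- small runs are left for the cleanup pass
      | otherwise =
          brokenQuickSortPhase [x | x <- rest, x > p]    -- BUG: should be (<)
            ++ [p]
            ++ brokenQuickSortPhase [x | x <- rest, x <= p]
      where (p:rest) = xs

    -- The shipped sort: broken phase plus cleanup. Output is still sorted,
    -- but all the real work is done by the O(n^2) insertion sort.
    sortFast :: Ord a => [a] -> [a]
    sortFast = insertionSort . brokenQuickSortPhase

    -- A typical unit test only checks the end result, so it passes:
    testSorted :: Bool
    testSorted = sortFast [100, 99 .. 1] == [1 .. 100]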

Another fad I'd like to see gone is "no comments". My mind still boggles from a recent code review where I had to defend using comments. Sure, we've all seen worthless comments, and misleading comments, but let's not throw the baby out with the bathwater and ban comments altogether! Self-documenting code is an ideal that should be aimed for whenever possible. But there's still a place for block comments in certain areas to explain key design and implementation decisions, tell a future maintainer what they might want to consider before changing or enhancing something, etc. It's not always possible to capture all that in code.


If you want it to work, you do 100% code reviews.

Many people think they want it to work, but they don't want to put in the effort to make sure it works.

When you do a code review, you demand what you think are reasonable tests.

If the organization you work for supports your sincere belief that the code is fucked, and needs re-working... or the testing framework is fucked, and needs re-working... or the guy typing shitty code is fucked, and needs to not be working... then over time, your code will tend to get better.

As long as you've got the relevant skills in the room. A sufficiently difficult problem is not likely to be coded correctly by someone who lacks the requisite skills and experience.

It pretty much comes down to "invest in the code, or exploit the code." Many managers think they're investing in the code, just because they have coders adding features. That's like saying you know you're fixing potholes, because you see people driving on the road.

NOTE: doing code reviews does not guarantee it will work. But over time, skipping code reviews guarantees it will eventually not work.


It never was a silver bullet to begin with. There is no way that unit testing could cover all the ground. Granted, for what it's worth, it is reasonably cheap and you can run it over and over again. Functional testing can cover most of the cases, but it's more costly: it requires a full system setup, takes longer to run, and is more expensive to write.

Manual testing seems straightforward, but it's not repeatable and hence more expensive, since it requires warm bodies to run it every single time.

Given the effectiveness and the cost of each approach, it makes more sense to have different coverage for different tests. For something that is more expensive, you would choose to create and run it for the most important parts of the system. For something that is dirt cheap, well, why not just run it everywhere?

Consider regression tests. They cover every part of the system. But it's not possible to run them every single time a developer changes something. The cost is too prohibitive to run them at every change, even with automation, much less using the manual approach. What about unit tests? Dirt cheap. You can run them every 15 minutes or so if you are so inclined and it would still be okay.

I would say that it is wrong to start a software development exercise and just religiously say we have to use TDD. The right question to ask is: what degree of quality do you want, and what cost are you willing to pay?

For some people it's, hey, let's just hire rock star developers for everything, but some projects cannot afford that. And if you can only afford a team that barely knows what unit testing is, there is not much point in training everyone for a short project (say, 3 months with 1 month warranty).

And yeah, there is limited benefit to unit testing (as with everything), but I still don't think it's fair to throw the baby out with the bathwater too soon either. Do what makes sense and do what works.


    |A popular [quick sort] optimization is to end with an insertion sort
What's so hard to test about this? If your quick sort code and your insertion sort code aren't all shoved into one function, you just test them separately.
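
For instance, split the phases into their own functions and property-test each one against a trusted oracle. A QuickCheck sketch (the function names are mine, assuming a straightforward split):

    import Data.List (insert, sort)
    import Test.QuickCheck (quickCheck)

    quickSort :: Ord a => [a] -> [a]
    quickSort []       = []
    quickSort (p:rest) = quickSort [x | x <- rest, x < p]
                      ++ [p]
                      ++ quickSort [x | x <- rest, x >= p]

    insertionSort :: Ord a => [a] -> [a]
    insertionSort = foldr insert []

    -- Each phase is checked on its own against the library sort as an
    -- oracle, so a broken quick sort can't hide behind the cleanup pass.
    main :: IO ()
    main = do
      quickCheck (\xs -> quickSort     xs == sort (xs :: [Int]))
      quickCheck (\xs -> insertionSort xs == sort (xs :: [Int]))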


Unit tests verify if what you have written is correct. They don't verify if you are writing the right thing.

Writing unit tests for the wrong thing, only tests the wrong thing.


TDD always seemed like an impossible thing to a developer like me.

The problem is something like this: the moment I face a programming task, if the task is a little complex I do some work in a notebook; otherwise I directly fire up the editor and write some code. Now unless I get some code/prototype up and running I can't proceed further. That is how my mind works.

I don't know how anyone can write tests for an idea/task for which they haven't written a line of code. I've found it impossible. For me, it's always:

    step 0. Basic version of the idea.
    step 1. Write code.
    step 2. Make it nice (optimize/document/clean up).
    step 3. Make it run faster.
    step 4. Extend the idea and repeat steps 1 to 4.
    
I'm jealous of all those people who can have a 'step -1' where they can write test cases 'for the code' they would end up with in 'step 4', even while they have a very blurry version of the idea.

To me this has always looked like premature optimization. Like putting the cart before the horse.


Most people 'spike' an initial version to discover what they have to do. It is very hard to test drive something when you have no clue how it is going to work. Could you test drive the code for the very first controller action you ever wrote? No. But the second controller action should be easy to TDD.


Sorry, when I look at a problem, I think about the solution to the problem, not about testing the solution.


I am not sure how they are mutually exclusive. You are going to test that your solution works. Switching from manual confirmation to automatic confirmation isn't so difficult. The real trick is figuring out your expectations before you write your code. Which makes sense when you think about it.

When this code works, I will see the username show up in the view.

Versus.

    assigns(:username).should == "expected"

I realize this example is trivialized, but these are usually the examples people start with when they are learning how to start with tests instead of code.
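
The same expectation-first step in another stack, as an hspec sketch in Haskell (the function and values are hypothetical):

    import Test.Hspec

    -- Not implemented yet: the spec below is written first and fails (red),
    -- then the real lookup is filled in to make it pass (green).
    usernameFor :: Int -> String
    usernameFor _ = error "not implemented"

    main :: IO ()
    main = hspec $
      describe "usernameFor" $
        it "returns the expected username for a known id" $
          usernameFor 42 `shouldBe` "expected"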


I will start doing TDD all the time when:

1. It is faster than developing without it.

2. It doesn't result in a ton of brittle tests that can't survive an upgrade or a massive API change that is already enough trouble to manage on the implementation side, even though there may be no functional changes!

Some other thoughts:

Unit tests that test trivial methods are evil because the LOC count goes up (maintenance anyone?) and the number of entry points and NPE possibilities or checks goes up (bugs anyone?) -> TDD promotes testing trivial methods -> TDD promotes evil

TDD increases the chance that more people will mock, and mocking can lead to brittle tests -> TDD increases the chance many of these brittle tests will be written -> Lots of brittle tests means that you throw them away later or rewrite the whole app (with more tests!)

TDD promotes 100% test coverage of the code you write -> Very, very few successful companies have 100% test coverage -> Code with 100% test coverage has brittle tests (period) -> TDD promotes things that are not best-of-breed practices in a quest for the false god of 100% test coverage.

I was a firm believer in what Kent Beck, Ron Jeffries, et al were pushing in the early part of the last decade. But since then I think most of the rest of the world already knows that TDD practiced religiously will lead to huge amounts of code that slow... down... development... and... make... it... easier... to... buffer... estimates... because when absolutely required you can stop TDD and just hack a spike into production. And... when the application needs to be rewritten because it is too crazy complicated to change all of those tests- you just rewrite it, and we all love greenfield development!


1. It can be. For something like a standard C# MVC application (I'm working on one now), the time taken to spin up Cassini or deploy to IIS is far greater than running tests. For something like PHP, where you are just hitting F5, TDD can slow you down. As with most things, it depends.

2. If you are writing brittle tests you are doing it wrong.

Increasing LOC isn't always a bad thing. If those increased LOC improve quality then I consider it worthwhile. Yes, it can be more maintenance, but we know the cost of catching bugs in development is much cheaper than in production.

Mocking isn't as bad as it's been made out to be. Yes, you can overmock things (a design anti-pattern), but that should be a sign of code smell and you should be refactoring to make it simpler. If you can't refactor and you can't easily mock, then consider whether you really need to test it. In my experience, things that are hard to mock and cannot be refactored usually shouldn't be tested.

The exception is legacy code, but we are talking about TDD here, which usually means greenfield development, or else it would have tests already.

Unit testing does NOT promote 100% coverage. People using unit tests as a measure promote this. Sometimes it's worth achieving, and sometimes it's not. Use common sense when picking a unit test coverage metric. I have written applications with close to 100% coverage, such as web services, and been thankful for it when something broke and I needed to fix it. I have also written applications with no more than 20% coverage, over the critical methods (simple CRUD screens). Use common sense: testing simple getters and setters is probably a waste of time, so don't do it.

Unit testing isn't all about writing tests. It's also about enforcing good design. Code that's easily testable is usually good code. You don't have to have tests to have testable code, but if you are going to that effort anyway, why not add tests where they can add value and provide you with a nice safety harness?

Most of the issues with unit tests come with people preaching that they are a silver bullet. For specific cases they can provide great value and increase development speed. Personally I will continue to write unit tests, but only where my experience leads me to believe they will provide value.


Re 1, GP is talking about the time to develop features, including writing tests, not the time to actually run the tests.


That may or may not be valid depending on what one considers "done" in the context of a project.

My team considers a feature done when it goes through QC and business analysis.

I personally find it nice to just hit a couple of hotkeys instead of working through the application itself in order to test functionality. It does save me time: I want to ensure that the feature works, and I have to devote fewer cycles to fixing some small thing I missed than I would running the real app instead of the tests.


As am I. In my sample app it's faster to write the feature and the tests and run those than to spin everything up and see it working. I have had the same experience in large Java applications as well.


1. It is only fast if you are working solo on a project for less than 2 months. Each extra person, subtract two weeks. Think larger scale than just adding a feature. You always have to refactor to add the next feature. The giant mess you created takes time to understand and change. With tests, your abstractions actually work. Focus on what you need to change, change it, fix the failing tests, repeat.

2. If having tests during a major refactoring/redesign slows you down, you are writing the wrong kinds of tests. Changing code should break your tests. Not all of them, but the ones where you have changed the behavior. Unit tests are more valuable than integration tests (Rails controller tests), which are more valuable than functional tests (browser/request). If all you have are the latter, they will break for everything and provide little feedback as to the cause.

More general tips. Test logical switches, not assignments. If you have to deal with lots of mocking, try to avoid state mutation and use a more functional approach. Mocking is a result of interaction testing.

Code coverage only tells you if you are lacking in quality, not that quality is present. 100% doesn't mean you have good tests. Test what matters, don't test everything.

I realize your argument is for using TDD all the time. The problem is that your complaints come from the standpoint of "I never want to test anything." There are very valid reasons for not using TDD all the time. You just haven't listed any.


> 100% doesn't mean you have good tests

That's true, but if you are really doing TDD the way some understand it (small methods, readily testable), then 100%-complete TDD means you didn't write any code you didn't test intentionally. By 100% coverage, I don't mean a code metric result from a tool, I mean that you followed TDD to a T and didn't write code that you didn't test carefully and intentionally.

The loose form of TDD you mention is not TDD by others' standards. Strict TDD never says, "those tests you wrote to test the app in the browser- those are enough; you don't need to unit test that, because you're testing all the important and edge cases we care about, even without unit tests." Yet, that may result in fewer, less brittle tests that can survive an upgrade or other massive changes in the implementation.


> TDD increases the chance that more people will mock

Absolutely false. Try TDD in Haskell. Know how many mocking libraries there are in Haskell? None. Totally unnecessary.
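
Where other stacks reach for a mocking framework, in Haskell you typically just pass the effect in and substitute a pure stand-in under test. A sketch (names hypothetical):

    import Data.Functor.Identity (Identity, runIdentity)

    -- The HTTP call is a parameter, not a hard-wired dependency.
    fetchGreeting :: Monad m => (String -> m String) -> String -> m String
    fetchGreeting httpGet user = do
      body <- httpGet ("/users/" ++ user)
      return ("Hello, " ++ body)

    -- In a test: no mocking library, just a pure function run in Identity.
    testGreeting :: Bool
    testGreeting =
      runIdentity (fetchGreeting (\_ -> return "Alice") "alice")
        == "Hello, Alice"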


I should have considered Haskell. :) In other languages, it is true, so it isn't absolutely false.


This is a big part of the reason that I like BDD over unit-oriented TDD. BDD saves you time overall (once you've gotten over the initial hump), and tends to result in high-level tests that survive a major restructuring of internals. In such a system, mocks (with perhaps the exception of external resource mocks) are usually an indicator that you need to refactor something.

That said, I think that unit tests still have their place - you just have to be careful to not test implementation as much as input/output.


> TDD promotes testing trivial methods

Kent Beck might disagree:

http://stackoverflow.com/a/153565


:) Kent Beck doesn't promote it anymore, but in the early 2000's that was what was understood.

From "Test Driven Development" by Kent Beck, published 2002: http://www.amazon.com/Test-Driven-Development-Kent-Beck/dp/0...

Here are some excerpts:

"What test do we need first? Looking at the list, the first test looks complicated. Start small or not at all. Multiplication, how hard could that be? We'll work on that one first."

The examples like this that were provided were of trivial methods. Trivial can be subjective, but, to me, methods whose bodies are almost always 2-5 lines long, not counting calls to other trivial methods, which may account for maybe another 2-3 additional lines, are basically trivial. It's one thing when you really don't need that much code, but it's another when you have classes upon classes upon classes upon classes, etc. to do something that could be OO and be in two classes with 1/2 as many lines and still be clear.

"Do these steps seem small to you? Remember TDD is not about taking teeny-tiny steps, it's about being able to take teeny-tiny steps. Would I code day-to-day with steps this small? No. But when things get the least bit wierd, I'm glad I can."

This is what Kent meant, but few of us picked up on it, per the first comment to his answer in your example from S.O. and per my experience.

I'm not blaming Kent, Ron, etc. or even saying that they are or were wrong. But the commonly understood epitome of TDD used to be 100% test coverage and methods for just about everything. Those in the know said more like 60% was better overall, but that was not really "true TDD". The implication of 100% "trivial" methods is more overhead from method calls (a minor cost, depending, but it can be cumulative and grows the call stack faster, which is inefficient), a large number of entry points (more stuff can be null/nil and cause NPEs, or need checks if you don't know what is going to call it later, though that can be mitigated somewhat), and just generally too many LOC.


My company's website code is probably about half Haskell (using Yesod, for all the backend code) and half Javascript (for the single-page app frontend). Despite the nuances described by the author, I really think there is a definite substitution between a robust type system and unit testing. On the Haskell side, I don't have unit tests, and don't have many tests at all. The number of bugs I discover in the Haskell code is very small in spite of this. I know people have said it before, but when something passes the type checker, most of the time it literally just works.
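
A tiny illustration of the kind of bug the checker removes outright (hypothetical names): lookups return Maybe, so the missing-user case has to be handled before the code compiles, where the Javascript equivalent would just throw at runtime.

    import qualified Data.Map as Map

    ages :: Map.Map String Int
    ages = Map.fromList [("alice", 30)]

    greet :: String -> String
    greet name =
      case Map.lookup name ages of    -- Maybe Int, never a null
        Nothing  -> "unknown user"
        Just age -> name ++ " is " ++ show age
    -- Dropping the Nothing branch is flagged at compile time (with
    -- -Wincomplete-patterns), not discovered in production.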

The contrast with the Javascript side couldn't be more stark. The Javascript side is much more challenging to get to a bug-free state, and some of the remaining bugs defy replication, etc. I think to get a system that has a fraction of the reliability afforded by the type system, you really need extensive unit tests.

I'm not contradicting the blog author, but I think it's important to keep in mind that, in broad strokes, those two aspects really do trade off.


Author here. Sadly our WordPress didn't have caching enabled so here's a Gist of the content:

https://gist.github.com/pufuwozu/5095510


FTA: "Well, hopefully we almost always want the asymptotically optimal algorithm to solve our problem. You might have noticed that the above code is selection sort - an algorithm with best/worse/average time complexity of O(n^2)."

"Sadly, we can't write a test or a type to satisfy our specification. We need to actually perform some (asymptotic) analysis to derive our code, instead of relying on tests!"

Algorithmic complexity is not measurable by the test code, but it's also not measurable by your customers. So why measure it? Instead, write a test for something that your customers do care about: performance, perhaps? It's easy to throw a huge array at it and see if it's too slow.


For one, you would have to throw arrays of different sizes at it and produce a plot to see O(n^2)-type growth. But in general that still doesn't tell you what the complexity analysis does. The worst case could depend on a particular input type, e.g. one containing a particular pattern. You won't know to test that without doing the analysis, though.

Also, worst case complexity is certainly measurable by customers. They notice when your product becomes unusably slow, either over time or perhaps suddenly when given a particular input.


It's not a complex algorithm; it's a static website run from WordPress that could be made faster by any of these: S3, CloudFront, memcache, APC, or even file caching.

Yes, you could spin up any number of AWS instances to throw ab/httperf traffic at it, but why? HN has demonstrated how inadequate their server solution is.


It's too slow. Now what?

You have to use asymptotic analysis to figure out that it's O(n^3) but can actually be performed in O(n) - then write the derived code from your analysis. Nothing has changed.

Customers care; transitively.
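
A classic stand-in for that situation is maximum subarray sum (my example, not the scenario above): both versions below pass exactly the same correctness tests, and only analysis tells you the second one is O(n).

    -- Brute force over all slices: O(n^3). Assumes a non-empty input.
    maxSubarrayCubic :: [Int] -> Int
    maxSubarrayCubic xs =
      maximum [ sum (take (j - i) (drop i xs))
              | i <- [0 .. n - 1], j <- [i + 1 .. n] ]
      where n = length xs

    -- Kadane's algorithm, derived by analysis: O(n).
    maxSubarrayLinear :: [Int] -> Int
    maxSubarrayLinear = snd . foldl step (0, minBound)
      where step (endingHere, best) x =
              let endingHere' = max x (endingHere + x)
              in  (endingHere', max best endingHere')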


This isn't really a new thing though, is it? There is the (in)famous example of Ron Jeffries trying to solve Sudoku through TDD (and failing miserably). http://xprogramming.com/articles/oksudoku/

Compare and contrast his impotent flailing with Peter Norvig's masterful approach. http://norvig.com/sudoku.html

Of course for the lulz you had people like Spolsky and Martin all use it to set out their view of software engineering with essays like Duct Tape Programming and Echoes from the Stone Age.

Anyway ... The conclusion from it all was that TDD works best as a design tool for interface discovery. That you get a regression suite, of sorts, is a side effect.

It doesn't work as a specific optimization tool or algorithm discovery tool. Which isn't really surprising as these are two of the few remaining 'non-mechanical' aspects of programming.


Could someone link me to a consumer web app with standard features (generating HTML, talking to a database, making requests to 3rd party services) that uses one of these typed functional languages (Haskell, Clojure, whatever)?

I'm genuinely curious what this code would look like - every example I've ever seen goes something like "Types are the best thing ever! Here's an example: sorting a list! making a stack/queue! See it's GUARANTEED to work! Math and stuff!". The closest I've seen is the Gilded Rose kata, but that doesn't seem to handle UI/database.

And, please, don't say "just use an I/O monad!" - that isn't helpful to me.


Author here. I use Haskell for a lot of webapps. Here's a webapp that talks to GitHub:

http://licentious.herokuapp.com/

https://github.com/pufuwozu/licentious

Here's a programming language competition that I organise:

http://www.pltgames.com/

The web framework that I use is Yesod. Over the years I've used CakePHP, Django, Rails, Lift and Play! - Yesod is definitely the BEST framework I've used:

http://www.yesodweb.com/


Thanks - this is helpful.


I have to agree with this. Currently prototyping a project in Clojure/Compojure but I'm moving the project to Yesod for the production system.


Clojure isn't typed. Scala, on the other hand, is a typed functional language and is used extensively at Twitter and FourSquare, at LinkedIn from what I heard, somewhat at Yammer, and at a ton of other companies.


Here's a complete authentication web app that uses Clojure:

https://github.com/xavi/noir-auth-app

It generates HTML using Enlive (for complete separation of code and markup) and talks to a MongoDB database (using CongoMongo). It also has a little bit of ClojureScript (I think that being able to use the same language on both client and server is a very good benefit of using Clojure for web apps, especially if the people developing the front-end and the back-end are the same).


Sure, here's a Haskell web app written with Snap. I wouldn't take too much from the Yesod example you were given; Yesod is not idiomatic Haskell at all. https://github.com/chrisdone/hpaste


I always liked Fogus's post "Not Enough" on this topic:

http://blog.fogus.me/2012/06/20/not-enough/


LDD - Love Driven Development sounds awesome


Looks like precog.com doesn't have much interest in load testing either.


One would have thought that they would have been able to anticipate this...


Oftentimes, the goal of TDD is confused with spec testing. I think 37signals is right when they say TDD is like the TSA's security check process...


I've played around with Roy. Good to see it getting some press!

More on topic, what about Contract DD as an alternative to Test DD and Type DD? e.g. https://github.com/disnet/contracts.coffee


> Sadly, we can't write a test or a type to satisfy our specification. We need to actually perform some (asymptotic) analysis to derive our code, instead of relying on tests!

Actually, this test can be written without too much trouble. Take some different sized data sets and time the sort operations. Do some analysis of your data to see that the data points fit acceptably close to an asymptotic curve.
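
Something along these lines (a sketch using the deepseq package; it prints doubling ratios to eyeball rather than fitting a full curve, and a real test would assert a bound on those ratios):

    import Control.DeepSeq (force)
    import Control.Exception (evaluate)
    import Data.List (sort)
    import System.CPUTime (getCPUTime)

    sortUnderTest :: [Int] -> [Int]
    sortUnderTest = sort    -- stand-in for the implementation being measured

    -- Time one sort of a reverse-ordered list of length n, in seconds.
    timeSort :: Int -> IO Double
    timeSort n = do
      xs <- evaluate (force [n, n - 1 .. 1])   -- build input outside the timer
      t0 <- getCPUTime
      _  <- evaluate (force (sortUnderTest xs))
      t1 <- getCPUTime
      return (fromIntegral (t1 - t0) / 1e12)   -- getCPUTime is in picoseconds

    -- If doubling n roughly quadruples the time, growth looks O(n^2).
    main :: IO ()
    main = mapM_ report [20000, 40000, 80000, 160000]
      where report n = do
              t <- timeSort n
              putStrLn (show n ++ "\t" ++ show t ++ "s")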


>Neither TDD nor types are the single answer to writing code that does what we want.

This is an oddly persistent strawman. Is there any seasoned developer who believes there is a single answer to this problem?


Unfortunately, it certainly seems like there are a lot of them. "Types are pointless because you have unit tests" is a common refrain from the dynamic language camp.


"Where our website fails at serving pages" ... can we stop these 'authoritative' posts that can't even serve a static webpage? Ugh.


TFA begins by saying that it's going to explain why we need static types although we have TDD.

However, these are two orthogonal concepts, and the first paragraph of TFA makes no sense.

Also, TFA seems to imply that using static types + TDD means we can have provably bug-free programs.

Not so fast there: any bug that reaches production in a TDD + static types project is a bug that "escaped" the type system and that didn't get caught by TDD.

And there are many such bugs in production for TDD + static types projects.

I'd say TFA does a very poor job of making its point (not that the author doesn't have one, but the 'talk' is nowhere near close to the 'code' in TFA).


You've misread the article. It implies no such thing.

The first point is that the type system can prove some functions and properties correct. For these, you don't need tests--no questions there. This doesn't mean you never need tests, but it does mean a type system can replace several kinds of tests completely.

If you can actually prove something with the type system, this is strictly better than using tests and tests are unnecessary there.

It also doesn't mean the type system will magically stop all bugs. Rather, it means the type system will stop all possible bugs of a particular sort. Which is very valuable.
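
One concrete instance (a standard parametricity example, mine, not the article's): any function with the type below can only drop, duplicate, or rearrange elements of its input. It can never invent new ones, so that whole class of bug needs no test at all.

    -- With a fully polymorphic 'a' there is no way to conjure a new element,
    -- so the output can only contain elements of the input. The type alone
    -- proves it, for this and every other inhabitant of the type.
    rearrange :: [a] -> [a]
    rearrange = foldl (flip (:)) []    -- (this particular one reverses)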

The second half of the article has nothing to do with types. It is an example of where tests are insufficient. A reasonable and practical example, at that. It's completely contradictory to what you claim the article says: the whole section is about something neither types nor tests can catch!

It's all best summed up with the final line:

> TDD can be useful. Types can be useful. Analysis is necessary.

I'm not sure how you can read this as "tests + types produce bug free code" because it clearly says that they're insufficient.


Did you read the conclusion of TFA?

> Analysis is the method for writing code that satisfies our constraints. Neither TDD nor types are the single answer to writing code that does what we want.


Besides, there are many classes of bugs, and that addresses just a few of them.





