I've noticed that solving a sudoku is something of a Rosetta Stone for how people program. Norvig's solution looks very straightforward in hindsight, but I think that code was the result either of many iterations or of a career spent writing progressively more readable code.
This article was my crash course in computer science (I come from another engineering field). I learned so much by studying this code, and understanding it allowed me to write a sudoku generator. The generator algorithm is quite different from the solver, but the basic structure was already there for me.
In any case, I'm replying to your comment because it applies to me. Norvig's code is compact and elegant, whereas mine is spaghetti code at best. It is basically a pile of control statements (thousands of lines), because there are many, many processing steps that I painstakingly discovered as I went along. Heck, I can barely remember them myself.
To be fair, the reason Norvig was so successful is that he knew exactly what sort of problem sudoku was and how to solve it, whereas Ron lacked this background knowledge.
I think you're missing the point. TDD forces you to choose data representations that best fit your tests. But in most cases of complex programming you want to choose representations that reflect your current understanding of the problem and then evolve them as your understanding evolves.
In TDD your test will always predate your understanding. So it forces you to solve (part of) the problem in your head first, rather than combine the act of problem-solving and writing. And you do this over and over again. And when you change your mind about something, you have twice the code to refactor.
It's a difference between recording your thinking in code and actually thinking in code.
The prime factors kata is often used as an introductory example of TDD, yet it is clearly impractical and produces garbage code. Not to mention that I've actually seen people do it, stumble, and frantically reach for their notes to see what the next step is. And it's a pretty darn simple problem.
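For anyone who hasn't seen the kata: you grow one function a test at a time (1, then 2, 3, 4, 6, 8, 9 and so on). As a rough sketch, this is approximately where it tends to end up. The name `prime_factors` and the exact shape are my own assumptions, not taken from any particular write-up of the kata.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    divisor = 2
    while n > 1:
        # Strip out every copy of the current divisor before moving on,
        # so only primes ever divide what is left of n.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    return factors


# The tests are usually added one at a time, in roughly this order.
def test_prime_factors():
    assert prime_factors(1) == []
    assert prime_factors(2) == [2]
    assert prime_factors(3) == [3]
    assert prime_factors(4) == [2, 2]
    assert prime_factors(6) == [2, 3]
    assert prime_factors(8) == [2, 2, 2]
    assert prime_factors(9) == [3, 3]
```

Whether the intermediate steps teach anything is exactly what's being argued here; the sketch only shows the end state.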
Err, I'm missing the point of my own point? You raise an interesting point, but I don't think TDD was what stumped Ron. It's not a simple problem to solve without a lot of thought unless you have the requisite background knowledge, such as familiarity with constraint problems. TDD did, however, have him going round in circles, partly for the reasons you describe.
The odd thing is that, on the evidence of those blog posts, Jeffries doesn't appear to understand TDD. If you're using TDD to write a sudoku solver, surely the first thing you should write is a test for your "solve_sudoku" function; then your design is supposed to evolve out of the code you write to pass that test (and the further tests you write for parts of that solution, iteratively). But Jeffries starts by writing a bunch of tests for the low-level details of the representation of a sudoku board. There isn't really any "design" in those posts at all; just a guess at something you might use in a solution to the problem.
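To make "start with a test for solve_sudoku" concrete, here is a minimal sketch of what that first test could look like. Everything except the name `solve_sudoku` is my own assumption: the 81-character string representation, the `is_valid_solution` helper, and the naive backtracking body, which is only a stand-in so the test has something to call. It is not Norvig's method and not anything from Jeffries' posts. The puzzle is the first example grid from Norvig's essay.

```python
from typing import Optional

# Assumed representation: 81 characters, row by row, '0' for an empty cell.
PUZZLE = ("003020600" "900305001" "001806400"
          "008102900" "700000008" "006708200"
          "002609500" "800203009" "005010300")
DIGITS = "123456789"


def is_valid_solution(puzzle: str, solution: str) -> bool:
    """True if solution is a complete, legal grid that keeps the puzzle's givens."""
    if len(puzzle) != 81 or len(solution) != 81:
        return False
    if any(g in DIGITS and g != s for g, s in zip(puzzle, solution)):
        return False
    rows = [[r * 9 + c for c in range(9)] for r in range(9)]
    cols = [[r * 9 + c for r in range(9)] for c in range(9)]
    boxes = [[(br + r) * 9 + (bc + c) for r in range(3) for c in range(3)]
             for br in (0, 3, 6) for bc in (0, 3, 6)]
    # Every row, column and box must contain each digit exactly once.
    return all({solution[i] for i in unit} == set(DIGITS)
               for unit in rows + cols + boxes)


def solve_sudoku(puzzle: str) -> Optional[str]:
    """Stand-in solver: plain backtracking, returns None if there is no solution."""
    cells = list(puzzle)

    def can_place(i: int, d: str) -> bool:
        r, c = divmod(i, 9)
        if any(cells[r * 9 + k] == d or cells[k * 9 + c] == d for k in range(9)):
            return False
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return all(cells[(br + dr) * 9 + bc + dc] != d
                   for dr in range(3) for dc in range(3))

    def search() -> bool:
        for i, ch in enumerate(cells):
            if ch not in DIGITS:          # first empty cell
                for d in DIGITS:
                    if can_place(i, d):
                        cells[i] = d
                        if search():
                            return True
                        cells[i] = "0"    # undo and try the next digit
                return False
        return True                       # no empty cells left

    return "".join(cells) if search() else None


def test_solve_sudoku():
    solution = solve_sudoku(PUZZLE)
    assert solution is not None
    assert is_valid_solution(PUZZLE, solution)
```

The point of the sketch is that the test pins down only the observable behaviour: a full grid, legal, consistent with the givens. The board representation stays free to change underneath it.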
Here are some other sudoku solvers that give more insight into the coder than the problem:
Sudoku in APL: https://www.youtube.com/watch?v=DmT80OseAGs
Test driven sudoku: http://ronjeffries.com/xprog/articles/sudokumusings/