It has to do with how abstract and parametric the code was. There may well be only one program that type-checks at all (so long as you do a cursory check against infinite loops and thrown errors).
So it's a gamble: do any other non-trivial programs exist at this type?
If not, then I know the code I wrote is correct without testing.
If so, then I'm betting that the one I found is the right one.
This is why Haskellers get so excited about "parametricity". It's the core tool that guarantees the property "only a small number of programs could ever exist at this type", and that property is what drives correctness.
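To make that concrete, here is a small sketch of both sides of the gamble. The function names are made up for illustration and are not from any library. Assuming we rule out infinite loops and errors, the first two types admit exactly one implementation each, while the third admits many.

```haskell
-- Only one total, non-erroring function has this type: the identity.
-- The code knows nothing about `a`, so it cannot invent or alter a value.
identity :: a -> a
identity x = x

-- Same story here: returning the first component is the only option.
first :: (a, b) -> a
first (x, _) = x

-- At this type the gamble is real, because several programs type-check.
-- Parametricity still helps: each one can only drop, duplicate, or
-- rearrange elements, never inspect or change them.
keepAll :: [a] -> [a]
keepAll xs = xs

backwards :: [a] -> [a]
backwards = reverse

dropAll :: [a] -> [a]
dropAll _ = []
```

If the type I was writing against looked like the first two, passing the type checker is about as good as a test. If it looked like the third, I still need to check that I picked the right one.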