
I've found that writing randomized unit tests for each small part of a system finds this sort of stuff immediately.

In this case, a test that just generated 1,000,000 random strings and passed them to punycode would probably have found the problem (maybe not in the first run, but after a week or two in continuous integration).

I try to structure the tests so they can run with dumb random input or with coverage-guided input from a fuzzer. The former usually finds >90% of the bugs the fuzzer would, and does so 100x faster, so it beats fuzzing during development; the fuzzer still wins for nightly testing.

One other advantage of dumb random input is that it works with distributed systems and with things written in multiple languages (where coverage information isn't readily available).
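A minimal sketch of what that structure could look like in Go, not the commenter's actual setup: golang.org/x/net/idna stands in for the punycode implementation from the article, the property lives in one shared function, a plain test drives it with dumb random input, and a native Go fuzz target drives it with coverage-guided input. Package name, helpers, and iteration counts are all illustrative.

    package punyfuzz

    import (
        "math/rand"
        "testing"
        "time"

        "golang.org/x/net/idna"
    )

    // checkDecode is the shared property: the decoder must never panic.
    // Returning an error for garbage input is fine; crashing is not.
    func checkDecode(t *testing.T, input string) {
        t.Helper()
        _, _ = idna.ToUnicode(input)
    }

    // randomLabel builds a short random ASCII label prefixed with "xn--".
    func randomLabel(rng *rand.Rand) string {
        b := make([]byte, rng.Intn(64))
        for i := range b {
            b[i] = byte(rng.Intn(128))
        }
        return "xn--" + string(b)
    }

    // Dumb random input: cheap enough to run on every CI job.
    func TestDecodeRandom(t *testing.T) {
        seed := time.Now().UnixNano()
        t.Logf("seed=%d", seed) // log the seed so a failure can be replayed
        rng := rand.New(rand.NewSource(seed))
        for i := 0; i < 1_000_000; i++ {
            checkDecode(t, randomLabel(rng))
        }
    }

    // Coverage-guided input: run nightly with `go test -fuzz=FuzzDecode`.
    func FuzzDecode(f *testing.F) {
        f.Add("xn--nxasmq6b") // one seed-corpus entry
        f.Fuzz(func(t *testing.T, s string) {
            checkDecode(t, s)
        })
    }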




I really like this idea because it avoids the issue of fuzzers needing to burn tons of CPU just to get down to the actual domain logic, which can have really thorny bugs. Per usual, the idea of "unit" gets contentious quickly, but with appropriate tooling I could foresee adding annotations to code that leverage random input, property-based testing, and a user-defined dictionary of known weird inputs.
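Not an existing annotation system, just a sketch of the "user-defined dictionary of known weird inputs" part, continuing the hypothetical punyfuzz package above (it reuses that sketch's checkDecode helper): the dictionary is replayed on every run regardless of the random seed, and the same entries could also be registered as the fuzz target's seed corpus via f.Add.

    package punyfuzz

    import (
        "strings"
        "testing"
    )

    // weirdInputs is a hand-maintained dictionary of inputs that have caused
    // trouble before: empty labels, bare prefixes, embedded NULs, oversized runs.
    var weirdInputs = []string{
        "",
        "xn--",
        "xn--\x00",
        "xn--" + strings.Repeat("9", 1024),
        strings.Repeat("a", 64), // one byte over the 63-byte DNS label limit
    }

    // Every run replays the whole dictionary, independent of the random seed.
    func TestDecodeWeirdInputs(t *testing.T) {
        for _, s := range weirdInputs {
            checkDecode(t, s) // shared property from the earlier sketch
        }
    }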


> maybe not in the first run, but after a week or two in continuous integration

You'd use a different seed for each CI run??

That sounds like a nightmare of non-determinism, and a loss of trust in the CI system in general.


Not if you log the seed of the failing runs


Yep; I definitely log the seed and re-seed every few seconds.

Most software I work on these days is non-deterministic anyway (it involves the network, etc.), so CI is fundamentally going to fail some runs and not others.

Even stuff like deterministic simulation has this property: Those test suites rely on having a large number of randomly generated schedules, so there's always a chance that running the test one more time (with a new seed) will find a new bug.
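A sketch of that log-and-re-seed pattern, continuing the hypothetical punyfuzz package from above (checkDecode and randomLabel are the earlier helpers; the TEST_SEED variable is made up for illustration): each batch gets a fresh wallclock-derived seed that gets logged, and a failing batch can be replayed by pinning that seed.

    package punyfuzz

    import (
        "math/rand"
        "os"
        "strconv"
        "testing"
        "time"
    )

    const perSeedIterations = 500_000 // a few seconds per batch; tune to taste

    func TestDecodeLongRandom(t *testing.T) {
        deadline := time.Now().Add(30 * time.Second)
        for time.Now().Before(deadline) {
            seed := time.Now().UnixNano()
            if s := os.Getenv("TEST_SEED"); s != "" {
                if v, err := strconv.ParseInt(s, 10, 64); err == nil {
                    seed = v // replay a seed logged by a failing run
                }
            }
            t.Logf("seed=%d", seed) // the line to grep for when CI fails
            rng := rand.New(rand.NewSource(seed))

            // A fixed iteration count per seed keeps replays deterministic.
            for i := 0; i < perSeedIterations; i++ {
                checkDecode(t, randomLabel(rng))
            }
            if os.Getenv("TEST_SEED") != "" {
                return // a pinned seed runs exactly one batch
            }
        }
    }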


If you’re using Git, I’d strongly recommend the hash of the current tree (not the commit). That way your tests are deterministic based on the contents of your tree. For example, if you add a commit and then revert, you’ll end up with the same test seed as if you hadn’t committed.
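One way to get that in Go (a sketch; the package and function names are made up): shell out to git rev-parse HEAD^{tree} and fold the hash into a seed. Note that HEAD^{tree} only reflects committed content; covering uncommitted edits would need something like git write-tree on a temporary index.

    package seed

    import (
        "encoding/binary"
        "encoding/hex"
        "fmt"
        "os/exec"
        "strings"
    )

    // TreeSeed derives a deterministic RNG seed from the hash of the current
    // Git tree, so the same tree contents always produce the same test seed.
    func TreeSeed() (int64, error) {
        out, err := exec.Command("git", "rev-parse", "HEAD^{tree}").Output()
        if err != nil {
            return 0, fmt.Errorf("git rev-parse: %w", err)
        }
        raw, err := hex.DecodeString(strings.TrimSpace(string(out)))
        if err != nil || len(raw) < 8 {
            return 0, fmt.Errorf("unexpected tree hash %q: %v", out, err)
        }
        // Fold the first 8 bytes of the tree hash into an int64 seed.
        return int64(binary.BigEndian.Uint64(raw[:8])), nil
    }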


The important thing to log is the seed passed to the rng. (It’s usually wallclock time measured at nanosecond granularity.)

In a typical night for a production-quality system, 100-1000+ hours of such tests would run, all with different seeds, in diverse configurations, etc., so the seed isn't derived from the git SHA or the source code.


Or someone just punches retry


Ignoring test failures is a different issue. It can range from known bugs and rational tradeoffs to a hiring/HR issue.

Establishing a healthy dev/engineering culture is hard.


You don't have auto retries anywhere in your tests?


aka property testing.
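For the Go sketches above, the standard library's testing/quick package is one off-the-shelf way to get that property-testing flavor (again with golang.org/x/net/idna standing in for the decoder under test):

    package punyfuzz

    import (
        "testing"
        "testing/quick"

        "golang.org/x/net/idna"
    )

    func TestDecodeQuick(t *testing.T) {
        // Property: decoding any string must not panic; errors are acceptable.
        property := func(s string) bool {
            _, _ = idna.ToUnicode(s)
            return true
        }
        if err := quick.Check(property, &quick.Config{MaxCount: 100_000}); err != nil {
            t.Error(err)
        }
    }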



