Forgetting to seed your RNG is a really classic bug. IMHO RNGs should auto-seed unless explicitly told not to, but since the opposite behaviour was baked into C so many years ago, it's become the de facto default. The worst part is how easy this bug is to miss, unless you happen to be printing out the first batch of random numbers for some strange reason.
NumPy does auto-seed the RNG if you don't pass a seed yourself, using platform-specific code to pull entropy from the OS. So the common case is handled reasonably well, unlike in C. Conversely, if you want exactly reproducible results (e.g. in test cases), you have to seed with a known value to opt out of that default behavior.
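For illustration, here's roughly how that looks with NumPy's Generator API (a sketch; the seed value is obviously arbitrary):

    import numpy as np

    # No seed given: NumPy pulls fresh entropy from the OS,
    # so each run of the program produces different numbers.
    rng = np.random.default_rng()
    print(rng.random(3))

    # Explicit seed: identical output on every run, which is
    # what you want in test cases.
    rng = np.random.default_rng(seed=42)
    print(rng.random(3))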
The issue here is a little more subtle: if you fork 10 copies of your Python process, all 10 inherit the current RNG state, and will thereafter produce identical random number sequences. If you were manually forking, you might guess that was a potential problem, and re-seed the RNGs after forking. But PyTorch's data loaders fork a bunch of processes to do things in parallel, so users might not realize that they're using duplicate copies of their RNG state.
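You can see the inherited-state problem with a minimal fork demo (my own sketch, Unix-only, not the article's code):

    import os
    import numpy as np

    rng = np.random.default_rng()  # parent seeds once from OS entropy

    for _ in range(3):
        if os.fork() == 0:
            # Child: inherits an exact copy of the parent's RNG state,
            # so all three children print the *same* "random" number.
            print(os.getpid(), rng.random())
            os._exit(0)

    for _ in range(3):
        os.wait()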
I get the desire to be pedantic, but does anyone at all train DL models on Windows (barring toy projects for fun and perhaps debugging)? The same goes for num_workers > 0: you _have to_ fork worker processes unless you're training something super tiny like MNIST and can load the whole dataset onto the GPU.
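For what it's worth, the usual mitigation with fork-based workers is to reseed each one via worker_init_fn, along the lines of PyTorch's own reproducibility notes (a sketch; my_dataset is a placeholder):

    import random
    import numpy as np
    import torch
    from torch.utils.data import DataLoader

    def seed_worker(worker_id):
        # torch.initial_seed() already differs per worker; fold it down
        # and reseed the libraries PyTorch doesn't manage for you.
        worker_seed = torch.initial_seed() % 2**32
        np.random.seed(worker_seed)
        random.seed(worker_seed)

    loader = DataLoader(my_dataset, num_workers=4, worker_init_fn=seed_worker)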
I’m of the opposite opinion and would do away with all auto RNG seeding:
1) It would help reproducibility a great deal, which is so often a pain.
2) Forcing users to actually understand RNG seeding from the time they're novice programmers could help prevent bugs of the sort seen in this post, which I believe stem from having too much faith that RNGs will simply work out of the box as substitutes for ‘real’ random variables.
I have yet another opinion: allow both srand()-style explicit seed initialization and auto-seeding.
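Python's stdlib effectively works this way already (random.Random(None) pulls OS entropy); here's a small sketch of the both-modes idea, with a hypothetical make_rng helper that also records the seed for later replay:

    import os
    import random

    def make_rng(seed=None):
        # Auto-seed from OS entropy when no seed is given, but return
        # the seed either way so the run can be replayed later.
        if seed is None:
            seed = int.from_bytes(os.urandom(8), "big")
        return random.Random(seed), seed

    rng, seed = make_rng()          # auto-seeded
    print("seed was:", seed)        # log it for reproducibility
    replay, _ = make_rng(seed)      # same seed -> same sequence
    assert rng.random() == replay.random()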
But you really need to reject or remap known bad seeds, which destroy a PRNG's statistical properties. Most PRNGs have a couple of known bad seeds, yet nobody does anything about them. The same goes for hash functions.
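As a toy example, take xorshift64, where the all-zero state is the classic bad seed: it's a fixed point that emits zeros forever. A guard could remap it (the replacement constant here is an arbitrary choice of mine):

    def xorshift64(state):
        # One step of a toy xorshift64 PRNG; a zero state stays zero forever.
        state ^= (state << 13) & 0xFFFFFFFFFFFFFFFF
        state ^= state >> 7
        state ^= (state << 17) & 0xFFFFFFFFFFFFFFFF
        return state

    def sanitize_seed(seed):
        # Remap known-bad seeds rather than trusting the caller.
        if seed == 0:  # the zero state never escapes itself
            return 0x9E3779B97F4A7C15  # arbitrary nonzero constant
        return seed

    state = sanitize_seed(0)
    print(hex(xorshift64(state)))  # a nonzero stream instead of all zeros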