Even if every processing stage adds one least-significant bit of error (and since rounding tends not to be malicious, you may well do better than that), you'd need about 128 poorly implemented stages (2^7, the gap between a float's 23-bit mantissa and 16 bits) before a 32-bit float is reduced to mere 16-bit integer precision, and in practice likely more.
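To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. The function names are made up for illustration. The "random sign" model assumes independent errors whose accumulated size grows like the square root of the number of stages, which is a simplification rather than a measurement of any real signal chain.

```python
def worst_case_stages(float_bits: int, target_bits: int) -> int:
    """Stages needed if every stage adds one LSB of error in the same direction."""
    return 2 ** (float_bits - target_bits)


def random_sign_stages(float_bits: int, target_bits: int) -> int:
    """Stages needed if each one-LSB error has a random sign.

    Assumption: independent errors, so the typical accumulated error
    grows like sqrt(n) and reaching the same size takes n**2 stages.
    """
    return worst_case_stages(float_bits, target_bits) ** 2


# 23 stored mantissa bits vs. 16 bits: the figure used above.
print(worst_case_stages(23, 16))   # 128
# Counting the implicit leading bit as well (24 significant bits).
print(worst_case_stages(24, 16))   # 256
# Non-malicious rounding under the random-sign assumption.
print(random_sign_stages(23, 16))  # 16384
```

Under the random-sign assumption, the number of stages goes from the hundreds to the tens of thousands, which is why the worst-case figure is so pessimistic.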

When it comes to clipping or loss of data at the low end, 32-bit floats have an 8-bit exponent (254 usable values). Counting about 6.02 dB per doubling of amplitude, that means the loudest full-precision unclipped signal is roughly 1,529 dB (!) louder than the softest unquantized signal. Even with mediocre centering, that's more than enough.
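The dynamic-range figure can be checked with a few lines of arithmetic. This is a sketch: the constants are the standard IEEE 754 single-precision limits written out by hand, and `ratio_db` is a helper name invented here. It uses the amplitude convention (20·log10), under which each doubling is about 6.02 dB; counting the same ratio as power (10·log10) halves the result to about 765 dB.

```python
import math

# IEEE 754 single-precision limits, written out explicitly.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127   # largest finite value
FLOAT32_MIN_NORMAL = 2.0**-126          # smallest value with full precision


def ratio_db(ratio: float, amplitude: bool = True) -> float:
    """Express a ratio in decibels (20*log10 for amplitude, 10*log10 for power)."""
    return (20 if amplitude else 10) * math.log10(ratio)


ratio = FLOAT32_MAX / FLOAT32_MIN_NORMAL      # just under 2**254
range_db = ratio_db(ratio)                    # amplitude convention
range_db_power = ratio_db(ratio, amplitude=False)

print(round(range_db))           # 1529
print(round(range_db_power))     # 765
print(round(ratio_db(2**16)))    # 96, the familiar 16-bit integer figure
```

Either way, the range dwarfs the roughly 96 dB of 16-bit integer audio.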

I don't think 64-bit audio is likely to be noticeable, even for processing purposes, outside of very specialized niches.