I have spent way too much of my time as a developer over the years hacking on so...

james_s_tayler · on March 16, 2020

It's a double-edged sword. A lot of system failures happen because every single thread winds up accidentally blocking on something that is never going to return for one reason or another. In such cases a timeout is able to unblock the threads.

Maybe the developers collected lots of data and said p99 latency is X, so let's set the timeout to X+alpha. If something takes drastically longer than most requests, something is probably going wrong. Maybe your latency was in the 99.99999th percentile and they timed you out.

Or yeah the devs just guessed and guessed badly and the timeouts cause more problems than they solve.
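
To make the "p99 plus alpha" idea concrete, here is a minimal sketch of how such a timeout could be derived from measured latencies. The function names, the nearest-rank percentile method, and the sample numbers are all assumptions for illustration, not anything a particular team is known to use.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def suggest_timeout(latencies, pct=99, alpha=50):
    """Hypothetical helper: timeout = chosen percentile latency + safety margin.

    `latencies` and `alpha` must use the same unit (milliseconds here).
    """
    return percentile(latencies, pct) + alpha


# Made-up measurements: 100 requests taking 1..100 ms.
measured_ms = list(range(1, 101))
timeout_ms = suggest_timeout(measured_ms, pct=99, alpha=50)
```

With this made-up data, the slowest request in a hundred would still be cut off, which is exactly the trade-off described above: a legitimate but unusually slow request looks the same as a broken one.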

vthriller · on March 16, 2020

Consider load balancers that redirect requests to any other server if one of them is taking too long to respond (no matter whether it's slow or gone altogether). Eliminating timeouts there and pushing the problem waaaay back onto the client side, with the user constantly hovering over the "cancel" or "retry" button, would actually lead to a worse user experience. Imagine pressing "refresh" in your browser every now and then on random internet pages just because they occasionally take soooo much more time to load! (Also, unlike in typical client-side software, timeouts on the server side are often configurable, so that software can better fit a variety of environments and/or requirements.)
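
As a rough sketch of the load balancer behaviour described here, the snippet below tries each backend in turn and moves on when one does not answer within a configurable timeout. It is only an illustration: the backends are plain Python callables standing in for servers, and `fetch_with_failover` is a hypothetical name, not the API of any real load balancer.

```python
import concurrent.futures as cf
import time


def fetch_with_failover(backends, request, timeout_s):
    """Try each backend in order; skip any that is too slow or raises."""
    if not backends:
        raise ValueError("no backends configured")
    pool = cf.ThreadPoolExecutor(max_workers=len(backends))
    try:
        for backend in backends:
            future = pool.submit(backend, request)
            try:
                return future.result(timeout=timeout_s)
            except cf.TimeoutError:
                continue  # too slow (or hung): redirect to the next backend
            except Exception:
                continue  # backend failed outright: also try the next one
        raise RuntimeError("all backends failed or timed out")
    finally:
        # Do not wait for abandoned slow calls; their threads finish on their own.
        pool.shutdown(wait=False)


def slow_backend(request):
    time.sleep(0.5)  # simulates an overloaded server
    return "slow:" + request


def fast_backend(request):
    return "fast:" + request


reply = fetch_with_failover([slow_backend, fast_backend], "req", timeout_s=0.05)
```

Here the caller gets the fast backend's reply after roughly the timeout instead of waiting on the slow one. Note that the abandoned call keeps running in its thread, so a real system would also want a way to cancel or bound that work.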

nradov · on March 16, 2020

Sometimes there is no interactive user as such who is capable of manually cancelling an operation. Back-end data processing and real-time systems both really need to have reasonable timeouts set on every blocking operation.
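
For a sense of what "a timeout on every blocking operation" can look like in a batch worker, here is a small sketch using Python's standard `queue.Queue`, whose `get` accepts a `timeout` and raises `queue.Empty` when it expires. The helper name `take_or_give_up` and the choice to return `None` are assumptions made for the example.

```python
import queue


def take_or_give_up(work_queue, timeout_s):
    """Wait for the next job, but never longer than timeout_s seconds.

    Returns None on timeout so the caller can log, retry, or shut down
    instead of blocking forever with nobody around to press cancel.
    """
    try:
        return work_queue.get(timeout=timeout_s)
    except queue.Empty:
        return None


jobs = queue.Queue()
jobs.put("job-1")

first = take_or_give_up(jobs, timeout_s=0.01)   # a job is waiting, returned at once
second = take_or_give_up(jobs, timeout_s=0.01)  # queue is empty, gives up after 10 ms
```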

gpderetta · on March 16, 2020

I do agree that timeouts are often used as a crutch for bad code, but the use case discussed in the article (preventing a potentially malicious bad client from hogging resources) is legitimate.