
I have spent way too much of my time as a developer over the years hacking on software to remove ill-conceived timeouts where some developer said--sometimes not even in one place but for some insane reason at every single level of the entire codebase--"this operation couldn't possibly take longer than 10 seconds"... and then it does, because my video is longer than they expected or I have more files in a single directory than they expected or my network is slower than they expected (whether because I have more packet loss or more competition or more indirection) or my filesystem had more errors to fix during fsck than they expected or I had activated more plugins than they expected or I had installed more fonts than they expected or I had more email that matches my search query than they expected or more people tried to follow me than they expected (for months back when Instagram was new I seriously couldn't open the Instagram app because it usually took more than the magic 10 seconds--an arbitrary timeout from Apple--to load my pending follower request list for my private account; the information would get increasingly cached every load so if I ran the app over and over again eventually it would work) or my DNS configuration was more broken than they expected or I had a more difficult-to-use keyboard than they expected or I had more layers of security on my credit card than they expected or any number of things that they didn't expect (can you appreciate how increasingly specific these examples started becoming, as I started having horrifying flashbacks of timeouts I had to remove because some idiot developer decided they could predict how long something could take and then aborted the operation, which seems like the worst possible way of handling that situation? :/).
Providing the user a way to cancel something is great, but programming environments should make timeouts maximally difficult to implement, preferably so complex that no one ever implements them at all (and yes, I appreciate that this is a pipe dream, as a powerful abstraction tends to make timeouts sadly so easy that people strew them around liberally... but certainly no timeout arguments should be provided on any APIs lest someone arbitrarily guess "10 seconds"): if the user, all the way up at the top of the stack, wants to give up, they can press a cancel button. And to be clear: I don't think timeouts are something that mostly just amateur programmers tend to get wrong and which can be used effectively by experts (as is the case with goto statements or random access memory or multiple inheritance)... I have never seen a timeout--a true "timeout" mind you, as opposed to an idempotent retry (where the first operation is allowed to still be in flight and the second will, without restarting, merge with the first attempt as opposed to causing a stampede; these make sense when you have lossy networks, for example)--in a piece of software that was a feature instead of a bug, where the software would not have been trivially improved by nothing more than simply deleting the timeout, and I would almost go so far as to say they are theoretically unsound.
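For anyone wondering what that kind of idempotent retry could look like, here is a minimal sketch in Python. The `SingleFlight` class and its `do` method are hypothetical names made up for illustration, not any library's API: a second call for the same key while the first is still in flight waits on the first attempt's result instead of starting the work again.

```python
import threading
from concurrent.futures import Future


class SingleFlight:
    """Hypothetical helper: coalesce concurrent calls for one key onto one attempt."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> Future for the attempt currently running

    def do(self, key, fn):
        with self._lock:
            fut = self._in_flight.get(key)
            leader = fut is None
            if leader:
                # Nobody is working on this key yet, so this caller runs fn.
                fut = Future()
                self._in_flight[key] = fut
        if leader:
            try:
                fut.set_result(fn())
            except Exception as exc:
                fut.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        # Leaders return their own result; retries block here and share it.
        return fut.result()
```

Note that nothing here ever aborts the first attempt: a "retry" only attaches to it, so a slow operation still gets to finish and no stampede occurs.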



It's a double-edged sword. A lot of system failures happen because every single thread winds up accidentally blocking on something that is never going to return for one reason or another. In such cases a timeout is able to unblock the threads.

Maybe the developers collected lots of data and said p99 latency is X, so let's set the timeout to X+alpha. If something takes drastically longer than most requests, it's probably something going wrong. Maybe your latency was the 99.99999th percentile and they timed you out.

Or yeah the devs just guessed and guessed badly and the timeouts cause more problems than they solve.
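A rough sketch of the "p99 plus alpha" idea, in Python. The function names, the nearest-rank percentile method, and the sample numbers are all assumptions made up for illustration; nothing here comes from a real system.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def timeout_from_latencies(samples, pct=99.0, alpha=0.5):
    """Timeout = chosen percentile of observed latency plus a fixed margin (alpha, in seconds)."""
    return percentile(samples, pct) + alpha


# Made-up latencies in seconds: mostly fast, one slowish, one legitimately huge request.
latencies = [0.1] * 98 + [0.4, 12.0]
timeout = timeout_from_latencies(latencies)
slowest = max(latencies)
```

With these made-up numbers the timeout lands just under a second, so the 12-second request, which was slow but not broken, is exactly the one that gets cut off.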


Consider load balancers that redirect requests to any other server if one of them is taking too long to respond (no matter whether it's slow or gone altogether). Eliminating timeouts there and pushing the problem waaaay back onto the client side, with the user constantly hovering over a "cancel" or "retry" button, would actually lead to worse user experience. Imagine pressing "refresh" in your browser every now and then on random internet pages just because they occasionally take soooo much more time to load! (Also, unlike in typical client-side software, timeouts on the server side are often configurable, so that software can better fit in a variety of environments and/or requirements.)
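Here is a minimal sketch of that load-balancer behavior in Python, using only `concurrent.futures` from the standard library. The function name `fetch_with_failover` and the idea of passing backends as plain callables are assumptions for illustration. This variant does not abort the slow backend; it only stops waiting exclusively on it and takes whichever answer arrives first.

```python
from concurrent.futures import FIRST_COMPLETED, wait


def fetch_with_failover(backends, request, per_backend_wait, pool):
    """Try backends in order; if one is slow, also ask the next one.

    backends: callables taking the request (hypothetical stand-ins for servers).
    per_backend_wait: seconds to wait before involving the next backend.
    pool: a concurrent.futures executor used to run the calls.
    """
    pending = set()
    for backend in backends:
        pending.add(pool.submit(backend, request))
        done, pending = wait(pending, timeout=per_backend_wait,
                             return_when=FIRST_COMPLETED)
        for fut in done:
            if fut.exception() is None:
                return fut.result()
    # Every backend has been asked; keep waiting on whatever is still in flight.
    while pending:
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        for fut in done:
            if fut.exception() is None:
                return fut.result()
    raise RuntimeError("all backends failed")
```

The wait value here only decides when to involve another server, so a request that is merely slow still completes if no other backend beats it.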


Sometimes there is no interactive user as such who is capable of manually cancelling an operation. Back-end data processing and real-time systems both really need to have reasonable timeouts set on every blocking operation.
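As a small sketch of the no-user case in Python: a batch consumer reading from a `queue.Queue` with the standard `get(timeout=...)` call, which raises `queue.Empty` when nothing arrives in time. The `drain` function, the `None` end-of-stream sentinel, and the status strings are assumptions made up for illustration.

```python
import queue


def drain(source, idle_timeout):
    """Consume items until a None sentinel; report a stall if the producer goes quiet.

    Returns (items, status) where status is "finished" or "stalled".
    """
    items = []
    while True:
        try:
            # Blocks for at most idle_timeout seconds instead of forever.
            item = source.get(timeout=idle_timeout)
        except queue.Empty:
            return items, "stalled"
        if item is None:
            return items, "finished"
        items.append(item)
```

With nobody around to press cancel, the "stalled" status is what lets the job log the problem and move on rather than hang indefinitely.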


I do agree that timeouts are often used as a crutch for bad code, but the use case discussed in the article (preventing a potentially malicious bad client from hogging resources) is legitimate.



