
Arguably, this logic should live in another place that monitors the service.

Especially since a service startup failure is usually not something that fixes itself, unlike a network connection (where exponential backoff is (in)famous). A bad config file or a failed disk won’t recover on its own in 10 minutes, so I believe systemd’s default makes sense here.




systemd is a service monitor. It wouldn't be nearly as useful if it wasn't!


From the server's perspective, external problems typically do get fixed on their own. It is nice when resolving the primary issue is sufficient to fix the entire system, instead of needing to resolve the primary issue and then fix all the secondary and tertiary issues.

At my work, we have a simple philosophy for this. The tester is allowed to (on the test system) toggle servers' power, move network cables around, input bad configuration, etc., in any permutation he wants. So long as everything is set up correctly at the end of the exercise, the system should function nominally (potentially after a reasonable delay).

There should, of course, be a system-level dashboard that notifies someone there is a problem, but that is unrelated to the server-internal retry logic.
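
For what it's worth, the kind of server-internal retry logic I mean could be sketched roughly like this. This is only an illustration, not what we actually run: the names `backoff_delays` and `retry_forever` and the default values (1 s base, 60 s cap) are made up for the example. The point is that the retries never give up, so once the external problem is fixed the service comes back after at most one capped delay.

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield retry delays forever: base, base*factor, ..., never above `cap` seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def retry_forever(start, sleep=time.sleep):
    """Call `start` until it succeeds, sleeping between failed attempts.

    `start` is a hypothetical callable that raises on failure.
    Returns (result, number_of_attempts).
    """
    attempts = 0
    for delay in backoff_delays():
        attempts += 1
        try:
            return start(), attempts
        except Exception:
            # Keep trying: the external problem may be resolved later.
            sleep(delay)
```

Because the delay is capped, the "reasonable delay" after everything is set up correctly again is bounded by `cap`, no matter how long the outage lasted.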



