Hacker News new | past | comments | ask | show | jobs | submit login

The "let it crash" philosophy assumes that there is some external system monitoring & restarting the program that crashes. They mention this explicitly in the article, but it's worth repeating: you need this external system anyway. Your program may stop executing for all sorts of reasons other than a bug in your program, from bugs in your dependencies to uncaught errors to infinite loops to cosmic rays to someone tripping over the power cord to an earthquake destroying the entire U.S. west coast. Your distributed system needs to handle these as operational errors, and in extreme cases you might not even have power available for 1000 miles; there is no possible way that a single process could recover from that.

They also recommend configuring Node to dump core on programmer error, which includes (literally) all of the diagnostic information available on the server.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: