
> People dramatically overestimate the correctness of their [C/C++] code and underestimate the potential severity of failures to meet that expectation

Really depends on the field. More than once in my career I've heard that it's better to reboot every night than to spend even an afternoon of engineering fixing memory leaks.




That's good when it's just about memory leaks. Your missile guidance code can have leaks as long as the missile completes the mission, since it will all be... "garbage collected" anyway.

It's not great when you have problems like use after free and pointers going to where they don't belong. Those can cause security vulnerabilities that rebooting won't fix.


Then they make a version that flies farther and it sometimes randomly fails to detonate.


So you write a note mentioning there's a leak, and debug it if there's a need for it.

Finding a bug doesn't mean you need to fix it, or fix it right away. Using Rust doesn't preclude resource leaks; I'm pretty sure I've managed to run into resource leaks in every language I've used in production, managed or not.

Sometimes limiting the lifetime of the thing that holds resources and letting process (or system) death clean it up works great; sometimes that's terrible. It depends on the cost of death/restart, the lifetime between deaths, the human intervention required, the accuracy of your lifetime estimate, and probably some other things. I don't worry too much about leaks that require a restart every year or so; nor would I worry about a leak in a missile that only causes issues beyond the achievable maximum flight time.


MongoDB's BSON library calls abort() on malloc failure because they can't be bothered to handle OOM gracefully.


Linux overcommits memory by default, so it will lie to your application about memory availability. When I tested several common open-source applications in a memory-constrained environment, none of them except the Java VM handled OOM conditions gracefully. I came to the conclusion that it's not worth the programming effort unless your application specifically expects this condition to occur.


I work on systems that don't run on Linux, and those systems also have reasonable business requirements for graceful failure on OOM. This is unhelpful.


That seems completely reasonable. The only interesting case is "there was far too little memory, so we gave up early": you need 14 GB, there is 186 MB available, give up. I probably don't want my photo viewer to crash because I tried to open a JPEG that wouldn't fit in RAM; better that it just says the file is too big and lets me pick a different one.

Whenever people imagine they're going to recover from OOM in low-level components, they're tripping. Their recovery strategy is going to blow up too, and then they'll be in a huge mess - just abort when this happens.


Postgres is designed to recover on OOM in most cases.


That's nice, how well does it work in practice?


It works. Bugs have been found, and some more bugs probably exist, but I think it meets a fairly high bar of quality.


Firefox and Chrome (and thus Edge, Brave, OperaGX, etc etc) do the same for many allocations - it's safer to crash than to end up in an obscure failure path that never had its error handling exercised and may accidentally be security sensitive.


What else would you expect them to do?



