Python is also reference counted, and that does the bulk of the work; the GC only exists to collect reference cycles that the refcounting misses. Instagram has the process that spins up the Python workers kill and replace any that eventually exceed the allowed memory threshold.
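Roughly, the supervisor side of that looks something like the sketch below. This is not Instagram's actual code; the 512 MB threshold, the poll interval, and the serve_requests stub are all made up for illustration, and the RSS check is Linux-only.

```python
import os
import signal
import time

MEMORY_LIMIT_BYTES = 512 * 1024 * 1024   # made-up threshold
POLL_INTERVAL_SECONDS = 5                 # made-up poll interval
workers = set()                           # pids of live workers

def rss_bytes(pid):
    # Resident set size from /proc (Linux only): field 2 of statm is
    # resident pages, so multiply by the page size.
    with open(f"/proc/{pid}/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")

def serve_requests():
    # Stand-in for the real worker loop (WSGI server, queue consumer, ...).
    time.sleep(3600)

def spawn_worker():
    pid = os.fork()
    if pid == 0:
        serve_requests()
        os._exit(0)
    workers.add(pid)

for _ in range(4):
    spawn_worker()

while True:
    for pid in list(workers):
        if rss_bytes(pid) > MEMORY_LIMIT_BYTES:
            os.kill(pid, signal.SIGTERM)   # ask it to finish up and exit
            os.waitpid(pid, 0)
            workers.discard(pid)
            spawn_worker()                 # replace it with a fresh worker
    time.sleep(POLL_INTERVAL_SECONDS)
```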
Isn't that costly? You're wiping everything, including memory that could have been reused and the executable itself that was loaded into the process, and then running all of the process's initialization code again.
Question from someone who doesn't do this sort of work... is this typical? That memory balloons to the point where you have to periodically kill things sounds like there's fuzzy logic somewhere to me.
I get that software is complex and people have tight deadlines...
This is pretty common for a forked server model. You already need to handle the case where a worker process dies, so it's simple to also occasionally kill one on purpose. Memory thresholding is nice, but you also have things like MaxRequestsPerChild from Apache. Restarting the worker after, say, 100,000 requests is cheaper than profiling and tracking down slow memory leaks. OTOH, when you get down to MaxRequestsPerChild 10, you have clear problems that should be easy to track down. You can also do CPU usage thresholding to limit the damage from infinite loops that are hard to reproduce.
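If you're on gunicorn rather than Apache, the same knob is max_requests (with max_requests_jitter to stagger the restarts so all workers don't recycle at once). The numbers below are just illustrative:

```python
# gunicorn.conf.py: recycle each worker after roughly 100,000 requests
workers = 4
max_requests = 100_000
max_requests_jitter = 5_000   # randomize per worker so restarts are staggered
```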
Yes, this is typical. I have never worked somewhere that wrote web applications in Python or Ruby and didn't have their process supervisor kill processes after they exceeded a memory threshold.
Killing a process is a lot more efficient than GC ever is. If your servers are stateless, it makes far, far more sense to enable killing them safely than engineer the world's most efficient GC.
The system we use has a "soft" limit; the worker isn't killed outright when it reaches the limit. Instead, it checks after finishing each request and kills itself if the limit has been exceeded, leading the manager to spawn a fresh worker.
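The worker-side check is roughly the sketch below, not our actual code. The limit is a made-up figure, and do_work / send_response are hypothetical request/response helpers standing in for whatever the framework provides.

```python
import os
import resource

SOFT_LIMIT_BYTES = 512 * 1024 * 1024   # made-up figure

def rss_exceeded():
    # ru_maxrss is the peak resident set size (kilobytes on Linux,
    # bytes on macOS); a high-water mark is fine for a one-way
    # "have we ever crossed the line" check.
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kb * 1024 > SOFT_LIMIT_BYTES

def handle_request(request):
    send_response(do_work(request))   # hypothetical helpers
    if rss_exceeded():
        # Exit only after the response is done; the manager notices the
        # worker is gone and spawns a fresh one in its place.
        raise SystemExit(0)
```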