Python is also reference counted, and that does the bulk of the work; the GC only exists to collect reference cycles that the refcounting misses. Instagram has the process that spins up the Python workers kill and replace any that eventually exceed the allowed memory threshold.
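Roughly, the supervisor side of that looks something like the sketch below. This is not Instagram's actual code; the 512 MB threshold, the poll interval, and the serve_requests stub are all made up for illustration, and the RSS check is Linux-only.

```python
import os
import signal
import time

MEMORY_LIMIT_BYTES = 512 * 1024 * 1024   # made-up threshold
POLL_INTERVAL_SECONDS = 5                 # made-up poll interval
workers = set()                           # pids of live workers

def rss_bytes(pid):
    # Resident set size from /proc (Linux only): field 2 of statm is
    # resident pages, so multiply by the page size.
    with open(f"/proc/{pid}/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")

def serve_requests():
    # Stand-in for the real worker loop (WSGI server, queue consumer, ...).
    time.sleep(3600)

def spawn_worker():
    pid = os.fork()
    if pid == 0:
        serve_requests()
        os._exit(0)
    workers.add(pid)

for _ in range(4):
    spawn_worker()

while True:
    for pid in list(workers):
        if rss_bytes(pid) > MEMORY_LIMIT_BYTES:
            os.kill(pid, signal.SIGTERM)   # ask it to finish up and exit
            os.waitpid(pid, 0)
            workers.discard(pid)
            spawn_worker()                 # replace it with a fresh worker
    time.sleep(POLL_INTERVAL_SECONDS)
```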
Isn't that costly? You're wiping everything, including memory that could have been reused and the executable itself that was loaded into the process, and then running all of the process's initialization code again.
Question from someone who doesn't do this sort of work... is this typical? That memory balloons to the point where you have to periodically kill things sounds like there's fuzzy logic somewhere to me.
I get that software is complex and people have tight deadlines...
This is pretty common for a forked server model. You already need to handle the case where a worker process dies, so it's simple to also occasionally kill one on purpose. Memory thresholding is nice, but you also have things like MaxRequestsPerChild from Apache. Restarting the worker after, say, 100,000 requests is cheaper than profiling and tracking down slow memory leaks. OTOH, when you get down to MaxRequestsPerChild 10, you have clear problems that should be easy to track down. You can also do CPU usage thresholding to limit the damage from infinite loops that are hard to reproduce.
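If you're on gunicorn rather than Apache, the same knob is max_requests (with max_requests_jitter to stagger the restarts so all workers don't recycle at once). The numbers below are just illustrative:

```python
# gunicorn.conf.py: recycle each worker after roughly 100,000 requests
workers = 4
max_requests = 100_000
max_requests_jitter = 5_000   # randomize per worker so restarts are staggered
```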
Yes, this is typical. I have never worked somewhere that wrote web applications in Python or Ruby and didn't have their process supervisor kill processes after they exceeded a memory threshold.
Killing a process is a lot more efficient than GC ever is. If your servers are stateless, it makes far, far more sense to enable killing them safely than engineer the world's most efficient GC.
The system we use has a "soft" limit; the worker isn't killed outright when it reaches the limit. Instead, it checks after finishing each request and kills itself if the limit has been exceeded, leading the manager to spawn a fresh worker.
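The worker-side check is roughly the sketch below, not our actual code. The limit is a made-up figure, and do_work / send_response are hypothetical request/response helpers standing in for whatever the framework provides.

```python
import os
import resource

SOFT_LIMIT_BYTES = 512 * 1024 * 1024   # made-up figure

def rss_exceeded():
    # ru_maxrss is the peak resident set size (kilobytes on Linux,
    # bytes on macOS); a high-water mark is fine for a one-way
    # "have we ever crossed the line" check.
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kb * 1024 > SOFT_LIMIT_BYTES

def handle_request(request):
    send_response(do_work(request))   # hypothetical helpers
    if rss_exceeded():
        # Exit only after the response is done; the manager notices the
        # worker is gone and spawns a fresh one in its place.
        raise SystemExit(0)
```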