
Isn't that already the case?

I thought that when you use multiprocessing in Python, a new process gets forked, and while each new process has separate virtual memory, that virtual memory points to the same physical location until the process tries to write to it (i.e. copy-on-write)?
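For anyone who wants to see this in action, here is a minimal sketch, assuming a Unix system where `os.fork()` is available (as far as I know it is what multiprocessing's "fork" start method uses underneath). The helper name `read_in_child` is made up for illustration. The child reads a list built by the parent without any pickling or explicit copying:

```python
import os


def read_in_child(data):
    """Fork, sum `data` in the child, and send the result back over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: `data` is visible here because the child starts with the
        # parent's address space, mapped copy-on-write.
        os.close(read_fd)
        os.write(write_fd, str(sum(data)).encode())
        os._exit(0)
    # Parent: close our copy of the write end so read() sees EOF.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        result = int(reader.read())
    os.waitpid(pid, 0)
    return result


if __name__ == "__main__":
    big = list(range(1_000_000))
    print(read_in_child(big))
```

Note that this only shows the child can see pre-fork data. It does not measure how many pages stay shared afterwards.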




That's true, but running VMs mutate their heaps, both managed and malloc'ed. CoW also only works from parent to child, so you can't share mutable memory this way.

Empty space in internal pages gets used for allocating new objects, reference counts get updated, GC flags get flipped, etc., and it takes just one write in each 4 KB page to trigger a copy of the whole page.
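To illustrate the reference count point, a small sketch, assuming CPython (the function name `refcount_delta` is made up). Merely holding another reference to an object changes its stored reference count, and that is a write to the memory the object lives in:

```python
import sys


def refcount_delta():
    """Return how much an object's refcount changes when one more reference is held."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]  # no "mutation" from the program's point of view
    after = sys.getrefcount(obj)
    del holder
    return after - before


if __name__ == "__main__":
    print(refcount_delta())
```

In a forked child, that same counter update lands in a page inherited from the parent, which is enough to make the kernel copy the page.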

It doesn't take long before a busy web worker or similar causes a huge chunk of the memory to be copied into the child.

There are definitely ways to make it much more effective like this work by Instagram that went into Python 3.7: https://instagram-engineering.com/copy-on-write-friendly-pyt...
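As far as I know, the API that came out of that work is `gc.freeze()`, added in Python 3.7. Here is a rough sketch of the pattern the `gc` module docs suggest for forking servers, assuming a Unix system; `fork_frozen` and `child_main` are hypothetical names, not part of any library:

```python
import gc
import os


def fork_frozen(child_main):
    """Freeze the GC heap, fork, run child_main() in the child.

    Returns the child's exit code (0 on success, 1 if child_main raised).
    """
    gc.disable()  # parent: avoid collections that free memory and leave holes
    gc.freeze()   # move all tracked objects into the permanent generation
    pid = os.fork()
    if pid == 0:
        gc.enable()  # child: collections resume but skip the frozen objects
        try:
            child_main()
            code = 0
        except Exception:
            code = 1
        os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(fork_frozen(lambda: None), gc.get_freeze_count())
```

This only keeps the collector from touching pre-fork objects. Reference count updates still write to those pages, so it reduces copying rather than eliminating it.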


Yes, the problem is sharing data between parent and child after the parent process has been forked.


Yes, sharing pre-fork data is as old as fork().

Sharing post-fork data is where it gets interesting.


If you have 4 cores, you may want to spawn 4 children, then share stuff between them. Not just top-down.

E.g.: live settings, cached values, white/black lists, etc.
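One way to get that today is an explicit shared-memory object rather than CoW. A minimal sketch, assuming the "fork" start method is available and using a plain counter as a stand-in for a live setting (`bump` and `run_workers` are made-up names):

```python
import multiprocessing as mp


def bump(counter, n):
    """Increment the shared counter n times, holding its lock for each update."""
    for _ in range(n):
        with counter.get_lock():
            counter.value += 1


def run_workers(workers=4, n=1000):
    """Start `workers` sibling processes that all mutate the same shared integer."""
    ctx = mp.get_context("fork")
    counter = ctx.Value("i", 0)  # lives in shared memory, not in CoW pages
    procs = [ctx.Process(target=bump, args=(counter, n)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())
```

Every sibling sees the others' writes, but this only works for flat ctypes-style values. Sharing arbitrary Python objects (dicts, caches) this way is the hard part.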



