
Very interesting note about the reclaiming. Yet another warning about treating a NUMA system as if it were transparent.

NUMA can be a real pain. You can take a 40% hit on direct memory access, and far worse if you're modifying a cache line held by another processor. On one of our VoIP workloads, we saw a major (250%+) increase in performance, along with better CPU stability, after splitting a very thread-intensive process into multiple processes, each set with affinity to a particular core.
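A minimal sketch of the "one process per core" idea, assuming Linux and Python's `os.sched_setaffinity`. The helper names `plan_worker_cpus` and `pin_current_process` are made up for illustration and are not from the workload described above. This only handles CPU placement; it does not set any NUMA memory policy.

```python
import os


def plan_worker_cpus(n_workers, cpus):
    """Assign each worker index a CPU id, round-robin over the given CPUs."""
    cpus = sorted(cpus)
    return {worker: cpus[worker % len(cpus)] for worker in range(n_workers)}


def pin_current_process(cpu):
    """Restrict the calling process to a single CPU; return the resulting set.

    os.sched_setaffinity is Linux-only, so on other platforms this returns
    None and leaves scheduling untouched.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Hypothetical setup: four workers spread over whatever CPUs we may use.
    available = os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else {0}
    plan = plan_worker_cpus(4, available)
    print("worker -> CPU plan:", plan)
```

Each worker process would call `pin_current_process(plan[i])` once at startup, before it allocates its working memory.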

OSes try to help you, but it seems like they're primarily concerned with multiple processes, not huge processes like databases. Such processes should become NUMA aware and handle things themselves for best performance.

It might even make sense to ask if you can split the machine on NUMA boundaries and just act like they're separate systems. RAM's getting very cheap, and RAM/core is going up faster than CPU power is (it seems to me, anyways).

Also, is there a reason not to use large pages directly for the mmap'd sets if you know you're going to have them hot at all times? (I assume they read the entire file on start?)
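For reference, a rough sketch of asking for larger pages on a mapping, assuming Linux with transparent huge pages and Python 3.8+ (`mmap.madvise`, `mmap.MADV_HUGEPAGE`). Note the assumptions: this uses an anonymous mapping rather than the file-backed mapping in question, the 2 MiB size is the common x86-64 value, and `map_with_huge_page_hint` is a made-up name. The kernel is free to ignore the hint.

```python
import mmap

HUGE_PAGE = 2 * 1024 * 1024  # assumed huge page size: 2 MiB (common on x86-64)


def map_with_huge_page_hint(size):
    """Create an anonymous mapping and hint that huge pages are wanted.

    Returns (mapping, hinted). hinted is False when the hint is not
    available on this platform or the kernel rejects it.
    """
    # Round the length up to a whole number of huge pages.
    length = -(-size // HUGE_PAGE) * HUGE_PAGE
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    hinted = False
    if hasattr(region, "madvise") and hasattr(mmap, "MADV_HUGEPAGE"):
        try:
            region.madvise(mmap.MADV_HUGEPAGE)  # advisory only
            hinted = True
        except OSError:
            pass  # e.g. transparent huge pages not enabled in this kernel
    return region, hinted


if __name__ == "__main__":
    region, hinted = map_with_huge_page_hint(3 * 1024 * 1024)
    print(f"mapped {len(region)} bytes, huge page hint accepted: {hinted}")
    region.close()
```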




Hi, post author here.

> Also, is there a reason not to use large pages directly for the mmap'd sets if you know you're going to have them hot at all times? (I assume they read the entire file on start?)

We could use large pages directly. But, as I mentioned in the article, the performance gains would be negligible compared to the gains that come from having things in memory in the first place. These are not very large memory systems, and the page table / TLB miss overhead doesn't seem to be biting us. We are just following the mantra 'premature optimization is the root of all evil' :)


In my experience, most people don't know they have TLB problems because, effectively, it's always bad.

It's only when you start getting to the metal to see what your hardware is actually capable of that the TLB stands out as a glaring source of inefficiency.

Put another way: yeah, the TLB is making your app slow, but it's doing so always, so you don't notice. Instead, you mistakenly think your hardware is just slower than it really is.


One correction: in the Linux community, they are generally referred to as huge pages.


I guess my Windows bias is showing through. Let's split it and call 2MB pages large, and 1GB huge. (Yeah, I know 1GB pages only have real hardware support in really recent processors.)
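To put numbers on the page sizes, here is a small arithmetic sketch counting how many pages it takes to map a working set at each size. The 16 GiB working set is a made-up figure, not one from the article, and `pages_needed` is just an illustrative helper.

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3

# Page sizes discussed in this thread, plus the ordinary 4 KiB page.
PAGE_SIZES = {"4 KiB": 4 * KIB, "2 MiB": 2 * MIB, "1 GiB": GIB}


def pages_needed(working_set_bytes):
    """Return how many pages of each size are needed to map the working set."""
    # -(-a // b) is ceiling division for positive integers.
    return {name: -(-working_set_bytes // size) for name, size in PAGE_SIZES.items()}


if __name__ == "__main__":
    # Hypothetical 16 GiB working set.
    for name, count in pages_needed(16 * GIB).items():
        print(f"{name} pages: {count}")
```

For 16 GiB that comes to 4,194,304 pages at 4 KiB, 8,192 at 2 MiB, and 16 at 1 GiB, which is why fewer, larger pages put so much less pressure on the TLB.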


Touché, you get an upvote from me :)



