
This is nifty, but not really congruent with my understanding of the "noisy neighbor" phenomenon. This work seems to reveal when there are more runnable threads than CPUs, leading to tasks waiting to run. The way I use "noisy neighbor", it means a concurrent task that thrashes some shared microarchitectural resource, forcing the victim process to use more CPU time. For example, a process on another CPU core in the same cache domain that evicts all of the shared cache lines, or fills up all the load/store slots, or that draws enough power to trigger a global clock-speed slowdown.
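For the cache flavor in particular, the neighbor barely needs CPU time to do damage. Here's a minimal sketch of such a cache-evicting loop; the 64 MiB buffer size and 64-byte line size are assumptions, tune them to the actual shared LLC on your part:

    # A sketch of the cache-evicting neighbor described above. Touching one
    # byte per cache line in a buffer larger than the shared L3 evicts the
    # victim's hot lines on every pass while doing little useful work itself.
    BUF_BYTES = 64 * 1024 * 1024   # assumed to exceed the shared L3
    LINE = 64                      # typical x86 cache-line size

    buf = bytearray(BUF_BYTES)

    while True:
        for i in range(0, BUF_BYTES, LINE):
            buf[i] = (buf[i] + 1) & 0xFF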



I thought so too - it seems like this is more about "who is being preempted by whom" (although maybe noisy neighbor in the sense of "hogging CPU time" does often imply "polluting hardware resources" to some degree, especially considering these machines probably have SMT)


Yep, in Netflix's case they pack bare-metal instances with a very large number of containers and oversubscribe them (similar to what Borg reports: hundreds of containers per machine is common), so there are always more runnable threads than CPUs and your runqueues fill up.
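If you want to see that wait on a Linux box, /proc/<pid>/schedstat exposes it directly. A minimal sketch, assuming a kernel with scheduler stats available:

    # Watch how long a task spends waiting for a CPU, using
    # /proc/<pid>/schedstat. The three fields are cumulative ns on-CPU,
    # cumulative ns waiting on a runqueue, and timeslices run.
    import os
    import time

    def runqueue_wait_ns(pid=None):
        pid = pid or os.getpid()
        with open(f"/proc/{pid}/schedstat") as fh:
            _on_cpu, waiting, _slices = (int(x) for x in fh.read().split())
        return waiting

    before = runqueue_wait_ns()
    time.sleep(1)
    waited = runqueue_wait_ns() - before
    print(f"waited {waited / 1e6:.2f} ms for a CPU over the last second")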


I'm curious about the capacity of the bare-metal hosts you operate such that you can oversubscribe CPU without exhausting memory first or forcing processes to swap (which leads to significantly worse latency than typical scheduling delays). My experience is that most machines end up being memory-bound, because modern software, especially the Java workloads I know Netflix runs a lot of, can be a profligate memory consumer.
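For what it's worth, one way to check whether memory binds first is the kernel's pressure-stall information. A minimal sketch, assuming Linux >= 4.20 with PSI enabled:

    # Compare memory vs CPU pressure via the kernel's PSI counters under
    # /proc/pressure. Each line looks like:
    #   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    def psi(resource):
        stats = {}
        with open(f"/proc/pressure/{resource}") as fh:
            for line in fh:
                kind, *fields = line.split()
                stats[kind] = {k: float(v) for k, v in (p.split("=") for p in fields)}
        return stats

    mem = psi("memory")["some"]["avg10"]
    cpu = psi("cpu")["some"]["avg10"]
    # Memory stalls rising faster than CPU stalls suggest the host is
    # memory-bound, so CPU oversubscription isn't what binds first.
    print(f"memory stall avg10: {mem}%  cpu stall avg10: {cpu}%")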


If you're min-maxing cost, it seems doable? 1TB+ RAM servers aren't that expensive.


Workloads tend to average out if you pack dozens or hundreds into one host. Some need more CPU and some need more memory, but some average ratio emerges... I like 4 GB/core.
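The ratio is just bin-packing arithmetic. A toy sketch with made-up container footprints, showing which resource caps a 4 GB/core host first:

    # All numbers here are invented for illustration.
    CORES = 128
    GB_PER_CORE = 4
    RAM_GB = CORES * GB_PER_CORE    # 512 GB host at 4 GB/core

    containers = 50 * [
        {"cpus": 0.5, "mem_gb": 1.0},   # hypothetical small service
        {"cpus": 2.0, "mem_gb": 12.0},  # hypothetical memory-heavy JVM
    ]

    cpu_demand = sum(c["cpus"] for c in containers)
    mem_demand = sum(c["mem_gb"] for c in containers)
    print(f"CPU demand: {cpu_demand:.0f}/{CORES} cores")
    print(f"RAM demand: {mem_demand:.0f}/{RAM_GB} GB")
    # Whichever fraction crosses 1.0 first is what caps the packing.

In this made-up mix, memory crosses its budget (650/512 GB) while CPU still fits (125/128 cores), which matches the grandparent's experience of hosts going memory-bound first.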



