
Isn't noisy neighbor less of a problem on AWS since Nitro? Where I work, we monitor for it with CPU steal metrics, and it's very rare to see it nowadays.
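
For anyone wondering what a CPU steal metric looks like in practice, here is a minimal sketch, assuming a Linux guest where the aggregate `cpu` line of `/proc/stat` carries the steal counter as its eighth value. The function names are made up for illustration, and this is not necessarily how any particular monitoring agent computes it.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before, after):
    """Percentage of CPU time stolen by the hypervisor between two samples."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    # Counter order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Only the first eight are summed, because guest time is already
    # included in user and nice.
    total = sum(a[:8]) - sum(b[:8])
    if total <= 0 or len(a) < 8 or len(b) < 8:
        return 0.0
    return 100.0 * (a[7] - b[7]) / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()


def sample_steal(interval_seconds=1.0):
    """Take two samples interval_seconds apart and return the steal percentage."""
    before = read_cpu_line()
    time.sleep(interval_seconds)
    return steal_percent(before, read_cpu_line())
```

Calling `sample_steal()` on a Linux VM gives the steal percentage over the last second. A value that stays near zero is what you would expect on a quiet host.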



In this case the noise is coming from inside the house, er, the VM so it's not Nitro's problem.


Yep. In Netflix's case, each Titus host (a bare-metal instance) can run hundreds of containers at any given time. One advantage of running a multi-tenant platform like this is that you get better observability on multi-tenancy issues, since you're doing the scheduling yourself and know who is colocated with whom. It's much harder to debug noisy-neighbor issues when they happen on the cloud provider side and your caches get thrashed by random other AWS customers.

One thing I was pitching internally when advocating for this platform is that once you have enough scale for the economics to make sense, you can reclaim some of AWS's margins instead of having your cold, tiny VMs subsidize other AWS customers' higher perf. If you run the multi-tenant platform yourself, you can oversubscribe every app in a way that makes sense for your business and trade latency or throughput of software for $ on a per-container basis, so you can make much more granular and optimal decisions globally. The alternative is having each team individually right-size its own app deployed on VMs while sharing CPU caches with randos.

I remember once at Netflix we investigated a weird latency issue on a random load balancer instance and got AWS involved. It turned out to be a noisy neighbor on the underlying VM, which gets chopped up into multiple customer-facing LB instances.


Aside: Is Titus still being developed?

GitHub repo says it was archived 2 years ago: https://github.com/Netflix/titus



