Hacker News new | past | comments | ask | show | jobs | submit login

> at Netflix type scale that nobody is sshing into individual instances to troubleshoot

Have you worked at Netflix?

I haven't, but I have worked with large scale operations, and I wouldn't hesitate to say that the ability to ssh (or other ways to run commands remotely, which are all either built on ssh or likely not as secure and well tested) is absolutely crucial to running at scale.

The more complex and non-heterogenous environments you have, the more likely you are to encounter strange flukes. Handshakes that only fail a fraction of a percent of all times and so on. Multiple products and providers interaction. Tools like tcpdump and eBPF becomes essential.

Why would you want to deploy on a mature operating system such as Linux and not use tools such as eBPF? I know the modern way is just to yolo it and restart stuff that crashes, but as a startup or small scale you have other things to worry about. When you are at scale you really want to understand your performance profile and iron out all the kinks.




Can also use stuff like Datadog NPM/APM that uses eBPF to pick up most of what you need. Its been a long time since I've needed anything else.


Yes, there are numerous other ways to run remote commands than ssh, all of them less secure. (Running commands via your monitoring system can even be a very handy back door in a pinch.)

The argument here was that remote commands was less useful at scale, not that ssh was a particularly bad way of implementing it. Which doesn't make sense. You tend to have more complex system interactions at scale, not less.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: