
This is only good advice if you have a really good handle on what "too painful" is. Interestingly, this isn't just a matter of operations either. On the software development side we have exactly the same problem. People are unwilling to invest in improving their build systems, test systems, deployment systems, etc, etc. You end up never hitting the sweet spot where you can rapidly advance your capability.

For example, if it takes 3 hours to deploy using a manual system that is prone to errors, then you will only deploy once a week. There is no incentive to improve the system, because you think you can only save 3 hours a week. But the impact is far greater than that. If you can only deploy once a week, then that deployment must be successful. It's really hard to recover from a mistake. So you need to take a far more rigorous approach to testing. It also means that you have to be far more conservative in what you do. You have to take on smaller pieces.

The result is that you have replaced your programmers' coding time with planning time. They have a very small amount of code they can deliver (to keep the risk low) and they can only do it once a week. And instead of writing code, they are sitting in meetings arguing about the best way to ensure that nothing gets broken.

If you can get your deployment completely automated with a good regression test suite and an ability to easily roll back mistakes -- then you can take on much more risk in development. Instead of playing a virtual game of Jenga with your servers and being very, very, very careful not to break things, your programmers are concentrating on writing code.
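
To make that concrete, here is a minimal sketch in Python of the "deploy, run the regression checks, roll back on failure" loop. Everything in it is hypothetical: `deploy_with_rollback`, the `activate` callback, and the check functions are names made up for illustration, not any particular tool's API. A real setup would plug in its own way of switching versions and its own test suite.

```python
from typing import Callable, List


def deploy_with_rollback(
    new_version: str,
    current_version: str,
    activate: Callable[[str], None],
    regression_checks: List[Callable[[], bool]],
) -> str:
    """Activate new_version, run the checks, and roll back if any check fails.

    Returns the version that is left live. All names here are illustrative
    assumptions, not a real deployment tool's interface.
    """
    # Switch traffic (or a symlink, or a container tag) to the new version.
    activate(new_version)

    try:
        # all() stops at the first failing check.
        healthy = all(check() for check in regression_checks)
    except Exception:
        # A check that crashes counts as a failed check.
        healthy = False

    if not healthy:
        # The cheap undo is what makes frequent deploys low-risk.
        activate(current_version)
        return current_version

    return new_version
```

The point is not the dozen lines of code. Once a failed deploy costs one automatic `activate(current_version)` instead of a scramble, a mistake stops being something you need a week of meetings to avoid.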

Of course it's not free. The programmers must spend quite a lot of their time building infrastructure. You often (almost always???) find organisations trying to control the amount of infrastructure work from the top down. The idea is to focus effort on revenue-generating activity. However, the decisions are usually made in the wrong place, because the best balance is often not visible to those who don't do the work.

This post is long enough, but I've also found that avoidance of automation in operations can cause serious, long-lasting damage to an organisation. A really good example is reporting -- which seems like a good candidate for staying manual as long as possible. However, as the organisation gets older and the IT systems get more complex, you get to the point where you need a good, flexible, and accurate reporting system, but the IT systems have absolutely no capability of delivering it. So if you never ask the question, "At what point will it start to be painful to add reporting capability to this system?", the chances are that you will completely miss it. And even if you wait until the last minute, the chances are that you will have no free capacity to add that functionality.



