The primary challenge was just reasoning about the templates and using Helm at scale, i.e., what exactly did we deploy on those hundreds of varying clusters?

Other issues included Tiller sometimes becoming unstable, version mismatches between the local Helm client and Tiller, and the lack of a clear, outage-free canary deployment. We even found cases where Helm would not clean up after itself during a deployment and retained previous config settings within k8s.