To echo this: we have run 10k+ containers in production for close to a year, and Docker crashes all the time, sometimes in ways that are undetectable by any reasonable health-polling mechanism and that mess up Kube as well.
Note these were not issues caused by using Docker poorly or outside mainline use cases. They are all issues with Docker's core networking or runtime stack that are acknowledged bugs.
For #22502, I'm looking to get in a ring-buffer implementation for log drivers: https://github.com/docker/docker/pull/28762
The same could also be done for attached clients.
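For context, the general idea behind that PR is a bounded buffer that drops the oldest log messages instead of letting a slow log driver (or attached client) block the container. A minimal sketch of that pattern (my own illustration, not the PR's actual code):

```go
package main

import "fmt"

// ringBuffer holds at most len(items) log messages; when full, the oldest
// message is dropped so a slow consumer can never block the producer.
type ringBuffer struct {
	items   []string
	start   int // index of the oldest message
	count   int // number of messages currently stored
	dropped int // messages discarded because the buffer was full
}

func newRingBuffer(size int) *ringBuffer {
	return &ringBuffer{items: make([]string, size)}
}

// push appends msg, overwriting the oldest entry if the buffer is full.
func (r *ringBuffer) push(msg string) {
	if r.count == len(r.items) {
		// Buffer full: drop the oldest message instead of blocking.
		r.start = (r.start + 1) % len(r.items)
		r.count--
		r.dropped++
	}
	r.items[(r.start+r.count)%len(r.items)] = msg
	r.count++
}

// pop removes and returns the oldest message; ok is false if the buffer is empty.
func (r *ringBuffer) pop() (msg string, ok bool) {
	if r.count == 0 {
		return "", false
	}
	msg = r.items[r.start]
	r.start = (r.start + 1) % len(r.items)
	r.count--
	return msg, true
}

func main() {
	rb := newRingBuffer(3)
	for i := 1; i <= 5; i++ {
		rb.push(fmt.Sprintf("log line %d", i))
	}
	for msg, ok := rb.pop(); ok; msg, ok = rb.pop() {
		fmt.Println(msg) // prints lines 3, 4, 5; lines 1 and 2 were dropped
	}
	fmt.Println("dropped:", rb.dropped)
}
```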
#28889 is resolved in 1.12.4
#26492 was found to be not a Docker issue, but if you are still having an issue here, please open a new one, thanks!
We have a workaround in place for 26492 (a program that queries all containers, looks up their veth info, checks whether they are in the bridge, and force-adds them if they are not), so luckily we don't have to build custom versions of systemd. We also instituted a similar fix for 28518 (we just kill the broken container when it's detected). A rough sketch of the bridge-check portion is below.
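For anyone wanting to do something similar, this is roughly the shape of the check-and-reattach step, assuming you already know each container's host-side veth name (a real watchdog would derive those by matching each container's eth0 peer ifindex against the host's interfaces); this is a simplified sketch, not our actual tool:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// bridgeMembers returns the interfaces currently enslaved to the given bridge,
// by parsing `ip -o link show master <bridge>`.
func bridgeMembers(bridge string) (map[string]bool, error) {
	out, err := exec.Command("ip", "-o", "link", "show", "master", bridge).Output()
	if err != nil {
		return nil, err
	}
	members := map[string]bool{}
	for _, line := range strings.Split(string(out), "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		// Field 1 looks like "veth1a2b3c4@if7:"; keep the part before '@' / ':'.
		name := strings.SplitN(strings.TrimSuffix(fields[1], ":"), "@", 2)[0]
		members[name] = true
	}
	return members, nil
}

// ensureAttached re-adds veth to the bridge if it has fallen out of it.
func ensureAttached(bridge, veth string, members map[string]bool) error {
	if members[veth] {
		return nil // still enslaved to the bridge, nothing to do
	}
	fmt.Printf("%s missing from %s, re-adding\n", veth, bridge)
	return exec.Command("ip", "link", "set", veth, "master", bridge).Run()
}

func main() {
	// Hypothetical host-side veth names for the running containers.
	veths := []string{"veth1a2b3c4", "veth5d6e7f8"}

	members, err := bridgeMembers("docker0")
	if err != nil {
		panic(err)
	}
	for _, v := range veths {
		if err := ensureAttached("docker0", v, members); err != nil {
			fmt.Println("failed to re-add", v, ":", err)
		}
	}
}
```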
However, our integration suite repeatedly caught deadlocks and some panics in 1.12.*, so we have yet to upgrade to that. I'll ask our infrastructure teams to do a custom CoreOS build with 1.12.4 and put it through its paces; though, I know we've had issues with Kube in the CoreOS alpha channel, so it may be a no-go.
Edit: Relevant details: https://news.ycombinator.com/item?id=13110897