I had a very interesting discussion about this subject a couple of weeks ago with a CTO of a big company.
I've commented before that Kube is absolutely taking over the bigger, more complex cloud installations out there; you can see how many companies are betting their infrastructure on it.
The only thing that I don't see is standardization of the cloud, just like what Amazon did. You see too many companies doing too many of the same things and reinventing the wheel.
Personally, I would love to see a standard way for smaller installations to move into the cloud as a cluster. Imagine what Heroku did for deployments. You can't beat that ease of use. Deis and Convox are both trying but not really "hitting it".
As for Walmart, absolutely stoked to see this from them. This move, and what the White House did with its digital service, shows a lot of promise and hope. I wonder how much of this sits on top of "older" management and how much is a complete restructuring.
Joyent and Distelli make this about as painless as possible with any current workflow. Docker containers can be launched directly on Joyent in any number or size, and Distelli will app-ify and/or dockerize your app as part of the CI/CD workflow and watch the processes.
Any custom script is just bash, and I can deploy to any OS.
I have a cluster of 5 or 6 instances running 20-30 services in this setup and it's been a dream. I tried to do something similar before, but Cloud66 dropped Joyent. This setup makes perfect sense because apps can be anywhere from traditionally deployed to fully containerized and are still managed with the same interface and config.
Since it's all open-source my deploy process is portable. But I've evaluated other vendors and processes and all seem more manual and less transparent.
Some virtualization use cases can be solved with Kubernetes, but many things an enterprise runs do not work at all in the Kube paradigm. The majority of applications are pets (i.e. stateful), and many applications run on Windows or have some other kernel requirements that don't match the bare-metal kubelet.
If you think Kubernetes 'absolutely takes over the more complex cloud installations', you're living in an echo chamber of 12 factor apps that doesn't line up with the majority of what I've seen in big enterprise cloud workloads.
Case in point: Walmart has one of the largest openstack deployments in the world.
We (SAP's internal cloud platform) are running OpenStack on Kubernetes. In particular, I'm working on containerizing Swift, which presents its own unique set of challenges but is progressing well regardless.
For development environments we are running MSSQL (yes that MSSQL, unclustered though as there is no SAN emulation of course), Elasticsearch, Mongo, Kafka, Zookeeper, Redis, Memcached, all in Kubernetes (mostly as Pet Sets) and looking to possibly do this in prod once we are happy with it. The story for state in Kubernetes is improving rapidly.
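For anyone who hasn't seen one, a Pet Set manifest is mostly a normal replicated pod template plus stable naming and per-pet storage. A minimal sketch (the image, names, and sizes here are made up, and the alpha API details shifted between releases):

```yaml
apiVersion: apps/v1alpha1
kind: PetSet
metadata:
  name: redis
spec:
  serviceName: redis     # headless service giving each pet stable DNS
  replicas: 3            # pets get ordinal names: redis-0, redis-1, redis-2
  template:
    metadata:
      labels:
        app: redis
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      containers:
      - name: redis
        image: redis:3.2
        ports:
        - containerPort: 6379
  volumeClaimTemplates:  # each pet keeps its own persistent volume across reschedules
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```

The volumeClaimTemplates part is what makes the state story work: when redis-1 dies and is recreated, it comes back with the same name and the same disk.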
PetSets are simply ordinally named container collections instead of randomly named ones.
The usual concept in Kubernetes is that a container has a name like 'thing-randomsuffix4242'. It pushes you towards making your pods stateless and not precious, so the failure-handling logic stays simple. If a pod is blown away for whatever reason, it can easily be replaced by the scheduler, and 'thing-randomsuffix242b' is never far away.
With a predictable ordering you can actually make assumptions you otherwise couldn't. For example, perhaps you want the convention that 'thing-predictable1' is the master and 'thing-predictable{2,3}' are the slaves.
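Inside the container that convention is trivial to act on, since each pet can read its own hostname. A minimal sketch in Python (the name pattern and the master-by-ordinal rule are just the hypothetical convention above; real PetSet ordinals start at 0):

```python
import re

def role_for(hostname, master_ordinal=0):
    """Derive a replication role from a pet-set-style hostname.

    Assumes the convention above: the pet whose trailing ordinal
    equals master_ordinal is the master, everything else a slave.
    """
    match = re.search(r"(\d+)$", hostname)
    if match is None:
        # Randomly suffixed names like 'thing-randomsuffix242b' don't qualify.
        raise ValueError("no ordinal suffix in %r" % hostname)
    return "master" if int(match.group(1)) == master_ordinal else "slave"
```

With randomly suffixed pods you simply can't write this function, which is the whole point of the predictable naming.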
Closer to minikube, then? That was not at all clear to me from reading the site, where it seemed to be discussing a kubelet on an OS on bare metal or a hypervisor, but not managing containers.
Not sure when you tried Convox – maybe they have improved since then – but my experience with it last week has been amazing (migrated our application from Heroku).
It was almost as easy and quick to set up as Heroku. All the Docker and AWS config was done automatically by Convox. The only issue I had was with Docker: an old Docker Toolbox installation had left some environment variables set, preventing Docker for Mac from starting.
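For anyone who hits the same thing, the fix in my case was just clearing the leftover Toolbox variables in the shell before starting Docker for Mac:

```shell
# Docker Toolbox points the CLI at a VirtualBox VM via these variables;
# Docker for Mac needs them unset so the CLI talks to its own socket.
unset DOCKER_HOST DOCKER_TLS_VERIFY DOCKER_CERT_PATH DOCKER_MACHINE_NAME
```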
Granted, I have some previous experience managing deployments on AWS via the AWS CLI and dashboard, so maybe that's what made me understand the Convox concepts quickly.
My only regret is that I didn't set up with Convox from the start. :)
First, important to note: You are awesome and doing amazing work. (Link to what I am doing later in this message).
The problem with all of these projects, not just Deis, is that the end user needs to know too much.
I launch about 30 projects a year on top of Heroku and it never ceases to amaze me how simple it is. Yes, the applications are simple. Classic app->db. That being said, the setup is stupid simple.
When I started working on the-startup-stack[1], I was amazed at how much you need to do when you start from scratch. Things I forgot I ever did, since I've had a system running in production, changed only incrementally, for 5 years.
You need to configure networking, decide on DNS, decide on so many things. It's a lot of work.
Here's what I want (and what I am trying to do with the-startup-stack[1]).
0. AWS KeyPair
1. create-stack
2. launch-app
Lot of what's missing is best practices and production-ready environments.
Going back to that same lunch discussion I had with the CTO of a big company replatforming their infrastructure: generalization is the hardest thing here. Reasoning about who the customer is and what their stack looks like.
If you give Deis/OpenShift/Kubernetes to the typical YC founder (I am not trying to insult anyone here) who's just trying to get their app into the cloud, it's too much. It's too many things they don't care about right now. And by not caring about them right now, they are likely vendor-locking themselves for a very long time.
Agreed. That's the problem with running your own private Heroku; you've gotta know the entire stack to use it.
We had this dream that users could use Deis as a self-serve dev shop, but we've been seeing a lot of people interested in running their entire PaaS stack, so the docs definitely reflect the administrator more than the platform user (like Heroku).
> The only thing that I don't see is standardization of the cloud, just like what Amazon did ... Deis and Convox are trying but not really "hitting it"
My colleagues at Red Hat might disagree, they're working on OpenShift.
My more immediate colleagues at Pivotal, IBM, SAP, Microsoft, Google, Cisco, Dell-EMC, VMWare et al might disagree too. We're working on Cloud Foundry and BOSH.
OpenShift origin looks very promising for sure. That's the thing though, you still see too many people not using it.
Here are the things that should be standard (IMHO) and that you shouldn't reinvent:
1. Underlying infrastructure auto scaling (Google, AWS)
2. Service Discovery
3. DNS (internal and external for multiple sources)
4. Networking
The real issue that I feel no one has really answered is who this is for. If an SRE is the target audience, then there's a lot more we're missing as a community.
I think Red Hat's problem with OpenShift is getting out from under the brand-masking power of tech they chose. Too many engineers want to roll their own Docker+Kubernetes platform and underestimate the difficulty of doing so.
The thing is, everyone has a different don't-reinvent list, and they want those things handled in different ways. Then they discover all the things they didn't realise they'd need to reinvent.
> The real issue that I feel no one has really answered is who this is for.
I see PaaSes as serving three constituencies.
Operators, who wish to christ that Developers would stop making their lives hell by breaking stuff.
Developers, who wish to christ that Operators would stop making their lives hell by blocking stuff.
Business, who wonder why everything takes so long, costs so much and breaks so often.
Maybe (hopefully!) things have improved since I last looked at it, but OpenStack is (was?) a complicated, unorganized, over-engineered conglomeration of independent parts.
I sincerely hope that's not the case anymore but that lasting first impression has stopped me from looking into it since then.
Getting up and running with OpenShift is pretty easy. You can either use the all in one Vagrant VM[1] or you can download the CLI[2] and run `oc cluster up`[3] to install the docker container version.
Neither one gives you an "HA" install (you probably want a cluster for that). I regularly deliver OpenShift installs, and in most enterprise environments an HA install takes 3-4 weeks, most of which is spent communicating the requirements to all the silos (networking, storage, security, virtualization, etc).
It depends on what the customer has already. We (Red Hat) support installing OpenShift anywhere that RHEL x86-64 is supported, whether that is bare metal, vms, private/public cloud, etc. For example, our hosted "Heroku like" OpenShift Online is on AWS (https://www.openshift.com/devpreview/).
Yes and no. What we're finding is that you need far fewer operators. But you still need someone to keep an eye on things and run 'bosh deploy' to upgrade the cluster or add a new service. The latter could be automated, but it's one of those things people like to do under supervision.
At Pivotal we run Pivotal Web Services (PWS) -- a version of Cloud Foundry which is usually less than a week behind the current release -- with three shifts of 3-10 people each. Two to five pairs, is how we think of it.
PWS has thousands of VMs and tens of thousands of applications running. Pre-CF, pre-BOSH, an installation of this magnitude would need hundreds of sysadmins to stop it from immediately bursting into pretty but expensive flames.
But in general, you're right. The contract with engineers is "tell me what you want and I'll give it to you". Cloud Foundry does that well.
Standardization of the cloud is what http://www.ucxchange.com is working on out of Chicago. It's an interesting model that I was shown recently.
The basic premise is that with standardization of hosting environment platforms an exchange is able to offer their customers a multitude of vendors who compete on price and will have the same features within their environments.
The most interesting opportunity with this is for resellers of cloud computing who can sell compute resources to customers at full price but only pay the exchange based on what they use. It won't last long but if it works resellers will make a fortune migrating customers from big names like Amazon to other big names like IBM through UCX.
> Deis and Convox are both trying but not really "hitting it".
I'm heavily biased towards Deis as I've been using it for a while (and even wrote a UI for it[0]), but what do you feel it is missing? Feature-wise, it's almost at parity with Heroku, and stability-wise, it's not perfect but it doesn't require a PhD in DevOps to maintain either.
What's wrong with NFS? If you need a shared, POSIX-compliant drive between multiple hosts, there's not really an alternative. Even the fairly new AWS Elastic File System service is just an NFS4.1-based managed service.
While the grandparent comment might be a little too emphatic, surrendering the flexibility of file stores in favour of object stores does lend itself to permitting nice non-functional properties.
Some of these file systems come with native clients. I'd say for container environments that's a viable alternative to NFS. Done properly, the native client can deliver low latency, since it usually takes the minimum number of network hops, and can fail over without requiring things like virtual IPs.
A SAN gives you block storage - virtual hard disks. You need a cluster filesystem to use the same virtual disks on multiple machines concurrently, and those are not easy to operate and have their own bunch of performance problems.
Speaking as someone who has maintained some rather large NFS-utilizing systems/datacenters over the last decade, and found it perfectly suited to my needs, I'd be curious as to why you'd say that?
The intr mount option has been a no-op for 8 years.
/bin/umount does an fstat before the umount syscall, so if an NFS mount is broken, it will probably hang forever instead of unmounting it! You can invoke the syscall directly to get the behavior you need.
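For the curious, here's roughly what "invoke the syscall directly" looks like from Python via ctypes. This is a sketch for Linux/glibc only; the MNT_FORCE constant is copied from <sys/mount.h>, and in practice you might prefer a small C helper binary:

```python
import ctypes
import ctypes.util

MNT_FORCE = 1  # from <sys/mount.h>: force unmount even if the server is unreachable

# use_errno=True lets us read errno after a failed call
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

def force_umount(path):
    """Call umount2(2) directly, skipping /bin/umount's fstat of the
    mount point - the fstat is what hangs when an NFS server is gone.

    Returns True on success, False on failure (errno is then available
    via ctypes.get_errno()).
    """
    return libc.umount2(path.encode(), MNT_FORCE) == 0
```

Calling it on something that isn't a mount point (or without root) just fails cleanly with an errno instead of hanging, which is the whole appeal.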
That hasn't been dependable in my experience - the best workaround I've found is either a lazy unmount + remount if you can tolerate the hung process taking the full NFS retry timeout to unblock or downing the network interface to accelerate the process.
This is fair to a degree, but vendors have VIPs that make failures fairly transparent. And if you need low latency on a large data set, is there something better? Where low latency means <100ms reads.
Throwing away 30+ years of protocol development for the latest and greatest containerization paradigm is what the cool kids south of Market Street talk about at happy hour.
If you want volumes that can be dynamically attached to any machine, you're not buying a SAN, and you're not in a cloud provider with a portable block storage abstraction, what do you propose?
About the only thing I can think of is Ceph, and it stands to reason there is a lot more NFS than Ceph expertise on the job market.
Wal-Mart is a very odd company to work with from the vendor side. They have a purchase order tracking system named PULSE that is completely independent from their EDI system and their Retail Link supplier program (which is the slowest, most horribly designed website you could imagine).
You're required to SFTP up a CSV file of what you've shipped to them that day, along with information about it. They also have a creaky acknowledgement/acceptance procedure, and the technical folks (seemingly outsourced) aren't very impressive. You have to pass a 'visual inspection' with your test files, and it's the most ridiculous process you can imagine, with box-checkers flagging you for the dumbest reasons.
So from this side of the transaction, articles like this always make me wonder; all their good tech must be internal-only. To be fair to Wal-Mart, the situation with other big store chains usually isn't any better.
This is one of the better bespoke cloud efforts I've seen, but in their position -- given the decision to bet on Kubernetes -- I might've chosen OpenShift instead of rolling my own PaaS.
I've also seen the Jenkins-to-Nexus thing a few times, never particularly happily. That said, I don't have particularly deep experience in Java shops, so it's possible that it works really well in some places.
Disclosure: I work for Pivotal, we're the majority donors of engineering to Cloud Foundry, a PaaS competing with OpenShift.