Spin up a VM on each cloud (and in-house) and run a stack on top that glues them together in software. Redundancy of sorts.
The thing is that using a cloud as a glorified hypervisor for home-baked VMs doesn’t work economically. Colo or even running a DC is cheaper. Cloud generally only makes financial sense if you are using the managed services. And of course the instant you do that you sacrifice portability to a greater or lesser extent. This is of course deliberate on the part of the cloud providers.
It makes sense to pick 2 clouds and go all-in on them using their native and managed services to the maximum. You will need to maintain two parallel skillsets to do this. Completely forget about any layer that promises to abstract it, they are all red herrings if not outright snake oil. Whether that’s IBM consultants, or Terraform.
I might end up using Azure, of all things, for their hourly-priced InfiniBand clusters. You can get an hour with ~25 TB RAM for ~$50, and you have something like 5-10 GB per (physical) core.
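As a rough back-of-the-envelope check on those figures, here is a minimal sketch. The numbers are the approximate ones quoted in the comment, not official Azure pricing, and decimal units (1 TB = 1000 GB) are an assumption.

```python
# Approximate figures from the comment above; not official Azure pricing.
TOTAL_RAM_GB = 25_000        # ~25 TB, assuming 1 TB = 1000 GB
PRICE_PER_HOUR_USD = 50      # ~$50 for one hour on the cluster
GB_PER_CORE_RANGE = (5, 10)  # ~5-10 GB of RAM per physical core


def implied_cores(total_ram_gb: float, gb_per_core: float) -> int:
    """Physical cores implied by total RAM and RAM per core."""
    return round(total_ram_gb / gb_per_core)


def price_per_tb_hour(price_usd: float, total_ram_gb: float) -> float:
    """Hourly price per TB of RAM."""
    return price_usd / (total_ram_gb / 1000)


low_gb, high_gb = GB_PER_CORE_RANGE
min_cores = implied_cores(TOTAL_RAM_GB, high_gb)  # more RAM per core, fewer cores
max_cores = implied_cores(TOTAL_RAM_GB, low_gb)   # less RAM per core, more cores

print(f"implied cores: {min_cores} to {max_cores}")
print(f"price per TB-hour: ${price_per_tb_hour(PRICE_PER_HOUR_USD, TOTAL_RAM_GB):.2f}")
```

If the quoted numbers are roughly right, that is a few thousand physical cores for about two dollars per TB of RAM per hour, which is why a short burst on such a cluster is hard to match elsewhere.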
I don't want to use them, but it looks like there might not be any alternatives for only some very few hours on such a cluster.
I was formerly at Docker and this thread of comments is spot on. There are so many headwinds to the "multi-cloud" strategy play. The first is that it's way more expensive. A common theme was: the data science team is using containers on Azure and wants to do something that ACS makes hard, so tell us why Docker Enterprise. We'd present the Docker Enterprise architecture, workflow and pricing... So, for what you pay $300-500 a month for now, we're going to ramp that cost up to a couple thousand a month, because we're going to make you run 24/7/365 VMs to manage the architecture that Microsoft gets economies of scale on, and you're now tied to a costly license for Docker Enterprise. Now do this across a few clouds simultaneously.
OpenShift isn't any different. And in fact Docker invested a ton of time into an entire product, built around Terraform, that manages ecosystem deployment to any cloud that a lot of people wanted but couldn't buy. In fact that was probably a more sellable product than trying to shove Enterprise Engine and Docker Trusted Repository down people's throats.
If you boil it down, the common pattern seems to be: a cloud lift and shift of legacy VMs from an on-prem DC is more expensive. The lift you get from cloud is PaaS and the SaaS that's included to manage it. But very few are cloud-native ready, so they take the perspective that: we'll lift and shift it today and then our next project is cloud-native transformation. Yeah... That latter part rarely happens. And AWS, GCP and Azure all profit.
This advice is common (usually it’s pick one cloud, pick two at least is a bit more sensible) but I really am starting to wonder about it being marketing propaganda of its own.
In one breath we hear that only three software vendors (Google, Amazon, Microsoft... maybe DO?) have software worth consuming, and that any other vendor shipping software that “abstracts” them is selling snake oil. MongoDB Enterprise, Confluent Kafka, Elastic Cloud Enterprise, Pivotal CF, Red Hat OpenShift, HashiCorp Nomad / Terraform / Vault, etc.: all of these provide valuable software that runs on any cloud, and often has a multi-cloud control plane (that’s increasingly Kubernetes-based).
So let’s never use any of that, and screw the whole software industry in favor of the cloud vendors, because their stuff isn’t just a proprietary veneer of automation around those products, charged by the hour?
In another breath we are told that the cloud providers charge too much for their VMs. And we think they’re not charging too much for their proprietary services?
Firstly, most “managed cloud services” are not “managed” in the traditional sense. They’re hosted, no different from Dreamhost or a bazillion other hosted offerings, often with similar tradeoffs. It’s a testament to cloud marketing that people believe there is something magical about Amazon RDS for your Postgres instance. It’s a nice automated setup of volume-replicated active/passive Pgsql.
There are many, many other ways to do this with open source or proprietary software with varying degrees of automation. Maybe you don’t care, and that’s fine, but I’m not sure the delta between running in a DC/colo and the EC2 costs is worth it for some proprietary software bits from the cloud vendors. It’s not magic, it’s just software.
Similarly, to say all abstractions are snake oil is fashionable but also hypocritical. Kubernetes is on fire lately because it is an abstraction for your workloads, a universal control plane, and a universal cloud API. Is that snake oil? Serverless (the framework) makes developing on Lambda or other FaaS platforms sane. Is it snake oil too? Heroku or Cloud Foundry lets you push your apps and not worry about the plumbing on EC2 or the cloud of your choice (even on a colo/DC!)
But most importantly: You’re not locked into lowest common denominator (what does that even mean?) with any of this - you can use any proprietary cloud service you want...the stuff you don’t care about - the VMs, network and storage - is the stuff abstracted (and usually all the proprietary knobs like Azure advanced networking or Google metadata/DNS are all available).
Where is the problem? Are Elastic Beanstalk or Google App Engine really superior economically and functionality wise?
Terraform is not about the different codebases, it’s about glue code in a standard language to assemble all these cloud services. Crossplane.io is trying to do this via Kubernetes CRDs. Would you rather use CloudFormation and JSON, really?? I’ve seen some monster CloudFormation templates, and they’re hard to maintain and debug compared to TF.
Let me give you a trivial example: GCP gives you a lot of flexibility with CPU and memory when creating a VM. Let’s say your workload is ideally suited to some weird combo, like 5 cores and 13G or something. On GCP that’s what you provision and that’s what you pay for. AWS and Azure offer fixed sizes, so you have to round up to 8 and 16 (and pay for it). So if you want to be cloud-agnostic there’s one cool but very basic feature you just can’t use.
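The rounding-up cost in that example can be sketched as follows. The fixed-size menu below is hypothetical and for illustration only; it is not any provider's real catalogue, and the helper names are made up.

```python
from typing import NamedTuple, Optional, Sequence


class Shape(NamedTuple):
    vcpus: int
    memory_gb: int


# Hypothetical fixed-size menu, for illustration only (not a real catalogue).
FIXED_SHAPES = [
    Shape(2, 4), Shape(4, 8), Shape(4, 16),
    Shape(8, 16), Shape(8, 32), Shape(16, 32),
]


def smallest_fitting(need: Shape, menu: Sequence[Shape]) -> Optional[Shape]:
    """Smallest shape on the menu that covers both the CPU and memory need."""
    fitting = [s for s in menu
               if s.vcpus >= need.vcpus and s.memory_gb >= need.memory_gb]
    return min(fitting, key=lambda s: (s.vcpus, s.memory_gb), default=None)


def overprovision(need: Shape, got: Shape) -> tuple:
    """Fraction of extra vCPUs and extra memory you pay for but don't need."""
    return ((got.vcpus - need.vcpus) / need.vcpus,
            (got.memory_gb - need.memory_gb) / need.memory_gb)


need = Shape(vcpus=5, memory_gb=13)          # the "weird combo" from the comment
custom = need                                # custom sizing: provision exactly this
fixed = smallest_fitting(need, FIXED_SHAPES)  # fixed sizing: round up

extra_cpu, extra_mem = overprovision(need, fixed)
print(f"custom: {custom}, fixed: {fixed}")
print(f"extra vCPUs: {extra_cpu:.0%}, extra memory: {extra_mem:.0%}")
```

With this made-up menu the 5-core/13G workload lands on 8 cores and 16G, matching the round-up described above. Actual prices and available sizes differ per provider and are not modelled here.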
Once you start digging into this stuff this keeps coming up: something that’s efficient (cheap) in one but not in another means: do I do it the same and pay more, or do I diverge and accept that I’ve got 2 configurations for this feature now, and save some money. Multiply this by 1000 special cases and then you find that trying to make one size fit all is a wild goose chase.
All Terraform really offers is doing this in similar syntax for the subset of each cloud’s features that Terraform knows about, and extending Terraform yourself for anything it doesn’t. Yes, I would rather use the native thing for each cloud, even CloudFormation. ARM Templates and DSC are actually quite nice once you get used to them!
The wild goose chase part I see as a major exaggeration.
It's interesting, I use BOSH + Terraform all the time, wherein BOSH exposes GCP's CPU/memory flexibility, the various Azure NIC/LB/availability set options, AWS' different disk types, etc. Those differences can be modularized, so that 95% of your configuration templates are identical across clouds, and the last 5% maps to specifics.
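A minimal sketch of that "shared base plus per-cloud specifics" layering, assuming a plain dictionary merge. This is not BOSH or Terraform syntax; every key and value here is a hypothetical example.

```python
from copy import deepcopy

# Shared configuration: the part that is identical across clouds.
# All keys and values are hypothetical examples, not BOSH/Terraform syntax.
BASE = {
    "instances": 3,
    "disk_gb": 100,
    "network": {"name": "default", "ports": [443]},
}

# Cloud-specific remainder: only what actually differs per cloud.
OVERRIDES = {
    "gcp": {"vm": {"custom_cpu": 5, "custom_memory_gb": 13}},
    "aws": {"vm": {"instance_type": "m5.2xlarge"}, "disk_type": "gp3"},
    "azure": {"vm": {"size": "Standard_D8s_v3"},
              "network": {"availability_set": "web"}},
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict: base with override applied, merging nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def config_for(cloud: str) -> dict:
    """Full configuration for one cloud: shared base plus its overrides."""
    return deep_merge(BASE, OVERRIDES[cloud])


for cloud in OVERRIDES:
    print(cloud, config_for(cloud))
```

The design choice is that the base never mentions a cloud, so it can be tested once, while each override stays small enough to review by hand.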
I'm sure it's not all that different from Terraform w/ modules, though my main problem with Terraform is that it doesn't constrain you into a "do the right thing" path, it's too easy to create a mess.
Anyway, IMO these kinds of differences really aren't hard to handle and it's valid to prioritize cloud-independent configuration if that's what you want/need. It allows the main configuration and installable software to be cloud-independent, dramatically easing testing. We're seeing this drive in the flocking to Kubernetes, which enables cloud-independent networking, storage, and compute.
I think differing opinions on this are normal/fine, but I have to wonder why the single-cloud proponents use words like "snake oil" as if to completely discredit a different set of priorities.
Because either a) you are constrained to the lowest common denominator, or b) the abstraction is something trivial like syntax and you still need to maintain two codebases (this is the problem with Terraform).