Is OpenStack fighting a lost battle? (memooo.ooo)
63 points by __warlord__ on Oct 20, 2022 | 73 comments



OpenStack started out with AWS EC2 API compatibility, and I believe you still have the option to run that. The Terraform provider is also actually quite good (see the sketch after this list), so I don't really think those are the reasons it failed. Here are some reasons it lost, IMO:

- It doesn't follow the "normal" OSS process of submitting pull requests. Its CI, repos, and bug-tracking systems are different enough that a lot of people probably don't bother trying to contribute.

- The docs are not great and sometimes do not get updated.

- You basically have to run the control plane in containers to make it realistic to deploy. Kolla-Ansible and OpenStack-Helm are pretty good here.

- The APIs are pretty slow since they're all in Python.

- You also have to deploy Ceph to get storage, which is itself an entire beast
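
To be concrete about the client side, here's a minimal sketch using the openstacksdk Python client (assuming a cloud named "mycloud" is defined in your clouds.yaml; all names are placeholders):

    import openstack

    # Assumes a "mycloud" entry in ~/.config/openstack/clouds.yaml
    conn = openstack.connect(cloud="mycloud")

    # List servers in the current project
    for server in conn.compute.servers():
        print(server.name, server.status)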


> The APIs are pretty slow since they're all in Python

That, and everything is so goddamn brittle. It's incredibly hard to debug OpenStack because there are soooooooo many different components involved in simply hosting a couple of KVM-based VMs, and the fact that I tried that right during the Python 2->3 conversion didn't help either.

Also, every one of the myriad subcomponents needs its own database and its own user accounts, and there are no shared configuration files. I get why it is so highly modularized (high-availability demands for each component), but FFS, they could provide something like "kubeadm" that does all the tiny steps involved in setting up a cluster and joining a new node for you.

Once you get it running, it is pretty much smooth sailing until upgrade time, at which point it gets really nasty yet again.

> The docs are not great and sometimes do not get updated.

One cannot help but feel that the docs sometimes haven't even been tested in real life.


The main methods of deploying OpenStack right now are OpenStack-Ansible and Kolla. Both use Ansible to deploy everything and get the cluster up, but you /really/ have to dig into how things are set up if you are going to get a complicated install online. Adding a new node with Kolla isn't too bad, but it can lead to the stack exploding at times. Upgrading is a huge PITA, and I opt to just wipe and replace most of the time.


"and the fact that I tried that right during the Python 2->3 conversion didn't help either"

The huge and painful Python 2->3 transition could not have come at a worse time for OpenStack adoption. I believe it's one of the major reasons OpenStack was found to be difficult to use and brittle, with documentation that wasn't up to date.


I have run multiple datacenter automation teams. Essentially providing the plumbing for provisioning bare metal.

We looked at OpenStack several times and never had a team large enough to justify trying to run it. It's just way too complex and requires too many machines to get the full stack running. We could manage 5k+ machines with just 5 relatively small hosts by keeping the stack extremely simple.

Now you are going to say that we didn't provide the same features, and that is true, but the few cases where it would have been nice didn't justify it.

I can write a datacenter automation system from scratch that I know won't be causing me alerts at night faster than I can get a resilient OpenStack setup running. Unless that's changed, why would I use it?


Large bare metal installations are one of those few places where it often makes sense to roll your own for large swaths of the stack.


That is true, and it's why I tried to stay working in that area so long. AWS eventually ate my lunch, so I have switched to a pure SWE role now and just happen to set up a lot of the cloud infra as well.

I'm hoping I can work on DC automation again some time, though. It brings me a lot of joy to figure out why the brand new NICs are eating LLDP packets before they get to the hosts, and other fun problems you never get to see in the cloud.


I am interested in your expertise, as we'll soon be building data center automation and would appreciate having a consultant who can advise us on what we'd need to plan for. Could you please contact me at this address: 7vmq5a5fd@mozmail.com


"You also have to deploy Ceph to get storage, which is itself an entire beast"

You do not have to use Ceph, although it is the most popular storage backend for OpenStack. Cinder is the OpenStack project used to provision block storage. Look at the list of Cinder drivers [0] to see what the storage options are. I do agree with this part, though: "is itself an entire beast"!

[0] https://docs.openstack.org/cinder/latest/reference/support-m...
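
For anyone who hasn't touched Cinder: a minimal sketch of provisioning a volume with the openstacksdk Python client (the cloud name and volume type here are placeholders; the backend driver is whatever the operator wired up behind that type):

    import openstack

    conn = openstack.connect(cloud="mycloud")  # placeholder cloud name

    # Which driver/backend serves this volume is chosen by the volume
    # type the operator configured, not by anything in this client code.
    volume = conn.block_storage.create_volume(
        size=10,                 # GiB
        name="demo-volume",
        volume_type="ceph-ssd",  # placeholder type name
    )
    conn.block_storage.wait_for_status(volume, status="available")
    print(volume.id, volume.status)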


Many of those drivers either don't work or are shims for very expensive proprietary storage. If you want to run an actual open stack, Ceph was pretty much the only option for a long time.


As someone who is running a rather large OpenStack install (~3,500 VMs), I agree with the statements on what needs to change. Making it easier to install would be nice; Kolla / OpenStack-Ansible are pretty OK, it's just that when they explode it's really hard to figure out why. On top of that there's lackluster community support (their IRC channels are /very/ dead when it comes to support). What people mean by "consulting-ware" is that there is no real documentation on the failure states, tunables, and intermixed services, and it has rather shit logs (90% of the time it's just a Python stacktrace shown to the end user), so you have to hit up a consultant who has had to deal with all the BS that OpenStack has.


IMO vendors made OpenStack that way to ensure you always need some kind of third party to manage it. Kubernetes is going the same route.


And oh boy do they charge for managing an OpenStack: an 8-node support contract from Red Hat was north of $80k/yr, on top of them telling me that OpenStack is dead, have you tried OpenShift? :V


Did you mean to say 'charge' instead of 'change'?


Yes, my comment has been updated.


Regarding the installation / maintenance: are you aware of Yaook (https://yaook.cloud/)?


https://gitlab.com/yaook/operator

> Deploys OpenStack services on top of a Kubernetes cluster.

Fascinating. Although, surfing around their GitLab org, it seems they're still very early in that process.


I also started (non-trivially) contributing to OSS with OpenStack, and later switched to contributing to Kubernetes. I wholeheartedly agree with the points made here; these same observations really shaped how I chose to contribute to Kubernetes.

Trying to get all the AWS early adopters to abandon their investment and move to something else seemed like it was a huge barrier to adoption, so I started off contributing to the AWS support for kubernetes and making sure that kubernetes worked well there. But the pains of installing OpenStack from upstream were fresh in my mind, so I also contributed to make sure that kube-up worked well and then (as we outgrew that architecture) started the kOps project. My goal is that you should always be able to run OSS kubernetes from upstream, even if you just treat that as an insurance policy that gives you the confidence to use a managed service.

I think the miracle of kubernetes is that I was just one of a large number of people here, each of us bringing our past experiences to make kubernetes better in some way. And the community continues to grow, with people each addressing their own pain points, which does create a lot of churn/progress (depending on your perspective!). I'd say the people that organized the community were the unsung heroes here, and I suspect they learned a lot from OpenStack as well.


> I started off contributing to the AWS support for kubernetes and making sure that kubernetes worked well there

In effect providing free work to Amazon.


It is kind of the point with open software that, when you scratch an itch, others also benefit.


A small trip down memory lane:

My first OpenStack install was on a bunch of VMs. Got that to run following the official documentation. Got enough people interested in it that I was "allowed" to salvage whatever hardware I could find and put it in a rack.

Getting Neutron to work was a head-first intro to networking, and in the end I had to learn how to configure actual network hardware.

Adding Ceph on top of old spinning disks, then watching them die one after the other while the cluster stayed up, with performance no worse than the IBM stuff we were paying for, gave me warm-fuzzy feelings.

Thanks to TripleO, Fuel, and most of the other "installers" at that time, I learned that the "big guys" don't know what they are doing either ;-)

At the end of the day, it took the "mystical" out of the cloud, and for junior me that was good. It also saved me from the path management wanted me to take: SharePoint developer/admin.


The author is asking for OpenStack to "Implement transparent APIs from AWS, Azure, GCP for OpenStack".

There _was_ an EC2 API in OpenStack Nova back in the day; it became unmaintained and was eventually removed.

http://cloudscaling.com/blog/openstack/the-future-of-opensta...

This blog post is from 2015, and the end result was that nobody was willing to step up and do the hard work of creating these API-compatibility layers.

Frankly, there were barely enough resources being put in by large companies to keep the OpenStack APIs themselves well maintained.

So I think the issue for OpenStack was that everyone suddenly decided to shift to containers instead of VMs for their deployments, and OpenStack was no longer required.


OpenStack tried to become a container scheduler (Magnum), and then shifted to making Kubernetes a first-class citizen; it wasn't well received.

Oddly enough, some of the Kubernetes bare-metal stuff is borrowing from OpenStack (Metal3 is built on Ironic).


The Ironic stuff is some of the best within OpenStack. It's a hard problem, and it works surprisingly well.


Thanks! Ironic (ironicbaremetal.org) can be operated standalone, outside of OpenStack, and we do pay attention to non-OpenStack use cases (e.g. https://metal3.io/). If there's anything you're trying to do with Ironic that you're having trouble with, feel free to come chat. We're on IRC/Matrix on irc.oftc.net, #openstack-ironic.

-Jay, Ironic PTL


'Ironic' is the OpenStack project that deals with bare-metal provisioning, and there is indeed some very cool stuff there!
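
If you want to poke at it, the API is small and pleasant; a minimal sketch listing Ironic's registered nodes via the openstacksdk Python client (the cloud name is a placeholder):

    import openstack

    conn = openstack.connect(cloud="mycloud")  # placeholder cloud name

    # Each node is a physical machine Ironic manages; provision_state
    # tracks the lifecycle (enroll -> manageable -> available -> active).
    for node in conn.baremetal.nodes():
        print(node.name, node.power_state, node.provision_state)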


I remember when Magnum was being proposed and developed. I think the issue is that it didn't have much of a head start on Kubernetes, and the difficulty of getting OpenStack installed just so you could use Magnum, compared to just using Kubernetes, sort of doomed it.


I think this misses a much larger issue, which is that running your own datacenters (or just your own hardware) is becoming very unpopular. OpenStack could have all the features in the world, but at the end of the day you need a team to run it.

Having worked with large footprints on both OpenStack and AWS, it’s also clear that there are just inherent difficulties with running your own hardware, especially in your own data center. Even if you make the investment in a good infra team, it’s cost prohibitive to get anywhere near the experience of something like EC2 in terms of hardware availability and hourly billing. Not to mention they’re literally inventing their own hardware for things like high-performance storage and ARM servers.


Maybe I'm missing an elephant here. Right now I'm a client of a hosting provider. This hoster provides an OpenStack API to create VMs, disks, etc. I'm using that API to provision my infrastructure, including Kubernetes.

What is the supposed replacement for OpenStack? Some proprietary API glued together with PHP and bash scripts? Who will write a Terraform provider for that API? Not me, and certainly not the hoster.

Am I supposed to leave that small hoster and move to AWS? We don't have an AWS data center in my country, and I have obligations to keep data inside our borders (and that's not even counting that AWS costs something like 5-10x more).

The only realistic alternative to OpenStack is to throw away Terraform and infra GitOps and do everything manually. That's a huge step backwards.


VMware?


Ha-ha, they actually provide VMware as an alternative. But it's priced about 2x higher, which I don't understand, so I never tried it. That could work, I guess.


> I think this misses a much larger issue, which is that running your own datacenters (or just your own hardware) is becoming very unpopular.

Except that the commercial cloud is exorbitantly expensive, especially for persistent (i.e., always-on) services. If your business has already invested in brick-and-mortar infrastructure, networks, sysadmins, etc., having an on-prem OpenStack cloud looks like a more appealing prospect, especially for big projects and medium/large companies. Having employed OpenStack for a number of years, I'll grant that it does not always feel like a first-class citizen. YMMV.


There's a large enterprise push literally going on right now (and growing) to move towards hybrid cloud.

To be honest, I have no idea who the winners will be, but as a user of OpenStack I can say for sure that it won't be OpenStack.

Kubernetes and to a lesser degree Nomad are diverting orgs in a different direction. I would say that Nomad is probably far easier/better for most orgs.

I've also seen a number of teams keen on buying their hardware from Oxide.

It's a wild world out there right now.


There is definitely truth to this. Running your own infrastructure is not popular, much less "flashy", and it is capital intensive. Some businesses explicitly avoid capital investment, worsened by some localities having "fun" tax/accounting rules for capital assets. At some point, using an external service provider becomes the better tradeoff. The other major aspect is that few organizations run small-scale infrastructure the way people did even a decade ago, so the base knowledge and the opportunity to learn are not what they once were, which is an industry-wide conundrum.


Do you worry that given those trends, at some point AWS (or at least a tech oligopoly) will own all significant internet infrastructure? I do.


Yes, but when you're trying to build and maintain a product, ownership of the internet's infrastructure is low on your list of problems.

It's similar to the fact Microsoft owned desktop computing for many years. Yes, it was a problem, but the people purchasing Windows and Office weren't interested in solving it. They just want to do their jobs.


Ma Bell used to be the only phone company and arguably did a better job than the industry has done since deregulation... except maybe at introducing new products.

(Slightly biased former Telco employee opinion)


I was just reading comments on HN saying that companies are starting to realize the cloud is too expensive and are moving back to DCs.


Could be a concern if hosting becomes impractical for normal people, much like how email is basically impossible to host yourself.


I was very interested in OpenStack for a homelab. I tried to wrap my head around it several times, but it was just too difficult. Too difficult to install. Too difficult to understand the concepts with the given documentation and resources.

With K8s you have all those opinionated distributions like K3s, minikube etc. which lower the bar considerably. There are countless guides on YouTube.


Kubernetes is an abstraction layer above clouds, whereas OpenStack is a cloud computing environment.

Though, I heartily agree with "it's easy to get started with Kubernetes, and difficult to get started with AWS".

When I tried deploying an application to OpenStack (as an alternative to deploying to AWS), I was barely able to set up an OpenStack cloud environment on my computer. The most practical option was to use a public cloud which provided OpenStack APIs.

Even with public clouds offering OpenStack APIs, my experience trying "compute VM with an internet-facing load balancer in front" was that different public clouds required different setups for this. TBH, this surprised me. I understand that you can't just copy-paste Terraform code for AWS and have it work for GCP; but I had expected to be able to copy-paste Terraform code for one OpenStack cloud and have it work with another OpenStack cloud.


For your "local" deployment of OpenStack, look into DevStack. For cross-OpenStack Terraform, the main issues I ran into were that the images for the VMs differ a bunch between clouds, and the same goes for networking, as everyone has their own take on how networking should work in their OpenStack. Something like the sketch below is usually all that has to change.
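
In practice the per-cloud drift is mostly those two names, so you can just parameterize them; a rough sketch with the openstacksdk Python client (all names here are placeholders for whatever each provider actually offers):

    import openstack

    # Placeholder per-cloud lookup table; image and network naming is
    # exactly what tends to differ between OpenStack providers.
    CLOUDS = {
        "provider-a": {"image": "Ubuntu 22.04", "network": "public"},
        "provider-b": {"image": "ubuntu-22.04-x86_64", "network": "ext-net"},
    }

    def boot_demo_vm(cloud_name: str):
        params = CLOUDS[cloud_name]
        conn = openstack.connect(cloud=cloud_name)
        image = conn.compute.find_image(params["image"])
        flavor = conn.compute.find_flavor("m1.small")  # flavor names differ too
        network = conn.network.find_network(params["network"])
        server = conn.compute.create_server(
            name="demo",
            image_id=image.id,
            flavor_id=flavor.id,
            networks=[{"uuid": network.id}],
        )
        return conn.compute.wait_for_server(server)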


The reason the network setups do not really match between public OpenStack cloud providers is that most of them significantly change that part of OpenStack. My knowledge is very outdated, but I remember that the networking part had significant scalability issues that pushed every provider to introduce their own secret sauce.


Yep, a "stock" OpenStack install will tend to use OVN or VLANs. Most public clouds throw that out completely, since they use their existing methods for networking. It's a shame; I've not found a public cloud that supports something like OVN.


OneQode does.


I have deployed OpenStack into production environments, and it always takes a considerable amount of time and effort. Once up and running, however, it is awesome. I think its lack of success in gaining more market share comes down to documentation that is not good and a project that lacks evangelists. Never underestimate marketing.


I was looking at it as a replacement for a small VMware cluster (3 servers or so, a couple dozen VMs, though it should grow larger over time) for internal use at our small electronics company, but it sounds too complex. Proxmox looks easier (and probably more suited to this tiny scale?), so I'll probably try that.


Proxmox also has a web UI for creating a Ceph cluster, very good upgrade documentation (went from PVE 6 / Ceph 15 to PVE 7 / Ceph 16 without issue), and paid support if you need it.

And don't forget to add Proxmox Backup Server to your list; live restore is a fantastic feature of it :)


We have been using oVirt for 7 years, with about 550 VMs (Linux and Windows Server). It works really well.


Yes, if you are just trying to replace vSphere/ESXi, Proxmox is pretty good at that scale. OpenStack is great if you are trying to run an internal Digital Ocean-style service.


We went down the OpenStack path long ago but ended up choosing Triton DataCenter. The simplicity of operational support is just amazing. We've since acquired the commercial Triton support business from Joyent this year.

Triton is much simpler to operate, manage and support - and we’ve never looked back!


Compare https://www.stackalytics.io/?metric=commits&release=folsom from 10 years ago with https://www.stackalytics.io/?metric=commits&release=zed from this month. Notice that 2012's Folsom release was mostly made by OpenStack providers, while 2022's Zed release was mostly made by Red Hat and OpenStack consumers (it might help to expand to "All").

OpenStack did find some markets, enough to sustain a couple of vendors, but it didn't replace AWS. It's also not trying to replace AWS anymore; it's trying to continue to serve the users it did find.

Shifting more towards my personal opinion: Containers are, in most cases, a more sensible unit of software deployment than VMs are, so OpenStack was always going to "lose" to something built around containers.

Shifting more towards other people's opinions: serverless is, in most cases, a more sensible unit of software deployment than containers are, so k8s will either be supplanted or become uninteresting infrastructure at some point.


What is a serverless deployment unit? We already have Fargate, so no need to choose between serverless and k8s.


I don't understand why containers are not a serverless deployment unit. Run the container, send HTTP requests to it, shut it down if necessary.


I'll start by repeating that the serverless part of my comment was me extending what other people are predicting for the future. I'm not 100% sure I agree, but I'll continue their position:

Containers are only really better than VMs because they use fewer resources. They are not better than VMs in the sense that you still have a whole software stack inside them that you have to care about. Anyone building their own containers should also be maintaining a whole pipeline for updating all of the software in the containers that isn't their application.

Narrowing the unit of software deployment down to the application itself removes a lot of that (or rather, abstracts it away to whoever is responsible for maintaining the serverless runtime containers you use).


    Rewrite it in Rust. /s
    Implement transparent APIs from AWS, Azure, GCP for OpenStack, so we can reuse
    Pulumi and Terraform.
    Make it easier to install it.
    Make it easier to work with Prometheus stacks, Service mesh and other cloud
    native tools.
    Make it less consulting-ware… whatever that means.
Uff. How about making OpenStack manageable by a small infra team without paying vendors? I guess that's the last point.


OpenStack lost because it doesn't do the job.

I've used it at two clients, and it lacks critical features. At least with k8s you can bolt on the pieces you need.


Exactly this. Kubernetes provided some realizable value from the very first release.

OpenStack has been around for 10+ years and mostly just provided slide-ware value and maybe some strategic negotiating leverage and insurance of various forms.

It's actually interesting to me to hear, from the sibling comment, that someone actually got to 1000s of VMs under management. I'm still not convinced it's necessary or adding value in such situations vs. "the field".


> OpenStack has been around for 10+ years and mostly just provided slide-ware value and maybe some strategic negotiating leverage and insurance of various forms.

There isn't much competition in the market, and most of what exists is closed-source, proprietary stuff. OpenStack, in contrast, is open source and made to fit the needs of CERN (they run 320,000 nodes [1]). That maybe also explains the choice of Python and the sorry state of its documentation: scientists have ample time to deal with bullshit; corporate does not.

[1] https://www.openstack.org/videos/summits/berlin-2018/towards...


So, CERN doesn't run that many machines; they have that many cores across their OpenStack fleet. I actually think it is even more than that now, since they've tweeted graphs [1] from adding 800+ new physical nodes to their clusters. Anyway, it is not all about CERN, but they are an active community contributor. They are an operator who speaks publicly about their work, challenges, and successes. They have championed fixing some issues they have encountered at their scale, and provided feedback to the various projects on aspects which would help them solve their infrastructure automation issues. The same has occurred with the telecoms and other verticals with specific needs that cannot use public clouds.

As for docs, they vary from project to project, but one frustration I've encountered, and heard large-scale operators echo, is that search results often favor some older release rather than the latest one, which creates a lot of confusion if you're not aware of it and don't see the warning on the rendered webpage. Some other popular open source projects have encountered the exact same challenge, and it is a hard thing to balance.

[1] https://twitter.com/ArneWiebalck/status/1478778715005984772


China Mobile Cloud is built on OpenStack and has 1M+ CPUs at last count.


Overall, I am just running core services on OpenStack, and I do like its project isolation. It allows me to give an account to a student group and not worry about them, and this comes almost for free with OpenStack; vSphere and Proxmox really don't do this well. The easiest-to-use deployment of OpenStack is Kolla, and it really isn't maintaining anything outside of core services: it's a real crapshoot whether any of the "cloud services" will deploy at all or just explode your stack.


I see OpenStack as a replacement for running your own Digital Ocean / Linode. I've yet to find another product that lets me run fat VMs with any kind of networking I really want (I use the crap out of OVN and duplicate network blocks / isolation) and lets me just yeet a project at a student and not worry about them going over quota. The workflow is roughly the sketch below.
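
A sketch of that project/quota workflow with the openstacksdk Python client (the cloud name, project name, and limits are made up, and set_compute_quotas comes from the SDK's higher-level cloud layer):

    import openstack

    conn = openstack.connect(cloud="mycloud")  # admin-scoped placeholder

    # Give the student group its own isolated project...
    project = conn.identity.create_project(
        name="robotics-club",       # placeholder name
        domain_id="default",
        description="student sandbox",
    )

    # ...and cap what it can consume, so quota does the worrying for me.
    conn.set_compute_quotas(project.id, cores=16, ram=32768, instances=8)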


> Is it because it is “difficult” to install?

Yes ... but somehow Kubernetes has an even more confusing and difficult installation story (and don't get me started on OpenShift).


I wouldn't say that "gcloud container clusters create" is too complicated. One command and you're up and running. I agree OpenShift is an entirely different kettle of fish.


I mean installing it yourself, not running it in someone else's cloud.


Does anyone remember Eucalyptus? What happened to them?


They rebranded: https://github.com/corymbia/eucalyptus#readme but development is either happening "offline" or has stalled, since https://github.com/Corymbia/eucalyptus-blog/blob/master/cont... is the most recent blog update.

It's also a beast to even compile and package, to say nothing of actually deploying it.


About 5 years ago, I had a lunch interview with a candidate from IBM working on OpenStack. He complained it took forever to pass all the integration tests for any change. The ecosystem was so fragmented, several large vendors were participating, and each had its own interests.

I knew it was doomed.


The complexity of running OpenStack is so great that running your own hypervisor looked simple in comparison.


OpenStack feels like a promising prototype that accidentally made it into production.

I think HashiStack is a much more promising "v1" but not necessarily the "Linux of Cloud OS" yet.


OpenStack was done almost a decade ago. I was involved in the early days; it was abandoned by its founding entities, and IBM ruined it further. I didn't realize anyone was still talking about it.


lol, I'm not sure it was ever really a debate



