Hacker News
Wikimedia Gitlab Migration Status (mediawiki.org)
44 points by altilunium 67 days ago | 30 comments



GitLab's minimum viable product policy makes it less and less attractive among other CI/CD/SecOps tools (and this makes me sad because I loved it back in the day). Instead they are focusing on AI when the rest of the features are unfinished. For example, lately I wanted to implement a deployment approval flow for our crucial repositories but stumbled onto a nasty bug. I created a support request, as we are on the Premium plan. The agent pointed me to two-year-old issues and closed the ticket. This is how it's done there.


So many 2-5 year old issues. The technical debt for Gitlab CI is not being addressed, it keeps piling up. It seems the backlog of bugs grows every year.


I haven’t found a CI that I like more than Gitlab’s, what are your major problems with it?


Caveat: I haven't used Gitlab in a couple of years, but before that, part of my role was to set up Gitlab CI systems for different projects, so I got a lot of hands-on experience back then.

GitLab CI is 3-4 different systems on top of each other wearing a trenchcoat. Every so often, they realise they can do CI better and come up with a new syntax for everything (see stages, which got extended by dependencies, which got superseded by the needs mechanism, or only/except which got replaced by rules). Obviously, they can't easily remove the old syntax, so pipelines very quickly become a mix of different mechanisms, some old, some new, with weird, unpredictable, and usually poorly documented interactions between them. You can try and only use a restricted subset of well-defined GitLab CI, but part of the problem with modern DevOps is that most developers are not very experienced with the CI syntax, and will just find snippets on SO that do what they want and leave it there.
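To make the stages-versus-needs difference concrete, here is a minimal Python sketch of the two ordering rules. It is a simplified model, not GitLab's implementation, and the job and stage names are made up for illustration.

```python
# Hypothetical pipeline: every job has a stage and an optional list of
# "needs". All names here are illustrative assumptions.
STAGES = ["build", "test", "deploy"]
JOBS = {
    "build_app":  {"stage": "build",  "needs": []},
    "build_docs": {"stage": "build",  "needs": []},
    "test_app":   {"stage": "test",   "needs": ["build_app"]},
    "deploy_app": {"stage": "deploy", "needs": ["test_app"]},
}


def blockers_by_stage(job):
    """Stage ordering: a job waits for every job in all earlier stages."""
    idx = STAGES.index(JOBS[job]["stage"])
    return sorted(
        name for name, spec in JOBS.items()
        if STAGES.index(spec["stage"]) < idx
    )


def blockers_by_needs(job):
    """needs-style ordering: a job waits only for its transitive dependencies."""
    seen = set()
    stack = list(JOBS[job]["needs"])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(JOBS[dep]["needs"])
    return sorted(seen)


if __name__ == "__main__":
    for job in JOBS:
        print(job, blockers_by_stage(job), blockers_by_needs(job))
```

In this model test_app is held up by build_docs under stage ordering but not under needs, which is the kind of mismatch that gets confusing when both mechanisms appear in one pipeline.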

Apart from that, the documentation overall tends to be very poor, and the implementations are often buggy or missing functionality (a lot of searches for "GitLab CI <thing I want to do>" would just link to an open ticket in the GitLab repo describing exactly the functionality I needed, and dozens of comments going "yeah, this is necessary" and "a silver level customer needs this functionality to continue using GitLab"). There's also lots of stuff that relies very heavily on being able to configure everything in YAML. Other CI systems also have this problem, but often provide mechanisms to write individual tasks in other languages so you distribute, say, the "deploy this Docker image to our k8s cluster" task as a standalone unit written, tested, and reviewed in a real programming language. GitLab CI has a very basic version of this feature, except it's still all in YAML, it's almost impossible to test, and it's very difficult to configure these tasks at all.

Fwiw, I've done plenty with Gitlab CI, and it's not like it doesn't work. It is perfectly fine if you don't have particularly complex needs, and the runners mechanism seems fairly easy for system administrators to get started with and get working. But overall, the whole system feels only half thought-through, and trying to do anything complex tends to require a lot of hair-pulling and confusion.


> other CI systems also have this problem, but often provide mechanisms to write individual tasks in other languages

GitLab CI lets you run a bunch of commands in any Docker image you specify. The Docker image and/or your scripts may be self-written.

How do you find, for instance, GitHub Actions more advanced?


> How do you find, for instance, GitHub Actions more advanced?

Being able to easily tap into an ecosystem of existing re-usable functionality (actions) is a great and pretty advanced feature that requires a very different (and more advanced) set of abstractions than running “a bunch of commands in a docker image you specify”.

Not that it’s a perfect system, but a core CI system with a decoupled layer of “things that run on that CI system” is a great model.

For example, the core of GitHub actions doesn’t have anything built-in that clones repositories. That’s a first-party action (component) that GitHub develops, releases and evolves independently. But you can roll your own if you want.


Not sure if you're aware, but Gitlab has had cross-project 'includes' for a long time, and publishes a big chunk of templates for these on gitlab.com.

They also introduced Components about a year ago (don't hold me to that), which is more akin to the GitHub Actions model.

https://docs.gitlab.com/ee/ci/components/index.html


While components look great (and I wasn’t aware!), it’s still just the same old “template-metaprogramming-with-yaml” that includes are:

> Avoid using global keywords in a component. Using these keywords in a component affects all jobs in a pipeline, including jobs directly defined in the main .gitlab-ci.yml or in other included components.

This is very different (and a whole lot less advanced) than being able to run a 3rd party GitHub action written in JavaScript, alongside another running in its own container image, mixed in with your own steps.

Because GitLab CI bakes everything into one layer it’s much much harder to evolve and extend, and so it fossilizes whilst being unable to shake its baggage.


FWIW, I always found GitLab's CI to be mostly a Travis clone with a few improvements. CircleCI started in a similar place but seemed to improve faster for a few years. Eventually I moved everything to GitHub Actions which always felt mature beyond its age and I've never looked back.

Roughly, my issues with GitLab CI were that it didn't provide sufficient primitives for a) breaking up the build into a granular build graph, b) correctly passing artifacts between build graph stages (caching wasn't a good solution), and c) guaranteeing at-most-one task in a critical section (such as a deployment). It has been a few years since I last checked on these, but it also sounds like the product hasn't moved much recently.
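For point c), the guarantee being asked for looks roughly like the Python sketch below. It is an in-process illustration only: a real CI system would need a lock shared across runners, and every name here is hypothetical.

```python
import threading
import time


class DeployGate:
    """Allows at most one deployment inside the critical section at a time."""

    def __init__(self):
        self._deploy_lock = threading.Lock()   # guards the critical section
        self._stats_lock = threading.Lock()    # guards the counters below
        self._active = 0
        self.max_active = 0                    # highest concurrency observed

    def deploy(self, pipeline_name):
        # pipeline_name is only a label for the simulated pipeline.
        with self._deploy_lock:
            with self._stats_lock:
                self._active += 1
                self.max_active = max(self.max_active, self._active)
            time.sleep(0.01)                   # stand-in for the real deployment work
            with self._stats_lock:
                self._active -= 1


def run_pipelines(count):
    """Start `count` simulated pipelines that all try to deploy at once."""
    gate = DeployGate()
    threads = [
        threading.Thread(target=gate.deploy, args=(f"pipeline-{i}",))
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return gate.max_active


if __name__ == "__main__":
    print("max concurrent deployments:", run_pipelines(5))
```

However many pipelines race for the deployment step, the observed concurrency never exceeds one, which is the property a deployment job needs.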


"Backlog of bugs grow every year" - that'll be written on the postmortem of "what happened to GitLab?"


> Instead they are focusing on AI when the rest of the features are unfinished.

That’s extremely common now unfortunately.


The key bits:

> we believe that we should stop migrating all repositories unconditionally, and instead keep our two systems: Gerrit and GitLab. Gerrit should remain for the use-case of deeply connected repositories. GitLab should remain for tools, analytics and machine learning, and services.

> GitLab’s missing features are necessary for the productivity of developers, deployment safety, and operational reliability.

> There’s a demand for code hosting outside of Gerrit. Wikimedia Foundation-hosted GitLab has been a boon for these users—tool creators, engineers focused on data and analytics, and folks building services.

(of course, read TFA for the finer details and rationale)


> Without a way to coordinate merges cross-repository, we would see more CI failures and broken builds due to semantically incompatible patches landing on the mainline branch.

Isn't this rather a symptom of code organization or architecture issues? If one needs to coordinate merges of individual parts, doesn't it mean they belong together? Monorepos or clearer versioning of the individual parts would enable independent merges.


Perhaps, but it would require a huge amount of refactoring and reorganizing literally thousands of git repositories. MediaWiki is host to a vast number of extensions which are variously developed by teams within the Wikimedia Foundation as well as individual outside contributors. Each extension might have a totally different development process and release schedule (or almost no process/schedule at all).


Regarding their point on stacked diffs[0]: they're such a great feature imo and it's a shame GitHub etc don't support the functionality better.

Often I'm working on a set of changes that ultimately have a large code diff. I don't think huge PRs are often a great idea as they introduce more risk, so it's often a good idea to break it up. But a reviewer is not always available to review each bite-sized PR, so you end up with a backlog of PRs that are ugly to review in the GitHub UI and you need to continually rebase them as each is merged.

I quite liked the graphite.dev workflow for this, but it's a bit pricey. That and it only seems to work well if you can get your whole organisation to buy into using it. If GitHub etc integrated it as a native feature, I think that'd be great.

[0]: As mentioned in the article, and https://newsletter.pragmaticengineer.com/p/stacked-diffs
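As a rough illustration of the rebasing chore described above, here is a small Python model of a stack of dependent branches. It assumes only the bottom branch can merge, that each branch is based on the one below it, and that the trunk is called "main"; the branch names are hypothetical.

```python
def restack(stack, merged, trunk="main"):
    """Return (branch, new_base) pairs for the branches left after `merged` lands.

    `stack` is ordered bottom to top. Each branch is based on the one before
    it, and the bottom branch is based on `trunk`.
    """
    if not stack or stack[0] != merged:
        raise ValueError("only the bottom of the stack can be merged in this model")
    remaining = stack[1:]
    # The new bottom branch moves onto the trunk; every other branch must be
    # rebased onto the rewritten branch directly below it.
    bases = [trunk] + remaining[:-1]
    return list(zip(remaining, bases))


if __name__ == "__main__":
    for branch, base in restack(["part-1", "part-2", "part-3"], merged="part-1"):
        print(f"rebase {branch} onto {base}")
```

Every merge at the bottom forces a rebase of everything above it, which is the bookkeeping that dedicated stacked-diff tooling automates.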


Some background that I am familiar with because I worked on the team that maintains Wikimedia's Gerrit, GitLab, Phabricator, CI and deployment systems. I left in 2022; however, by then the GitLab migration was well underway.

As far as I remember, and from what I observed, the decision to adopt GitLab was meant to better cater to newcomers and volunteers who generally do not appreciate Gerrit and saw it as a serious barrier to engaging with the Wikimedia software development ecosystem. Gerrit has a pretty steep learning curve and the web interface is pretty ugly (subjective, but this is an opinion shared by many). We got quite a bit of feedback that Gerrit was a stumbling block for new contributors as well as new hires on the Product and Engineering teams. Many folks who have used Gerrit for a long time learn to love it, but newcomers either hated it or found it difficult to adjust to.

So to summarize the main arguments for GitLab (as opposed to "just use GitHub" or various other alternatives which were considered):

* It's ostensibly open.
* It's similar to GitHub in most ways that matter.
* The GitLab CI system is configured in the repo and it's entirely self-service, as opposed to the mess that is Gerrit + Jenkins + Zuul CI. Zuul requires a lot of specialty expertise to configure and maintain, and that places control of CI largely out of the hands of the people maintaining each repo. Self-service is better for the needs of many if not most developers.
* Last but certainly not least, there was a fairly widespread fear that Microsoft would ruin GitHub, along with a strong preference for self-hosted free software tools in keeping with https://foundation.wikimedia.org/wiki/Resolution:Wikimedia_F...


Does it say anywhere why they chose Gitlab? I used to be a big fan of them, but they kind of died for me after they implemented big limitations on the free tier. I stopped being able to Trojan-horse it to my company, and Github got good CI and other nice features, so I've switched to that.


It's not clear if Mediawiki are self-hosting gitlab. If they're relying on Gitlab (the company) then I'd be worried about Gitlab running out of money and going out of business. I had a bit of an interaction with one of their sales people a while back and they appeared desperate. If self-hosting, there would be a bit of uncertainty while development reorganizes but I don't expect any long term problem.


They're self-hosting at https://gitlab.wikimedia.org/explore


I don’t think that Wikimedia is overly concerned about the…free tier limitations that soured you on them in the first place.


Definitely not, since I'm the one who gets asked for money to pay for those tiers all the time.

The two parts of my comment were unrelated, though.


GitHub is my preference too. However, I imagine it is because of how popular GitLab was in 2022. Pivoting mid-migration would suck.


I thought Gitlab put more stuff in the free tier after KDE made that a requirement for moving to Gitlab?


Just set up GitLab yesterday on my homelab server (containerized). It's a Rails beast and takes a while to start, but the product they give away for free seems very polished. I looked at Gitea too before choosing GitLab, mainly for their CI.


Have you tried Gitea Actions? https://docs.gitea.com/usage/actions/overview


I have not, just read that it's a bit fresh at this point. I've not yet invested anything in my GitLab setup and will try Gitea actions before deciding. Are you running Gitea locally? How was it for you so far? Thanks

> "Gitea Actions is still under development, so there may be some bugs and missing features. And breaking changes may be made before it's stable (v1.20 or later)."


GitLab was so great in the early days. But they started chasing after the enterprise market and stopped innovating. They dropped their cheapest paid plan, and features get introduced that no one wants and bugs don't get fixed.

Maybe this publicity will get them to rethink their strategy, but I doubt it.


To properly run GitLab with a large codebase you must have a dedicated DevOps team to maintain an extensive set of webhooks and other linked services. Even in the top-tier enterprise edition.


Gerrit is like old-school Git; the flexibility of GitHub and GitLab far exceeds Gerrit's, in my opinion.


GitHub and GitLab are more polished and support different workflows, but the Gerrit workflow is really quite specific and is mostly unsupported by the regular GitHub/Lab tooling.

I moved from a company using GitHub (and a decade of GitHub based open source experience) to a company using a Gerrit-like workflow, and I can see why people would be resistant to a move. On the surface the options look quite similar, but "at scale" I can see Gerrit working quite well and missing certain equivalents that would make a transition hard.

That's not to say GitHub/Lab aren't better, or that a move might not be worth it for other reasons, but a transition that pushes people out of their comfort zone and reduces their productivity for unproven, potential, future gains, is a hard one to justify.



