
This is why you need to have somebody learn Unix and figure out how to run things locally. Or just kick back and relax when the systems you elected to depend on, but do NOT control, go tits up. It’s really on you.



There's a middle-ground here, which is far more reasonable and seems to be a critical missing piece of OP's company: Disaster recovery. Ask, what's the minimum we need to do to keep the business running when our provider(s) disappear or go down for an extended period of time? And then implement those steps.

When you pay someone else to handle your data, there is a lot that can go wrong. GitHub could go down, they could lose (or corrupt) your repos, they could accidentally delete your account. The nice thing about git is that it's absolutely trivial to clone repositories. There is _zero_ reason not to have a machine or VPS _somewhere_ that does nightly pulls of all of your repos. When Github goes down, you'll lose a lot of functionality, but at least you have access to the code and can continue working on the most urgent things.
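For illustration, a nightly job like that can be as small as the sketch below (the repo URLs and backup path are made up placeholders; in practice you'd pull the repo list from the GitHub API or an org-wide config):

    #!/usr/bin/env python3
    """Nightly backup of remote repos via bare mirror clones. Sketch only:
    REPOS and BACKUP_DIR are placeholders for your own setup."""
    import subprocess
    from pathlib import Path

    REPOS = [
        "git@github.com:example-org/app.git",    # hypothetical repos
        "git@github.com:example-org/infra.git",
    ]
    BACKUP_DIR = Path("/var/backups/git")         # hypothetical target

    def mirror(url: str) -> None:
        """Clone as a bare mirror, or update it if it already exists."""
        target = BACKUP_DIR / url.rsplit("/", 1)[-1]   # e.g. app.git
        if target.exists():
            # refresh all refs and drop ones deleted upstream
            subprocess.run(["git", "-C", str(target),
                            "remote", "update", "--prune"], check=True)
        else:
            subprocess.run(["git", "clone", "--mirror", url, str(target)],
                           check=True)

    if __name__ == "__main__":
        BACKUP_DIR.mkdir(parents=True, exist_ok=True)
        for repo in REPOS:
            mirror(repo)

Run it from cron every night and, even if the hosted UI, issues, and CI are gone, you still have every branch and tag of your code.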

I'm not clear on the details but OP's issue seemed to be around a broken CI system. At its heart, CI is just the automatic execution of arbitrary commands. Every repo (or project consisting of multiple repos) _should_ have documentation for building/testing/deploying code outside of whatever your CI system is. If your source of truth for how to use your code is in the CI system itself, then your documentation is very lacking and yes, you are susceptible to outages like these.
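As a rough sketch of what that fallback boils down to (the make targets below are placeholders for whatever your pipeline actually runs), the documentation can literally be a small script that executes the same steps the CI would:

    #!/usr/bin/env python3
    """Run the same build/test steps as CI, but locally. Sketch only:
    the commands are placeholders for your real pipeline."""
    import subprocess
    import sys

    STEPS = [
        ["make", "build"],     # whatever your CI "build" job runs
        ["make", "test"],      # whatever your CI "test" job runs
        ["make", "package"],   # artifact step, if any
    ]

    def main() -> int:
        for step in STEPS:
            print("--> " + " ".join(step))
            result = subprocess.run(step)
            if result.returncode != 0:
                print("step failed: " + " ".join(step), file=sys.stderr)
                return result.returncode
        return 0

    if __name__ == "__main__":
        sys.exit(main())

If the CI config does little more than call a script like this, the CI system stops being the source of truth, and an outage only costs you the automation, not the knowledge of how to ship.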


CI build processes often require credentials, sometimes ones that are in some sort of twilight zone¹ to devs. IIRC, GitHub doesn't provide a straightforward way to clone those credentials.

¹e.g., the devs "don't" have access to the credentials, except they're in the CI workflow, so technically they do. But I've worked at a number of companies where security will happily bury their head in the sand on that point.


To the extent that that works, it's not really a middle ground, so much as choosing "run it locally" for the things you really need, and "kick back and relax" for the nice-to-have but nonessential stuff.

Because if the things you really need actually keep running when your provider(s) disappear or go down for an extended period of time, you're running them locally anyway, and might as well get the benefits of that effort all the time.


Heck, your entire host could go down, what do you do then? If it's not a major infrastructure provider like Azure, AWS, or GCP (I forget if that's what they call theirs), then you're kind of SOL. Outages can and will happen; the question is, how bad is the next one? If they are too frequent, you have to evaluate whether it's really the provider or your application. If it's the former, you may want to consider a new host, or get with your hosting provider and have them figure out why you keep getting hit.



Our contract is up for renewal in a few days, so we'd likely struggle to pull off self-hosted in that time. Self-hosting has been the watercooler talk, but we're currently migrating away from Circle CI (to GitHub <facepalm>), so it was on the cards for next year.


You might find GitLab as a self-hosted service interesting.

Spin up an Ubuntu machine and install the Omnibus package, and you'll have the basic functionality running in about half an hour, plus another half an hour for the CI runner.


Setting it up is the smallest part of any competent operation. Capacity planning, monitoring, planning for outages and recovery, updates, backups (including testing recovery), and so on are what take the effort. Setup is almost always on the happy path. When things go wrong during setup, you start over. But once you have a substantial investment in terms of data stored, that is unlikely to be an option. Now you need to figure out and solve the actual problem, which likely requires intimate knowledge of how the system works. Nobody acquires that knowledge by browsing the quick-start installer docs.


That already isn't being done for you by GitHub. Invest the time and you will be on a better path.


I used to do ops for a living, so I’m a bit aware of the tradeoff involved - and for me, the math doesn’t work out. We’d need three or four people with knowledge of the setup - I can’t be the only one, since I want to go on vacation from time to time. Others have the same right. Sometimes people fall ill or are unavailable. We could scrape together enough folks and train them, but it’s honestly not something I want to track and invest time in. GitHub has had its outages, but none that would affect us massively.

If you already have internal infrastructure and a moderately competent operations team for that infrastructure, the calculus for you may be different. Blindly assuming that I’m wrong is not a sign that you’re aware of the tradeoffs.


Your logic is unfortunately being cast into the reason-devoid abyss of HN commenters consistently overestimating the value of the lone wolf, "competent" Linux admin.

Say you don't understand opportunity cost in software development without saying it.

I won't dwell too deeply on the obvious: most "competent Linux sysadmins" have a very over-inflated sense of their own skill set, and tend to make for toxic team members.

Most software development shops are in the business of developing their particular software, not deploying and self-managing a DVCS, much less hosting, monitoring, etc.

Sure, could one person set up a Git/GitLab system? Absolutely. Can they operationalize it effectively? Not really... the bus problem is a thing and anyone that thinks tying the entirety of a system's uptime to one individual is an operational improvement over GitHub's outage SLA is deluding themselves.


Just joined a company that self hosts Gitlab (and everything else, there's zero cloud) but is 100% remote. So far everything has been seamless and there's a large enough infra team to solve these issues if they arise :)


As I wrote, it's a matter of what you focus on - I've run CI systems and internal Git hosts for large organizations as part of my work. There are very valid reasons to do so, but cost alone is rarely a compelling one - an on-site enterprise GitLab license is roughly as expensive as a hosted GitHub seat, and the community edition is somewhat limited.

And it's definitely possible to run gitlab or any other git hosting solution on-site with little downtime. There's no magic or arcane knowledge involved. It just takes serious effort to do so - more than a single lone wolf sysadmin can provide. All their skills are worth nothing if they're sick and in hospital or on a beach holiday.


At some point you will need a pack of fierce sysadmins, not the lone wolf, as dangerous as he might be. If you forget to scale your ops team, you're gonna have problems. Guess it's a strategy thing: do I want to rely on a third party, or do I want to manage my own people and processes for this? In any case I have to assess the risks and plan ahead.


> At some point you will need a pack of fierce sysadmins, not the lone wolf, as dangerous as he might be.

Maybe, but also maybe not. And then that still doesn't mean I want them to focus on running git/GitLab. I mean, we're doing stuff that revolves around the Rust compiler, and we have operations people easily capable of running GitLab around, but their primary task is something else - they're building systems on top of that. Do I want to re-task them - or even just side-track them - into running GitLab?

Once you reach a certain size, you can have an internal ops team that's responsible for providing internal infrastructure, but to what extent is that really different from giving GitHub/GitLab money? They'll be about as far removed from the individual teams they're serving as GitHub is. Is that really something I want to put organizational effort into, distracting the org from achieving its goal? It's all tradeoffs.


It takes like fifteen minutes to set things up. Every startup needs a competent Unix sysadmin.

EDIT: a Threadripper will do for CI. Quick as you like.


GitHub is far, far more than just a git repo. Issue tracking, project boards, commit status/check systems, deployment tracking and monitoring, fully fledged CI and deployment pipelines (actions/workflows) written in their own flavor, etc. All sorts of webhooks, complex arrangements of teams and access controls, cross-repo, cross-org, and cross-enterprise account linkages. Large object storage, container registries, and package repositories. And of course, the existing context of all this stuff; setting up an alternative != completely migrating and validating everything from the original.

Replacing all that with something as scalable, flexible and agreeable with potentially thousands of global developers is far more than '15 minutes' of work. Several orders of magnitude more.

Even on the git repo question alone, if you're an enterprise of some size, you'll have hundreds or maybe thousands of repos that could be potentially gigabytes in size (for any one repo) for code alone. Moving to a self hosted solution requires far more than just throwing some threadrippers and enterprise drives at the problem. And that's assuming the best outcomes.

A competent UNIX sysadmin would be the one yelling not to throw the baby out with the bathwater here, because they would know just how hard this stuff is at scale.


Again, a Linux/Unix admin is worth their weight in gold-pressed latinum.

Pop in a self-hosted GitLab install and configure SAML or AD auth for SSO. It's all Git, so importing all commits (and not losing history) isn't hard - just tedious.
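For what it's worth, the tedious part is roughly this per-repo loop (the hosts and group names below are made up, and it assumes the target projects already exist on the GitLab side):

    #!/usr/bin/env python3
    """Move repos to a self-hosted GitLab with history intact. Sketch only:
    source/destination URLs are placeholders."""
    import subprocess
    import tempfile

    # Hypothetical (source, destination) pairs.
    REPOS = [
        ("git@github.com:example-org/app.git",
         "git@gitlab.internal.example.com:example-group/app.git"),
    ]

    for src, dst in REPOS:
        with tempfile.TemporaryDirectory() as tmp:
            # A mirror clone grabs every ref: branches, tags, notes.
            subprocess.run(["git", "clone", "--mirror", src, tmp], check=True)
            # Push every ref to the new home; commit history comes along as-is.
            subprocess.run(["git", "-C", tmp, "push", "--mirror", dst], check=True)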

For the testing pipeline, use Selenium on a 32-core Threadripper running Linux with 1/4 TB of RAM. You can get upwards of 400 headless Chrome instances on that.
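Each of those headless workers amounts to something like this (a minimal Selenium sketch with a placeholder URL and check, not a tuned setup):

    #!/usr/bin/env python3
    """Minimal headless Chrome check via Selenium. Sketch only: the URL
    and assertion are placeholders for real test cases."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")           # run Chrome without a display
    options.add_argument("--no-sandbox")             # often needed on CI hosts
    options.add_argument("--disable-dev-shm-usage")  # avoid the small /dev/shm on servers

    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://app.internal.example.com/login")  # placeholder URL
        assert "Login" in driver.title                         # placeholder check
    finally:
        driver.quit()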

Throw in Node-RED for overall process automation (think: tying together disparate APIs with a low-code environment).

I've done this, with the exception of the Selenium checks themselves (there was a QA team for that), in like two weeks.


150 lbs of gold-pressed latinum costs a lot more than GitHub Enterprise, and for a marginal improvement in uptime. And you need 3 of them if you want on-call, which you should if you're trying to beat GitHub's availability.


It's literally not all git. Every single thing I mentioned outside of the git repository itself is not git, and makes up a significant amount of services that would require disparate, specific replacements and buy in and compatibility with all of the developers, teams and units of a company. It's a vast, extremely costly amount of work.

Just throwing up a server somewhere running git and a few software packages is nowhere near the same thing.


Gitlab has all those non-git features, and then some. Migrating to it from Github might not be easy, but it's definitely worth it IMO to invest in the product that gives you more options, instead of getting locked in to something like Github.


Some deployments might be able to leverage GitHub Enterprise, which is a $231/user/year GitHub-in-a-VM-image. It's pretty much the GH source code (running through a modified copy of Ruby so the on-disk .rb files are scrambled).

https://github.com/pricing

https://docs.github.com/en/enterprise-server@3.2/admin

Active/passive HA is possible: https://docs.github.com/en/enterprise-server@3.2/admin/enter...



