Github major service outage (status.github.com)
123 points by forlorn on June 2, 2013 | 79 comments



I find it somewhat odd that, given git's emphasis on distributed version control, so many git users have centralized themselves quite heavily on GitHub.

Whenever GitHub runs into problems like this, it reminds me of when a team's CVS or Subversion server used to go down. It could be a pretty disruptive occurrence if it wasn't resolved quickly. While git can theoretically handle this better, in practice the use of GitHub, for instance, renders git just as susceptible.


How is it at all odd? Github offers a convenient platform for using git. People use it.

If Github were to explode forever tomorrow, active projects would just take their locally cloned repositories, put them online somewhere else, and carry on committing (albeit sans github's awesome social tools). That's the real power git offers us.

It's just a fact of reality that most projects centrally organize through a few bottlenecks. It's easier for people to remember where to go if there's only one place to go. But if you're really concerned about it, I suppose you could build a tool to automatically sync a github repo with another 3rd-party host if you wanted (or vice versa).
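A minimal sketch of such a sync tool, assuming a hypothetical backup host at backup.example.org:

    # one-time setup: keep a bare mirror of the GitHub repo
    git clone --mirror https://github.com/user/repo.git
    cd repo.git
    git remote add backup git@backup.example.org:user/repo.git

    # run periodically (e.g. from cron): refresh from GitHub, mirror to the backup
    git fetch -p origin
    git push --mirror backup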


Yes, but Github is not just about Git hosting, it's about all those awesome social tools. If you use Issues as the main bug tracker, for example, you've got a problem.


Please point me at your distributed issue tracker project. I'd love to contribute!


At KDE we use Bugzilla with automatic status emails to various mailing lists and projects.

So even if bugs.kde.org were to go down, we'd still be able to fix code at git.kde.org, and use the lists.kde.org mailing list archives to look up bug details as a backup. We'd probably use a mailing list system that doesn't suck, like kde.markmail.org or GMANE, for that last part, but that's just shaving the yak...


Fossil includes distributed issue tracking: http://www.fossil-scm.org


http://bugseverywhere.org/ does distributed bug tracking.


I've often wondered if we could store issues in a separate branch.
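A rough sketch of how that could work with an orphan branch (the file names and messages are made up):

    git checkout --orphan issues    # new branch with no shared history
    git rm -rf .                    # start from an empty tree
    echo "Crash on startup" > issue-001.md
    git add issue-001.md
    git commit -m "issues: open #001"
    git push origin issues          # issues now travel to every remote you push to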


Also, http://www.onveracity.com/ offers distributed version control, distributed tickets, and a wiki.


mpyne mentions bugs.kde.org's mailings as a fallback, but GitHub e-mails issue updates too.


Yes: there is nothing wrong with using Github, only with using Github as your sole git remote. Git (in theory) makes it easy to use multiple servers. Add some scripts/utilities and you get Github with all their "social" stuff, wikis and issue trackers, plus higher availability: if Github goes down, just use your other remotes.
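For example, a tiny script along these lines pushes the current branch to every remote you have configured; nothing about it is github-specific:

    #!/bin/sh
    # push-everywhere: push the current branch to each configured remote
    for remote in $(git remote); do
        git push "$remote" HEAD
    done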

Is there a git tool to share your remotes in a repository?

One could use a distributed issue tracker like "Bugs Everywhere"[1] or git-issues[2]. Are there ways to "sync" them with Github's issues?

[1]: http://bugseverywhere.org/ [2]: https://github.com/jwiegley/git-issues


"Add some scripts/utilities and you get Github with all their "social" stuff, wikis and issue trackers ..."

It's crazy how GitHub's entire product is so easily marginalized by comments like this. I don't know if you meant to do it, but I think it is a serious problem with hacker culture. It's the kind of thinking that tricks startups into "knowing" they can do a better job than established competitors in spaces they know close to nothing about, because they "know" they can execute better. I speak from personal experience here.

I can guarantee that no amount of "adding some scripts/utilities" will get near what GitHub's service offers.



I'd disagree. On paper Github has a nice suite of services, arguably best-in-class for some of them, but that doesn't make them untouchable. Github is only the latest in a line of code-hosting services.

Just because you tried and failed doesn't mean every try will fail.


You may have misunderstood the comment (or I have).

I don't think they were saying that a few scripts and utilities would get you what github offers. I think they were saying that with some scripts you could automatically fail over to another mirror when github goes down, then switch back when it's up (updating the repo on github when it's available again).


> Add some scripts/utilities and you get Github with all their "social" stuff, wikis and issue trackers


That was indeed badly phrased. I meant some scripts to manage multiple remotes and sync (sets of) them between your different team members. So you could have your code (and maybe dumps of Github's issue tracker) on your own server and still use Github when it's up and running.


Sometimes you need to read the entire sentence for it to make sense.

The word "add" here is important, as it's talking about having github and something else. So you get github (with all their social stuff...) AND what the scripts add, which is more reliability.


IMO Github doesn't break often enough to make the effort of adding `some scripts/utilities` worthwhile.


That's not true: with CVS/SVN you cannot commit, log or do anything while the server is down. With git you can still commit to your local tree, push it to another server, or send patches. While it might slow down workflows that use github.com for push/pull, it's not that complicated to temporarily push/pull via another server.
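The patch route needs no server at all. A rough sketch (the file name is illustrative):

    # bundle up everything not yet on GitHub's master
    git format-patch origin/master --stdout > outage-work.patch
    # mail the file to a coworker, who then applies it with:
    git am outage-work.patch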


Playing devil's advocate here... but if you were so inclined you could accomplish the same with SVN.

Creating and hosting a repo locally (or offsite) is trivially easy, and though it might not be as friendly or natural as it would be with git on a daily basis, it's not impossible either.


So if four people set up local svn servers and then commit to those servers during the outage, what happens when the outage is over?


I've already said it wouldn't be easy, so I'm not sure what your point is.


"I find it somewhat odd that, given git's emphasis on distributed version control, we see so many git users have centralized themselves quite heavily on GitHub."

It isn't odd. Almost nobody understands distributed version control, and fewer still actually need it.

A big part of the reason that GitHub is popular is that git is way too complicated for most people to use. They don't understand the CLI or the concepts, and need a shiny web UI to abstract the (admittedly terrible) interface.

Most people need branching, merging and committing to a canonical repository. Even the groups that could theoretically benefit from distributed repos (larger, far-flung organizations) tend to be hamstrung by the complexity they introduce, and therefore in practice tend to set up a small number of canonical repositories... just like every other version control system, but more complicated and confusing.


What?

    git remote add my_other_server_that_is_not_github git://my.oth.er/server/that/is/not/github.git


The problem is mostly that people fail to actually do this.


Until github goes down, and then they run `git remote add friend ssh://coworkers-workstation//path/to/repo` and push code to each other until github comes back up.


Well, pushing to someone else's local copy is a little harsh, since that copy probably won't be bare. Also, git will slap your wrists if you try to do that. Instead, the coworker should pull from you. This is why GitHub calls them pull requests.
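The pull direction needs nothing on your end beyond ssh access, and works fine against a non-bare repo. A sketch with a made-up host and path:

    # run by the coworker, pulling straight from your working repository
    git pull ssh://your-workstation/home/you/project master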


Or you can push to a different remote branch, and then they can merge it with their master.
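A sketch of that flow, reusing the "friend" remote from above (the branch name is arbitrary):

    # you: push your master to a branch that isn't checked out on their end
    git push friend master:refs/heads/from-alice

    # coworker, in their repo:
    git merge from-alice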


You can also add multiple URLs to your remotes. When you push, it'll go to all of them. I do this at home with a tiny server on a mini-ITX box and github. Set it and forget it.

There's no reason you can't do the same thing with a small box in the office and get the best of both worlds.
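In git config terms that looks roughly like this (the URLs are examples):

    # note: the first --add supplants the implicit push URL,
    # so list the original URL explicitly as well
    git remote set-url --add --push origin git@github.com:user/repo.git
    git remote set-url --add --push origin git@office-box:repos/repo.git
    git push origin master    # now goes to both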


Is it that simple? Just pony up another $9.95 for a VPS and you've got git access to your repos whenever GitHub is down.


Even worse are other tools that rely on Github, like Homebrew (the Mac OS X package manager), which breaks in various silly ways when Github is down.

Anyone happen to know a way around this specifically? Trying to install some stuff and brew just bails after getting 5xx from github.com.


Clone the homebrew repo. Use a cronjob or Jenkins or some such to periodically fetch updates to a server under your control. Modify homebrew and its setup script to use your repo (it's been a while since I did this, but the Github URL for the homebrew repo used to be hard-coded in one of homebrew's ruby scripts). Whenever setting up a new machine, use the modified setup script.

Periodically you can merge upstream into your clone to get updated recipes (e.g., the aforementioned cronjob can mirror master from github to remotes/upstream/master in your repo, then separately you merge that into your master).
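A rough sketch of that cron job (remote names and paths are hypothetical; the merge into master stays a manual, reviewed step):

    #!/bin/sh
    # refresh the upstream remote-tracking branches on your mirror server
    cd /srv/mirrors/homebrew
    git fetch upstream    # upstream = the homebrew repo on GitHub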

Your servers are now immune to github outages[1] and you can review recipe updates before your servers update to those recipes.

[1] unless of course the recipes you're using are hosted on github. But you could recursively mirror those sources too and modify the recipes as needed.

edit: I'm on a mobile device but I can provide further details the next time I'm in front of a keyboard in case this wasn't clear.


> unless of course the recipes you're using are hosted on github

Yeah, the problem I was having at the time was that Homebrew was trying to fetch a tarball hosted on Github. I was just whinging though; that was the only time I've ever hit that particular snag.

The system you propose would be great, but more work than switching to running headless Linux VMs on OSX for most dev work ;) This is something I have been drifting toward for a while now, using Vagrant.


Try ports, perhaps? http://www.macports.org/


To be fair to Homebrew, the reliance on GitHub is probably what makes it so accessible: easy for others to add formulae, and no hosting (and lessened security considerations) for them to worry about.


Do you have any info or anecdotes about work actually being disrupted because GitHub was unavailable?



Or for the HN friends without Javascript, http://isup.me/github.com.


Least reliable service that I pay for by a long shot.


Must not pay for very many services? https://status.github.com/graphs/past_month


To be fair, 99.77% uptime isn't very good.


That isn't fair at all.

It's 99.770% for a single month, immediately following a major event. If you sampled yesterday (or tomorrow, assuming no further issues) it would be higher. If you just look at today, it's much lower, at 95.871%. If you assume no availability issues for the last 12 months (not true, but the point remains) then it's 99.981%. During an actual outage, availability was at an unacceptable 0%.

Unfortunately they don't provide 12mo stats, which is what you typically want if you're going to start calculating nines of availability.
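For scale, here's what those percentages work out to (rounding slightly):

    99.770% over a 30-day month   =  0.230% of 43,200 min   ~  99 min of downtime
    99.981% over a 12-month year  =  0.019% of 525,600 min  ~ 100 min of downtime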


Hey, are you seriously defending 3 9's of uptime? That's abysmal.

Github, if they're honest about their 12-month uptime levels, would be lucky to be a single-9 service. Their uptime is Terrible with a capital T. But you know what? Until there's something better, everyone is going to keep using them, right?

Great services with values that are hard to find become damn near irreplaceable even with terrible uptime. This is an obvious place to compete; if you made a github clone that simply stayed online, you could win market share during every downtime. However, cloning github would not be trivial.

And therein lies the problem and the answer to why we accept their terrible uptime levels. They give us something we can't get elsewhere: social coding and easy centralization.


Are you seriously incapable of distinguishing "hang on, you are getting numbers that look bad using statistical chicanery" from "Github = teh awesome"?

Pointing out that someone whose point I agree with is using bad math as evidence is not disagreeing with the point; it's asking that people who agree with me behave like honest, civilized human beings. I don't care that you've already gone through the hassle of getting your pitchforks out of storage.

Speaking of which... your accusation that they are lying means that Github has had nearly 37 days of total outage this year: that they're down for two and a half hours a day, every day, for a year straight. And by honest, I assume you mean "they are lying", as opposed to "they are using a different definition of uptime than I would like." Naturally, you have some evidence for these claims, right?


Lying is a bit of a strong word. I think calling anything a single 9 service should be taken in jest. It's pretty hard to be down almost 40 days and still be in business.

Also, technically, something with 98.9999% uptime would still be a single-9 service...


I agree with the rest of your comment, but... a single-nine service? You think they have 36.5 full days of downtime yearly, 3 full days monthly, 16.8 hours weekly, or 2.4 hours of downtime every day? That's certainly not the case.

Not to mention that often when they have issues it only affects a subset of customers.

https://en.wikipedia.org/wiki/Nines_(engineering)


3 9s is abysmal?


It's about 1/4 of the downtime my company's little project server has had in the past month, and that's with SVN, not git, which means workflow was even more seriously disrupted.


Have you tried Bitbucket? While it lacks a lot of the social integration, it has the benefit of giving you 5 free private repositories.


That is false: BitBucket gives you an unlimited number of private repositories, with a maximum of 5 collaborators.


I didn't even know it was down haha. I've been pushing to a private GitHub repo all morning with no problems.


Database outage causing everything to go down?


Imagine that. A site dependent on a database.


That and their wording suggests a single db server :/


They have a typical master/slave setup with memcached in front. http://www.slideshare.net/err/inside-github


Since they're using MySQL, is there a technical reason (as opposed to a historical or lack-of-time reason) they're not using Galera Cluster?

I'm in the process of migrating an existing datastore to MariaDB+Galera, and so far it seems like everything I could hope for in a clustered RDBMS.
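For the curious, the Galera side of such a setup boils down to a handful of settings. A minimal my.cnf fragment (the node addresses are hypothetical):

    [mysqld]
    binlog_format            = ROW
    default_storage_engine   = InnoDB
    innodb_autoinc_lock_mode = 2
    wsrep_provider           = /usr/lib/galera/libgalera_smm.so
    wsrep_cluster_address    = gcomm://10.0.0.1,10.0.0.2,10.0.0.3
    wsrep_cluster_name       = example_cluster
    wsrep_node_address       = 10.0.0.1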


Last I investigated Galera, it lacked support for query caching. Over 50% of our queries are cache hits, which made it hard to justify using Galera over a normal master+slave setup. However, I could see it being useful for setups where a single server can't handle the load (we average 300 queries/sec on a single server with lots of room to spare).


They still disable the query cache, but MySQL's query cache generally isn't considered all that great a thing anyway, so few people care. You're better off making judicious use of Redis or memcached.

The biggest win for Galera is high-availability that actually works with minimal effort. (I've never experienced a high-availability solution not based on multi-master/all-nodes-hot principles that didn't cause more problems than it solved.)

They also claim some scalability wins at the front end, but I haven't really tested that, and am content with the performance not being terrible.


> but MySQL's query cache generally isn't considered all that great a thing anyway

You've never had to prime a query cache on a MySQL server, have you? :)


Of course not. I use a caching layer with lower overhead that doesn't invalidate the entire cache when a single record changes. The query cache just isn't competitive.


The status page says that "Code Downloads" is "Normal". How do you get to the downloads page for a given repo? Is there some canonical form for a URL for repos? Maybe while they're fixing it we can still download code...


try:

  git://github.com/{user}/{repo}.git
or

  https://github.com/{user}/{repo}.git
Anyway, not working right now for me (Italy).

EDIT: working, give it 30 seconds to start.
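If it's tarballs you're after rather than clones, there's also the archive endpoint (assuming that's what the status page means by "Code Downloads"):

  https://github.com/{user}/{repo}/archive/master.tar.gz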


Nope, no joy - 30 seconds or otherwise.


We mainly used github, and after one of their last updates our entire repository went blank. Nada: no files, no wiki, nothing... We emailed support and tweeted at their account. After one week, someone replied asking "is the problem still there?" I thought that was funny in terms of how immature this company is. They eventually fixed it after about 2 weeks by restoring a database backup.

Anyways, Github is nice. Use it with caution. It's not a full-blown, Enterprise-grade service yet.


Looking through the past few months of messages on http://status.github.com/messages, I'm really quite surprised at how many service outages, major or minor, are reported. It seems something goes wrong more days than not.


A number of ticketing systems, like Unfuddle and Assembla, include free git repos. Personally I prefer Github, but if you're already using such a system, you might as well utilize its git hosting (for a backup, if nothing else).


Same for me from San Diego, CA. I am unable to push and unable to access the main page. Getting "This page is taking way too long to load."


I'm getting a 500 (from NYC) on the main page, and I can't push to remote repos.


From here (NE UK) everything is down.

Typically, I was in the middle of setting up a development environment on a new machine with a lot of composer dependencies...

Oh well, I'll have to go drink coffee in the sporadic sunshine :)


It's on any page; try visiting any repository, wiki, etc. Same story.


Checked from India, everything seems to be down...


It's back up for me.


Why Unicorn Why!


So, what's better than Unicorn?


Two unicorns.



Phusion Passenger Enterprise: https://news.ycombinator.com/item?id=5662569


Passenger :)



