It's interesting that the first half of the explanation is largely quotes from a prior HN post's responses.
Does anyone else feel a bit weird seeing offhand comments getting quoted in the explanation for a business decision? I guess we should all get more accustomed to our public input carrying weight in the zeitgeist.
GitLab's "develop in the open" nature really shows through here. I am not saying that's bad, it's just so different from most startups and established businesses.
I don't think they were using this as justification or explanation in their eventual reversal. I think they provided these quotes as interesting tidbits. Kind of like pull-out quotes on an article. You can ignore them entirely and still get the gist of the post.
On a more general note: It's got to be incredibly hard to do what GitLab does with their extreme transparency. I feel like we have to be careful about reading too deeply into things and nitpicking their culture or process. HN is full of "expert" advice, much of it being terrible. They weighed their options, invited feedback, then made a decision.
I appreciate what GitLab does in being so transparent. None of us are owed explanations or insight into how they operate, yet they go out of their way to provide it. Kudos to sytse and his team!
Thanks. The most useful advice was shared in private. People came to our office to share war stories of regretting a move to bare metal. I also received direct messages via Twitter from dozens of people who couldn't share their stories publicly. Some of the best advice is stuff that we can't share publicly. For example, a major company went bare-metal and then spent a lot of time setting up an authorization system that you get for free with AWS IAM.
Did you have to sign an NDA? Because if people send you random advice via email after reading a blog post, then this sounds to me very much like "publicly sharing" their experience...
I disagree. It's a private email. GitLab can ask permission to post, but it would be impolite and perhaps unethical to assume a private email is now "public". If they wanted it public, they could have chosen to provide a comment on HN, Twitter, etc.
Exactly, everything is private unless it was posted in public by the author or explicit permission was given. We have a transparency value, but we understand that other organizations are different. And even we assume that our private communication will stay private.
Apparently, according to sytse, some of the best arguments came from emails. He could have put those arguments, anonymised and trimmed down to the main points, into the blog post. That would have respected the company's privacy while still sharing good arguments, unless there was an NDA prohibiting that.
So I disagree with you, because you look at this very black and white while the reality is grey.
We quote a lot of responses from HN but that doesn’t mean it’s the only thing that led us to this decision. I was personally involved in a lot of private conversations about this, with the executive team, with investors and with people who had gone through this exercise before, and they all had as much if not more impact on our final decision than the HN thread did. It’s just harder to quote them than referencing additional (& similar) opinions noted in a public forum.
That is good to hear. My comment was not meant as a negative criticism of your process, although I can see how it could be interpreted that way. It was intended more as an observation about the openness that Gitlab brings to the process of design and decision.
Well... Here is something very interesting. Myself and others were "cautioning" very heavily against moving to self-hosted gear. It wasn't until their CTO talked to somebody in person he had respect for that they flipped over to staying in the cloud.
This to me calls into question the entire idea of crowd-sourcing decisions and plans like this. It was a single person who moved the needle, and now there is retroactive agreement with (and quoting of) the people who were against it from the start. If it had gone the other way, there would be plenty of quotes to pull from people supporting the move to dedicated.
Recommending that someone build in the cloud is an easy and safe recommendation to make. Therefore, more people will give that answer when asked what you should do.
Pretend for a second you plucked someone from a royal family who had never prepared their own food, and they asked you how to make food. Are you going to recommend they purchase land, start a farm, grow their own vegetables, and raise their own cattle? Hell no. You tell them to go to the grocery store (cloud) and, if they are desperate, they can purchase some pre-made food at restaurants (SaaS providers).
When you don't know anything about a person's exact needs or abilities, making the safest recommendation is easiest, and therefore it proliferates.
I hate the cloud but I know how to do what I want on my own without it. You don't know that about other people.
I don't think the cloud is the grocery store. It doesn't give you a pre-packaged solution that you can just rip open and immediately access.
Rather, the cloud is like leasing the land from an owner. You still have to till, plant, maintain, and harvest, but you have to pay an annual rent to do so. There are a few things that the landowner is responsible to provide, but most of the day-to-day maintenance responsibility is on the lessee. People generally recognize that while renting is sometimes necessary, it's much nicer to be an owner.
We really need to stop this belief that the cloud is a magic thing that can fix your problems. It splits out the responsibility for actual, real-life hardware maintenance to a hosting company, but other than that, the responsibilities, including the responsibility to take backups, set up security, and design a fault-tolerant application, remain in the cloud user's court.
Cloud is just a way to rent a virtual server; you still have to make the server work correctly.
I think most cloud providers go beyond that. For example, creating/restoring snapshots with a UI, monitoring/automation tools, large scale DB services, etc.
I rent dedicated servers from a large-scale provider. They're really good at hardware and networking, but that's all they do (well, that and some OpenStack S3 equivalent). I have to do my own Icinga monitoring, Ansible automation, etc. For the type of product I run, it's fairly simple. It's maybe not as cheap as owning the hardware, but I think it's a good compromise.
Not to mention that when everyone is using that same tool from Amazon, at some point it gets big and clunky and its development stalls because people rely on it never changing. Then other options out there eventually leapfrog it. (I strongly dislike the Google Cloud UI; I waste so much time when I have to use it for one of my clients that has stuff there.)
Snapshots can be created and restored with bare metal if you install the right software. Only basic metrics are monitored by AWS by default (have to pay more for more detailed monitoring) and alarms have to be specifically configured, as I recall. Note that "bare metal" doesn't necessarily mean non-virtualized; the end user can be running VMWare or Xen on owned hardware and get "cloud-style" flexibility and goodies like automatic monitoring.
DB services are a different class of problem from bare metal vs. cloud application servers, and cloud DB services can be used whether you run your application on bare metal or cloud instances (but, for the record, the same kinds of things mostly apply here, in that the user still has to do their own security and configuration).
When I evaluate any tech project for use in production, I always give HN a cursory search for materials. It's not always fruitful, but in the cases where I stumble across something big in the HN space, I pick up valuable intel here...
I still try to research elsewhere too, just to cover my bases, but HN is quite helpful as an archive of knowledge. Especially knowledge about running something in production.
No surprise here. Honestly, their team seems to be lower-skilled or less experienced than I'd have thought.
I would never approve a production system as expansive as Gitlab's to only have two databases in a cluster. That is asking for trouble, and any {sys,db}admin worth their salt will tell you the same. As soon as you need to do anything on one database, you've just lost your cluster policy.
There is also the lack of automation, especially around validating db backups and failover. Not having the failover process scripted and tested is _begging_ for a nightmare at 2am where you're reading documentation on how to fail over a db.
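To make the backup point concrete, here is a minimal sketch of the cheapest kind of automated check: does the backup file exist, is it non-trivially sized, and is it recent? The function name `check_backup`, its parameters, and the thresholds are all hypothetical, and this is only a sanity check. It is not a substitute for actually restoring the backup and querying it.

```python
import os
import time


def check_backup(path, max_age_seconds=24 * 3600, min_bytes=1024, now=None):
    """Return a list of problems with a backup file (empty list = no problems found).

    All thresholds are illustrative assumptions, not recommended values.
    """
    if not os.path.isfile(path):
        return [f"{path}: backup file is missing"]

    problems = []
    stat = os.stat(path)

    # A silently failing backup job often leaves an empty or tiny file behind.
    if stat.st_size < min_bytes:
        problems.append(f"{path}: only {stat.st_size} bytes, expected at least {min_bytes}")

    # A backup job that stopped running leaves a stale file behind.
    current = time.time() if now is None else now
    age = current - stat.st_mtime
    if age > max_age_seconds:
        problems.append(f"{path}: last modified {int(age)} seconds ago")

    return problems
```

Run from cron or a monitoring system, a non-empty result would page someone long before the backup is actually needed.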
Something as simple as having the hostname and $PS1 state the machine's purpose could have stopped this. In my setups, all prod machines have a bright red PS1 and a clear name of the form <type>-<service>-<prod/dev/etc>-<region>-<dc>.corpnet.
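As a rough illustration of that naming scheme, the sketch below parses a hostname of that shape and builds a bash prompt string that is red on prod and green elsewhere. The function names, the example hostname, the choice of green for non-prod, and the assumption that no field contains a hyphen or dot are all mine, not part of anyone's actual setup.

```python
import re

# Assumed shape: <type>-<service>-<env>-<region>-<dc>.corpnet, no hyphens inside a field.
HOST_PATTERN = re.compile(
    r"^(?P<type>[^-.]+)-(?P<service>[^-.]+)-(?P<env>[^-.]+)"
    r"-(?P<region>[^-.]+)-(?P<dc>[^-.]+)\.corpnet$"
)

# Bash prompt escapes: \[ ... \] wraps non-printing characters, \033[...m sets the color.
RED = r"\[\033[1;31m\]"
GREEN = r"\[\033[1;32m\]"
RESET = r"\[\033[0m\]"


def parse_hostname(hostname):
    """Split a hostname into its named fields, or raise ValueError if it doesn't fit."""
    match = HOST_PATTERN.match(hostname)
    if match is None:
        raise ValueError(f"hostname does not follow the naming scheme: {hostname!r}")
    return match.groupdict()


def build_ps1(hostname):
    """Return a PS1 value: bright red for prod machines, green for everything else."""
    env = parse_hostname(hostname)["env"]
    color = RED if env == "prod" else GREEN
    # \u = user, \w = working directory, \$ = '#' for root and '$' otherwise.
    return f"{color}\\u@{hostname}{RESET}:\\w\\$ "


if __name__ == "__main__":
    # Hypothetical hostname, for illustration only.
    print(build_ps1("db-postgres-prod-useast-dc1.corpnet"))
```

The printed string could be exported as PS1 from a shell profile, so the environment is visible on every command line before anything destructive gets typed.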
All of this is reflected in their discussion style, meeting style, etc. Ad-hoc, not very carefully designed, random off-hand comments. Obviously a young team with a lot to learn. Nothing wrong with that, but a lot of customers are relying on their skills! Learn quick!
> I would never approve a production system as expansive as Gitlab's to only have two databases in a cluster. That is asking for trouble, and any {sys,db}admin worth their salt will tell you the same. As soon as you need to do anything on one database, you've just lost your cluster policy.
I do agree with you that it reflects poorly on GitLab to have only a primary and a replica, with broken backups.
BUT, they are a startup, and they need to be laser focused on growing the company and securing funding for the next quarter so they can keep the lights on.
This kind of pants-on-fire growth in a startup often (always?) comes at the cost of redundancy and best practices. If you stop to make your platform bulletproof instead of building the new features you promised customers/investors, you die.
I'm not saying this is an excuse for them to permanently shrug off making their platform more redundant and engaging in best practices. But, as with many startups, they're focused on delivering features as fast as possible to grow their user base.
I think, and hope, that their recent outage has been the experience they needed to prioritize their shift toward more redundancy and best practices.
"BUT, they are a startup, and they need to be laser focused on growing the company and securing funding for the next quarter so they can keep the lights on."
Having run and sold a few startups and turnkey operations, let me tell you - if you don't focus on your core product first and foremost and demonstrate the utmost competency in it, you're just pissing money away and are likely to fail.
> if you don't focus on your core product first and foremost and demonstrate the utmost competency in it, you're just pissing money away and are likely to fail.
I don't have direct experience running/selling a startup, but I have worked for several as an employee.
In my experience, the product is important, but not necessarily the deciding factor in whether the company succeeds. Most of the success is down to the management team (e.g. CEO, CTO, CFO) being able to sell the company (and product) to investors during funding rounds.
Doesn't matter how good the product is, if management can't pitch it to get funding you're dead.
Obviously it is important to have a good product, but just look at Uber for example. Losing billions of dollars per year, with no clear path to profitability, and yet they're valued at something like $60B. Great sales/marketing by their management to investors!
"Doesn't matter how good the product is, if management can't pitch it to get funding you're dead."
Good products literally fund themselves and don't need outside investors. Just like drugs sell themselves, a good product will sell itself without the need for anything like marketing or investors. The money will naturally come.