You are completely right. There are reasons to oppose the cloud, but maybe they should focus on improving their systems before moving out of the cloud. At this point it is clear that GitLab lacks the talent to run everything themselves. I mean, all five backup mechanisms worthless or lost? You can't let interns write your backup system. After all, backups are a large part of their product.
The worst part of the whole episode, even worse than 'deleted the active database by accident', was '(backups) were no one's responsibility'. This is not an oversight by an individual engineer, but an aspect of the management and company culture. It shows they lack processes derived from requirements. Lots of introspection required from GitLab at this point.
Yes. This should be treated as a serious management failure. Blame does not lie with the individual who made a simple mistake; it lies with the supervisory structure that allowed simple mistakes such as this to result in major data loss (and, as discussed in yesterday's thread [0], has made a series of other serious strategic mistakes that have likely caused them to end up with such inadequate internal hierarchies).
Something like this is not a mere oversight on the part of technical leadership; it's either negligence or incompetence. Whoever is responsible for GitLab's server infrastructure should be having very serious thoughts right now.
Smaller companies that cannot afford enough senior, capable technical people, for whatever reason, benefit greatly from the cloud. One master, no read partitions, weird backup policies, and the saviour of the day was some engineer's lucky manual snapshot. That sucks. It's better for people to start with the cloud and move to self-managed infrastructure only when they are really confident.
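For what it's worth, the fix doesn't need to be exotic: even a small script that takes a dump and then actually verifies it would have caught the silent failures. A minimal sketch, assuming PostgreSQL and the standard pg_dump/pg_restore tools (the backup directory and database name below are hypothetical):

    #!/usr/bin/env python3
    """Sketch of a verified nightly backup, assuming PostgreSQL.

    The point is that a backup job should fail loudly if the dump is
    empty or unreadable, instead of silently producing nothing.
    """
    import subprocess
    import sys
    from datetime import date
    from pathlib import Path

    BACKUP_DIR = Path("/var/backups/postgres")  # hypothetical location
    DB_NAME = "gitlabhq_production"             # hypothetical database name

    def main() -> int:
        BACKUP_DIR.mkdir(parents=True, exist_ok=True)
        dump_path = BACKUP_DIR / f"{DB_NAME}-{date.today().isoformat()}.dump"

        # Take a custom-format dump so pg_restore can inspect it afterwards.
        subprocess.run(
            ["pg_dump", "--format=custom", "--file", str(dump_path), DB_NAME],
            check=True,
        )

        # A zero-byte file is the classic silent failure mode.
        if dump_path.stat().st_size == 0:
            print(f"backup is empty: {dump_path}", file=sys.stderr)
            return 1

        # Verify the archive is readable by listing its table of contents.
        subprocess.run(
            ["pg_restore", "--list", str(dump_path)],
            check=True,
            stdout=subprocess.DEVNULL,
        )

        print(f"backup written and verified: {dump_path}")
        return 0

    if __name__ == "__main__":
        sys.exit(main())

Run it from cron and alert on a non-zero exit code, and at least "the backups are empty" stops being a surprise you discover during an outage.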
It's worth noting that, compared to a good number of recent-ish startups, GitLab now has (I believe) more than 160 or so employees. Someone could have owned a recurring task to work on backup processes (and I imagine someone, or more likely multiple people, will now).