Gandi loses data, customers told to use their own backups (gandi.net)
447 points by webrobots on Jan 9, 2020 | 372 comments



Whoops, so much for the "no bullshit" policy.

I stopped using them a while ago, but for a different reason. I used to use their website to check availability/whois for domains that I was interested in buying. If a domain was available I didn't buy it right away; I waited until I had finished the website, app, or whatever I was going to put there, which obviously took me a few months. Several times, when I was finally ready, the domain had already been sold to someone else. This happened five times over six or so years. Now, I know, "someone else could have thought of the same thing", but I find it very hard to believe that it happens so often. These domains were somewhat niche words that were not hot topics at the time, some of them using fairly uncommon TLDs (like .one). Another weird thing is that they were always registered to someone living or doing business in India, and the site was always a fairly simple landing page with a "contact me" link. I'm a bit superstitious so I don't think it was a coincidence.

Now, I don't think this is a GANDI problem per se, but my theory is that they share this information (who is looking for which domains) with marketers or something like that, or maybe it was a rogue employee trying to make some money squatting domains. I would have expected this from BigDaddy or similar sharks, but from a company whose motto is "no bullshit" I had much higher hopes. Anyway, I decided to move (to namecheap if you're wondering) and surprisingly the problem went away.


Domain registrars giving away domains to squatters when people search for them is a time honored practice. The advice I've seen is to not search for the domain first, just register it outright from the start. The domain registration business is run with all of the integrity and customer focus of TicketMaster.


Really? I may have just found myself a new hobby. Search for incredibly unique (and worthless to me) domains to see if I can get people to squat on them. Heck, it could be a game ... I could get all my friends to make bingo boards ... or maybe see if I can think of some scrabble like rules.


The squatters do a thing called "Domain tasting".

They buy domain X and then they put generic advertising, maybe keyword based, on a cheap bulk hosted site. They measure for a few days - is this bringing in lots of revenue? If not, they cancel the purchase, using a "grace period" available to users of the registry in case of mistakes - the purchase is unwound and they are refunded the fees. Domain X is now available again.

In principle this is forbidden for major TLDs but it's still possible and unscrupulous vendors help them do it, although it may now attract a fee if you do it often enough that the TLD registry detects you tasting.

https://en.wikipedia.org/wiki/Domain_tasting explains about this and related practices.


There are slight transaction costs of $0.18 - $0.20 that are not refundable. Minuscule, but it prevents some domain tasting at scale.
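
To put a number on "at scale", here is a minimal Python sketch. The fee range is the one quoted above; the volume of one million domains and the function name are made up purely for illustration.

```python
def tasting_cost(domains_tasted: int, fee_per_domain: float) -> float:
    """Total non-refundable cost of registering and then cancelling domains."""
    return domains_tasted * fee_per_domain

# Hypothetical volume: one million domains tasted and returned.
low = tasting_cost(1_000_000, 0.18)
high = tasting_cost(1_000_000, 0.20)
print(f"${low:,.0f} to ${high:,.0f}")
```

Under those assumptions the fee alone runs to six figures, which is why a per-domain charge of a few cents can still discourage bulk tasting.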


I suspect a human gets a dump of them and decides which to pay the $10 for.

For example, asdasdahbdajsdbajdbhsbdahsdd.com... not worth the $10

ireallylikechicken.com... maybe worth the $10?

(ireallylikechicken.com is available, go squat it and get rich)


Someone did buy it!

> Creation Date: 2020-01-09T19:37:01Z

Registered with Gandi, ironically enough...

And the domain is meta! Response from HTTP GET

> Location: https://news.ycombinator.com/item?id=22001822


er.... was it you?


It was not


Hilarious


ireallylikechicken.com owner.

I'll give you 20 for it?


I’ll give $1 for a 0.01% share.


I'll buy an option to take a 0.01% share in 12 months for $0.50


Only a 100% markup? I am sure it will be worth more than that in the future ;)


That’s what they said about Bitcoin last year


>That’s what they said about Bitcoin last year

Bitcoin price in USD on January 8 2019: $4004.12

Bitcoin price in USD on January 8 2020: $8045.51
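
For anyone checking the arithmetic, a quick Python sketch using only the two prices quoted above (the function name is just for illustration):

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100

change = percent_change(4004.12, 8045.51)
print(f"{change:.1f}%")
```

That works out to a little over 100%, i.e. roughly a doubling.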


But Bitcoin has increased by 100% over the past year.


I am sure that, in the past, one of the domain registrars took the liberty of deliberately registering the domain you searched for, so that you had to go through them to get it later on. I can't remember who it was, but it was an automated process.

I know it was stopped very shortly after people complained, but it goes to show that it has been done before.

Getting rich off domains - sounds like a solid business plan!


I've got no evidence except anecdotes, but I've heard claims that GoDaddy used to do this a lot. For this reason I've only searched a domain I was going to immediately buy, otherwise I'll do a whois lookup to see if it's registered.


Just looked about and found these:

"Does GoDaddy register domains you search?" (2011) https://news.ycombinator.com/item?id=2326790

Within this, was the comment that reminded me which company it was I was talking about (https://news.ycombinator.com/item?id=2327152) - Network Solutions used to do this - and there is a Wiki link in there which gives the process a name! TIL! :)

https://en.wikipedia.org/wiki/Domain_name_front_running


I remember something about registrars doing this so the domain you wanted wouldn't get snatched by someone else while you were in the process of buying it. Made more sense back in the time when far more basic names were available and such collisions were probably common for high-value domain names. Now it's two people finding the same needle in a haystack.


I knew a guy named pachell that did this from 1997-2001 until registrars changed the rule. pachell wherever you are, i miss you


hahahaha, the webpage redirects to this discussion!! WELL PLAYED!!! (sorry for the caps.. but wow!)


YW ;)


>The advice I've seen is to not search for the domain first, just register it outright from the start. The domain registration business is run with all of the integrity and customer focus of TicketMaster.

What if you don't want to cough up $10-$20 on a whim? Would doing a whois (using the NIC's whois site) suffice?


Doing an OOB whois search is almost certainly fine. It's when you search on godaddy.com or whatnot that you get burned.


What is an "OOB whois search"?


Out of Band. Registrars cannot intercept whois searches?


I've had this same scenario happen before, and since then I just issue a `whois` from the command line to bypass any potential frontend interception. Not sure if that is 100% foolproof either though.


I have used whois cli since the early 2000s because I did not trust the registrars as domains I just searched ended up being registered. Never had that issue again since then.


Isn't the WHOIS query sent unencrypted? Can't it be intercepted in that case?
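
For context: WHOIS as specified in RFC 3912 is a plain-text protocol over TCP port 43 with no encryption, so a query can in principle be seen by anyone on the network path as well as by the server operator. Below is a minimal Python sketch of roughly what a command-line client does. The server name whois.verisign-grs.com is an assumption about where a .com query would be sent, and the function names are illustrative.

```python
import socket

def build_query(domain: str) -> bytes:
    """RFC 3912: the request is just the query text followed by CRLF."""
    return domain.encode("ascii") + b"\r\n"

def whois(domain: str, server: str = "whois.verisign-grs.com") -> str:
    """Send a plain-text WHOIS query over TCP port 43 and return the reply."""
    with socket.create_connection((server, 43), timeout=10) as sock:
        sock.sendall(build_query(domain))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

# Example (needs network access): print(whois("example.com"))
```

Nothing in that exchange is encrypted or authenticated, so it only avoids a registrar's web frontend; it does not hide the query from the WHOIS server itself.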


I use the whois command line tool when searching and have yet to get squatted. My experience is only about 500 domains over 20 years.


i use host -t NS cooldomain.example as a pre-filter. If a domain has NS records, it's definitely registered, although there are some registered domains without NS records (makes them pretty much nonfunctional, but if that's what the registrant wants, it's their business)
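
A rough Python sketch of that pre-filter, for anyone who wants to script it. It assumes the `host` utility is installed and that its NS output contains lines of the form "example.com name server ns1.example.com."; both function names are made up for the example.

```python
import subprocess

def has_ns_records(host_output: str) -> bool:
    """True if `host -t NS` output lists at least one name server line."""
    return any(" name server " in line for line in host_output.splitlines())

def prefilter(domain: str) -> bool:
    """Run `host -t NS <domain>`; True means the domain is definitely registered."""
    result = subprocess.run(
        ["host", "-t", "NS", domain],
        capture_output=True, text=True, check=False,
    )
    return has_ns_records(result.stdout)

# Example (needs network access): prefilter("cooldomain.example")
```

A False result only means "maybe available", since a registered domain can have no NS records, so it still needs a whois check afterwards.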


This is a neat trick, thanks. Adding to my toolbox


If I understand the domain name infrastructure correctly, that would imply that it's the registrar who is collaborating with the squatters. A command-line whois query would still have to query the servers of the registry for a particular domain (others on this thread speculate that it may be the domain registry that shares data with the squatters).


Curious about this as well. When you query the servers of the registry [gandi, namecheap, godaddy] for a particular domain example.com, doesn't it update one of the datetime fields for last queried?

Then again, the squatter would have to know what to search. Isn't it against rules for domain registrars to publish their recent query history [private or public]?

Any more light on this subject would be greatly appreciated!


On Namecheap they have a "save for later" button per domain in your cart, so you can just pile them up for later.

Never had even 1 of my "saved for later" being squatted, for several years now, so I really have to trust and commend namecheap in that regard.


They never had a real "no bullshit" policy. When I had my domains with them, I had been asked to verify my identity 34 times in 12 months. 34 separate fucking times. Because "ICANN says so" or some stupid shit (their words, not mine). It stopped the moment I moved to Google Domains, where they asked once and never again.

EDIT: And, to make things worse, each time I was threatened with the "confiscation" of my domain, and the round trip on the tickets was so high that each instance took 2-3 days to resolve. Frustrating as hell.


Since you're giving an anecdote, let me do the same. I have about 30-40 domains with Gandi, and have been using it for about ten years. I don't remember ever verifying my identity, but I guess I must have done it at least once. I have not been asked to verify anything for at least the last five years of using it.

Disclaimer: I don't work there or have any relationship, except I'm a happy customer


It was a matter of them refusing to keep my identity on file, and the threatening tone of each ticket. It grew tiresome quickly.


Sounds more like a bug than anything. Why would they want to not make it easier for you if they can?

Seems you missed my point though. Neither of our anecdotes really says anything about whether Gandi is good or bad.


His anecdote does say something though. It suggests that Gandi has a "if we have a bug, it's your problem not ours, sucks to be you" policy, which is exactly what has happened with this data loss issue as well.

Actions speak louder than words. Google famously has a "we don't have bugs, you just don't know how to use it, talk to the hand" policy for example. It is better to learn about the policies due to minor issues rather than major. OP learned of it early on and moved away with little trouble. Others did not learn until now and stayed, and now they are SOL.


I’m a gal, but exactly. I really wanted to support a company at the time who was supporting the community (they were a freenode sponsor), but I just hated dealing with the stress and potential that my domains could just disappear over night.


Same, I've been using Gandi for DNS and email for 9 years, without any issue or request on their side.


On the other hand, i've started using them more for DNS because the one time I forgot my password (typo in password manager I think) they made it very difficult for me to reset it, asked for pieces of ID, phone number registered in my name etc...

This was around the time of the stories of other registrars giving customers second and third chances to guess their PIN, or credit card or whatever mechanism they had, resulting in domain hijacking.


> asked for pieces of ID

This is annoying for everyone but the adversary who can just spend $50 to buy a set of fake ids with your info.

Especially since Gandi doesn’t store your old IDs, they aren’t even going to check if the info on the fakes matches the ones you provided previously.

> phone number registered in my name

I can’t imagine this working very well, just give them a number from a country where they can’t verify who owns the number.


>They never had a real "no bullshit" policy. When I had my domains with them, I had been asked to verify my identity 34 times in 12 months. 34 separate fucking times. Because "ICANN says so" or some stupid shit (their words, not mine).

Can you elaborate on what the "verification" entails? There is an ICANN requirement[1] to validate whois information, although I've only been asked to validate email (at another registrar, not Gandi).

[1] https://www.icann.org/resources/pages/approved-with-specs-20...


Wanted photos of passports, but they would always reject the first one for an unknown reason. The second one would always go through, but I do not understand why they wouldn’t just keep it on file. It was more than twice a month usually, and that was absurd.


Jeez. Like most experiences I suppose, it's hit and miss. They've promptly resolved every problem I've had and I've bought plenty of domains through them.

They're still my go-to provider.


Gandi was deploying the ID verification as a bullying tactic long before anything like that was mandated by ICANN. (Not that ID verification is even mandated now)


I've heard that this isn't actually the registrar's fault, it's the registry's fault. So your TLD is sharing the "is registered" query with other parties. That said, everything about the domain registration industry seems designed to appear sketchy as all hell, so who knows.


If you want an alternative to searching with a registrar you can always type:

  whois mysupergreatcoolappidea.com
into a terminal window and see if you get back a result


WHOIS and DNS requests are made to nameservers run by the registry (not registrars), so it's possible for the registry to front-run domains if they intend to.


This happened to me with GoDaddy and Namecheap before, which is why I switched to using Gandi for all my domain searches... Now I'm regretting it!

But as @Jasper_ said, this could be a problem with the domain name registry selling/leaking that info (AKA all their 'is_available' queries), and not the registrar.


At one point, I believe a GoDaddy VP was doing this as a personal side business. For many reasons, GoDaddy is the shadiest of them all.


You shouldn't regret it. GoDaddy was and still is way worse than this, there is no comparison.


I was always told that Namecheap does not engage in this practice.

It's a single data point, but I instructed a client to search for domains on Namecheap last year since they were undecided. I just didn't want them to use GoDaddy, and I warned them why. They settled on a domain but registered it months afterwards. It was still available.


I have the same story with Gandi. Searching for a domain (several times even, on a .com) and was still available months later.


It could be your local DNS resolution that is leaking to bad actors. It would be kind of stupid and self defeating for registrars to undercut their own customers. I would expect that some have done this in the past, but would be very surprised if it is done at Gandi with their knowledge... and undoubtedly French law would not smile kindly on such behavior.


Although the current issue with the irrecoverable data loss is terrible, I thought (in this case, at least) that they were surprisingly honest. They straight up said the data is gone (a VERY hard thing to publicly admit), and informed people they need to restore from their own backups. That seems pretty No Bullshit to me, no?


I have had a similar experience with other registrars.

Edit: Sad to hear of the data loss and for anyone affected. Trusting cloud providers doesn't always work out either.


Nice to know I'm not the only one! :D


Reports of reputable registrars front-running are persistent, but unfounded. Anytime I’ve looked into it, I’ve never seen any evidence for it.

If proven it would be a major blow to their business, so why would they try to snatch pennies from in front of a steam roller?

So I call b.s. on any reports of “the registrar noticed me searching for a domain and registered it”.


Um, NetSol settled a $1MM class action suit over exactly this about a decade ago.

It absolutely has happened and quite possibly still does.


Would `whois` be any better to prevent leakage of domains you intend to register?


Oof, this Twitter thread looks particularly bad, especially the response from the official Gandi account.

https://twitter.com/andreaganduglia/status/12151991477012316...

While I appreciate that there are real people behind these companies that are probably having a really rough time right now, the criticism that Gandi are getting as a company is justified - and if Gandi are truly a "no bullshit" company they need to put something out to their customers asap.


Screenshotted in case (when) they delete it https://i.imgur.com/s3R1VVc.png

Using memes after permanently losing customer data is extremely disrespectful.


"Julie Pelloille @juliepelloille Replying to @gandi_net @andreaganduglia and 4 others

This post was disrespectful. It's not an excuse, but this is a stressful situation and the thread was getting heated. Either way, I truly regret posting it and it was my decision alone to do so. Please don't take this as representative of the high standard Gandi sets"

"That said, for the sake of transparency, we won't be deleting the tweet -- Julie"


I like that. Honest mistake. Simple, truthful apology. Transparency for the record. Julie's one of the good guys.

Whatever the context / stakes (doesn't change anything in this case), this is how people should behave in life (not just online).


So what. Just because you send a retraction doesn't justify it or make the apology any better. That sounds like they allow some people too much freedom, as if they ran this business in their parents' garage.


Why are they going out of their way to be disrespectful to their customers during a crisis? This is bizarre.


Probably because “they” == some individual social media rep


There's at least two of them going out of their way to be snide and snarky there - Julie and Stephan.


Stephan is the damn CEO of Gandi! Unbelievable.


This is what happens when a CEO thinks everyone else is just under him/her, including their own clients.

They start thinking they are beyond the normal people and that everything is a joke.


That's a wild assumption to make from a few tweets. It's just a stressful situation for both sides that leads to rough comments.


Eh, he sounds more frustrated and snippy to me. Still rude and unprofessional behaviour, of course.


...not losing data is the ONE thing I expect companies to get right. I could handle downtime, circular customer support, high prices, horrible UX, and all that. But losing or corrupting data? Heck no.

A company that loses customer data in production is the exact type I would expect to mock their customers using memes.


I don't blame the communications rep. From her perspective, she's probably been told what the CEO believes - Gandi lost data, but they never promised backups so it's not a big deal. They responded to someone that is being extremely critical. The rep (Julie) did the right thing and apologised after others criticised her tweet, and also kept the response up to illustrate the mistake. While a meme is bad taste, I can somewhat understand the reaction.

IMO, the blame lies solely with the CEO, because he has yet to retract his statement regarding snapshots not being backups (despite their site selling them as backups to the end-user), and because he has not accepted that, for anyone controlling business data, creating backups AND regularly testing them via restores is 100% essential. Culture trickles down, and if the CEO only accepts blame and not the reason for the blame then it's a sign that they won't learn from the problem - and that's the biggest red flag you will ever see in ANY business.

I can only see one way back for them that won't taint their reputation completely. They need to:

* Post a full post-mortem of what happened, how it happened, how they fixed it, and what they're going to do to ensure it never happens again.

* Issue a full apology for the problem. Accept full blame, and accept (including the CEO on Twitter) that Gandi failed to follow accepted industry standards.

* Sit down with the engineers that work at Gandi and hear their grievances. While I doubt that their engineers knew this would happen, I'd be willing to bet that there is at least one person there that had raised the lack of off-site backups and no recovery mechanism. That person needs a promotion, and whatever resources needed to fix Gandi.

* Issue a full refund to those that lost data - not a small discount, as already reported. A discount is a kick in the teeth, whereas a full refund is the start of a real apology for failing the customer. If you go for a meal at a restaurant and find broken glass in your food, the first thing the server will do is give you a full refund, no questions asked, regardless of how expensive your party's order was. Gandi need to take the hit, and live to fight another day.
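
On the point about testing backups via restores, here is a minimal Python sketch of what a restore check can look like. It is only an illustration under simple assumptions: a single file, a local directory standing in for what would really be off-site storage, and made-up function names.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    """Hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def backup_and_verify(source: Path, backup_dir: Path) -> bool:
    """Copy source into backup_dir, restore it to a scratch location,
    and confirm the restored bytes match the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / source.name
    shutil.copy2(source, backup)
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / source.name
        shutil.copy2(backup, restored)  # the "restore" step
        return sha256(restored) == sha256(source)

# Demo with a throwaway file standing in for real data.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "data.txt"
    original.write_text("important business data")
    ok = backup_and_verify(original, Path(tmp) / "backups")
    print("restore verified:", ok)
```

The point is the second half of the function: a backup only counts once something has actually been restored from it and compared against the source.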


Yeah, that's not a great way to win back the trust of your customers.

I'm going to look at moving my domain registrations away from them.


I am moving my business away from them. Even if I didn't care about the backup situation, the PR response is stupidly immature and not worthy of reward.


Your mouse pointer.. it looks familiar! https://i.imgur.com/XGK3tFT.png



KDE users unite (:


omg wow! when will people start being held accountable for the BS they put on twitter?!!


Meh. It's Twitter. They lost data but arguing on Twitter circularly forever with these people solves nothing.


It has long been the platform of choice for those seeking a response or to resolve issues. The tone of those ill mannered tweets reveals incredibly bad optics and serves as a reminder for those, who haven't encountered any issues thus far, that we could be treated in a similarly shoddy manner.


Yeah, but then he should shut up.


Ironic considering her twitter profile's description says "Responsable #communication #socialmedia #digital #innovation ⌨️#webmarketing #inboundmarketing #qvt #GOT et #TWD fan"[1]

[1]: https://twitter.com/juliepelloille


Why are they @&$ing around on Twitter when they should be fixing the damn problem. Unbelievable.


It's almost as if the person managing the official media accounts is different than the person working on fixing the problem. Almost.


Well, that is explainable, unlike not making backups; they have what's called a PR or media team that gets updates and details from developers while they work on this.

Additionally, data recovery is a lot of waiting in most cases, there isn't much to do as your business burns down around you


This is god damn unbelievable

"Andrea, sorry about that and the incident. If we led you to believe that you had nothing to do on your side when warned multiple times to make your back ups, then we'll have to make it clearer, and stop assuming that it's an industry wide knowledge."


Big words for a company that's in trouble for not backing up data themselves.


Most web hosts have some courtesy backups, but it does sound like the Twitter user they're responding to fundamentally doesn't understand that snapshots aren't backups, and the screenshotted page explicitly states that the snapshots are for you to back up. Which he presumably did not do.

The idea that someone would entrust their sole copy(s) of critical business data to a service provider is insane to me. Always keep your own backups.


> The idea that someone would entrust their sole copy(s) of critical business data to a service provider is insane to me. Always keep your own backups.

You can consider it insane, but they still sold snapshots as backups. Insane or not, it doesn't change the fact that this is what they wrongfully sold.

Can you point me to where that screenshot shows what you say it does? The user goes further and specifies that you CAN'T download these snapshots.

Companies should be called out when they lie about what they sell. I hope you understand why that's important.


1. He says he has made regular backups, but now needs to restore all VPSs

2. The website says "Snapshots allow you to create a backup copy."

3. He says "No they do not allow snapshots download."


It sounds like snapshots are directly reachable from within FTP in a directory. Snapshots are a clean copy of the file system you can back up, but they are not backups.

He also states he has his backups, so he's mostly just whining because he's annoyed he has to reupload stuff. Which I get, but again, he should understand what snapshots are and aren't.


You're making the same mistake as the other guy in the Twitter thread: you're mixing up Simple Hosting snapshots and Cloud hosting volume snapshots.

Here's the right one, which states that they are backups: https://docs.gandi.net/en/cloud/volume_management/volume_sna...

Here's the one that you quote (which isn't the same service): https://docs.gandi.net/en/simple_hosting/common_operations/s...

Be careful next time about judging with so little knowledge of the issue.


Interestingly, this page has been edited to add a warning: "A snapshot is a frozen version of your volume that allows you to restore it to a previous state. It should not be regarded as a backup of your volume. If your volume is deleted, all related snapshots will be deleted too."

Here's the page as of earlier today. https://web.archive.org/web/20200109194005/https://docs.gand...
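
To make the distinction in that warning concrete, here is a toy Python model. It is a sketch of the behaviour the quoted text describes (snapshots are tied to the volume's lifetime), not of how Gandi actually implements anything; the class and variable names are invented for the example.

```python
class Volume:
    """Toy volume whose snapshots live and die with it."""

    def __init__(self, data):
        self.data = data
        self.snapshots = []

    def snapshot(self) -> None:
        """Freeze the current state so it can be rolled back to later."""
        self.snapshots.append(self.data)

    def restore(self, index: int) -> None:
        """Roll the volume back to an earlier snapshot."""
        self.data = self.snapshots[index]

    def delete(self) -> None:
        """Deleting the volume removes every snapshot tied to it."""
        self.data = None
        self.snapshots.clear()

volume = Volume("v1")
volume.snapshot()
external_backup = volume.data  # independent copy kept somewhere else

volume.data = "corrupted"
volume.restore(0)              # a snapshot handles a rollback fine
rolled_back = volume.data

volume.delete()                # volume lost: its snapshots go with it
print(rolled_back, volume.snapshots, external_backup)
```

The snapshot covers the "oops, roll back" case, but only the independent copy survives losing the volume itself.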


As @dwild stated, that's the other type of snapshots used for web hosting (which I assume are copy-based backups due to how little storage sites usually use). These snapshots are never directly reachable by the customer; at best, they can restore a volume back to the snapshot's state or create a new volume at that state and attach it to a VM.

> He also states he has his backups, so he's mostly just whining because he's annoyed he has to reupload stuff.

Or they're annoyed that they paid for a service, at the very least billed as backup, only to be told "welp, it's gone".


That's such an obnoxiously passive-aggressive response from the CEO. Bit of a red flag for the company culture.


After another support person made a joke in response to his very serious post.

This is one of the worst responses I’ve ever seen from a company, and I’m not being hyperbolic.


Seeing the thread I couldn't believe they are being serious. Feels like they are playing a tasteless prank. Such crass and careless attitude is downright repelling.


Sorry, not sorry.


The number of people who have control of social media accounts for companies who do not understand how to relate to people / basic customer service / can predict how their post will be received is shocking.

I worked at a company of 5K+ people and one of the folks in control of the twitter account(s) would come to me with questions.

Now I applauded them for coming to me for technical questions before posting, that was great, but they absolutely did not have the self-awareness to understand what to say, when to say it, etc.

But hey they were tied to a high ranking person (who also had no clue) so they had access to the account.

In my early days I worked PC customer support... I feel like that comes in handy all the time.


> can predict how their post will be received

That's an AI-hard problem and remains unsolved.


if positiveReception < 100 postMemes = 0


Wow. I've never used Gandi but I have seen it recommended before as a low-cost option. I will actively encourage people to avoid it from now on. That's scary.


Gandi has never been a low cost option, they've always been on the high to extreme higher end of things for individual cost...

Especially for random ccTLDs, they're often significantly more expensive than the alternatives.

Random selections for domains: .ru is $1-3 most anywhere else, Gandi is $18.


High to extreme higher end would be something like MarkMonitor.

Gandi might be better than some of the other low touch, self service domain providers but it's definitely still in the same ballpark. $18/year still means they're losing money if they ever need to pick up the phone for you. It's not a price point that works with "higher end".


> High to extreme higher end would be something like MarkMonitor.

Being a registrar is only a side effect of their business though, not really comparable.


They used to be very good if you wanted a non-scammy registrar with a huge selection of TLDs and ccTLDs. However, in recent years, success seems to have gone to their head and the service is nothing like it used to be (plus their latest control panel UX is an abomination).

Feels like the CEO has made his money, forgotten the company's roots in the process and is happy for Gandi to be just another generic, overpriced registrar running on auto-pilot.


Guess they cut cost by not doing backups


I don't get the criticism.

If they lost all the data, then obviously the only option for customers is to either use their own backups if they have them or accept that the data is permanently lost.

One can criticize their lack of additional redundancy, but I don't see what's wrong with the response.


Sure, if the data is lost there isn't much that can be done to go back and fix it. However, the company response appears very dismissive/flippant which sends a bad message.

The tone any company hosting customer data should take in the event of data loss is along the lines of 'regretfully... we screwed up... unfortunately... steps we are taking to ensure this doesn't happen again...' i.e. the company should either be humble and apologetic or they should expect to lose a large chunk of their customers after something like this. This isn't merely to say the right thing, it is to demonstrate that they acknowledge this was their issue and something they need to fix going forward rather than a 'sucks to be you' customer issue. This is basic customer relations / crisis management stuff.


So you're saying you want the bullshit? Look Gandi doesn't have it. What more do you want from them? They lost it, they're not gonna bullshit about it.


The customers are being stupid and rude: assigning blame, asking redundant questions, making threats. Nothing in any of the twitter threads I've seen has any potential to solve any problems, they're yelling thinly veiled abuse at support.

The industry standard is sucking up to them and groveling, and it's led to customers being very badly behaved.

The trouble is no one has a good working alternative to the industry standard.

Gandi certainly doesn't, they're not responding in a well thought out manner, they're losing their cool and getting angry with their customers. That's a quick way to go out of business.


The alternative is simple: always behave professionally, and if they are abusive, point at the ToS that forbids that and fire them as customers.

Here I'd just avoid engaging one-on-one at all, just broadcast the situation status.


i mean, this is customer support 101. always be respectful even when your customer is angry -- they are probably angry for a reason.

answering with memes is the absolute opposite of this, especially when your customer has every reason to be angry.


It's justified in being called simple if companies are actually doing this.

I did find HubSpot[1]:

> We may limit or deny your access to support if we determine, in our reasonable discretion, that you are acting, or have acted, in a way that results or has resulted in misuse of support or abuse of HubSpot representatives.

I'm still skeptical because actually enforcing that clause seems like it could lead to an expensive lawsuit. The angriest customers are naturally the most litigious ones, too.

[1]: https://legal.hubspot.com/terms-of-service


I’m pretty repelled by their tone in that thread. Sweeping it under the rug (could’ve happened to anyone / shit happens) instead of just owning up to it. Throwing in that completely inappropriate meme. Contradicting their marketing material when it’s convenient (are snapshots backups?) and general passive aggressiveness.


Julie Pelloille, responsible for comms, appears to be going a bit too far with this.


Wrong or not let’s be careful about using full names as it’s how these things get whipped up into pile-ons.


The Cersei thing is wrong on so many levels.


Just to play devil's advocate: This is in no way different to how Azure, AWS, and GCP operate. They don't have backups either. They too rely on n-way replication, a bit like a distributed RAID.

All cloud providers make it absolutely clear, in black & white, that protection of your data is your responsibility, not theirs.

What I find hilarious is that most cloud providers only provide built-in backup functionality for a tiny subset of their services.

Ask Microsoft if they have a "backup" button for Azure DNS Zones. Or Azure load balancers. Or anything else that isn't a VM disk, App Service, SQL Database, or a Secrets Vault.

I mean, look at this insanity: https://docs.microsoft.com/en-us/azure/backup/backup-azure-f...

"Backup for Azure file shares is in Preview."

After 10 years of operation, this trillion-dollar company has only a use-at-your-own-risk beta for data protection!

Don't be too hasty to point fingers at Gandi and laugh about how they're unprofessional. Whatever you're using is essentially the same.

Ask yourself this: Could your organisation recover if some malicious admin simply deleted all Azure Resource Manager resources in one go using PowerShell?


Everything you say here is true, but at the same time it's just a fact that Gandi lost a lot of customers' data, and AWS, GCP, and Azure have never (as far as I know) lost a significant amount of it at once. You can talk about theoretical responsibility for data, and it's true, you are responsible for having backups of your data, no matter how many "9s" the service has, but the basic fact is that some services have been consistently good at not losing customer data, and others haven't. Even though I'm going to back up my data no matter where it is, I'd still rather use the service that's got a better track record with it.

I haven't ever even lost a file on Google Drive, which as far as I know provides no reliability guarantees at all.


Back in the early days GMail lost customer data due to storage corruption. It has happened.

The rarity is immaterial, the responsibility for data protection lies with you, not them.


> The rarity is immaterial

Of course it's material. If a provider has a 0.001% chance of losing some of my data in a year, I'm an idiot for not having backups. If a provider has a 10% chance of losing some of my data in a year, I'm an idiot for not having backups and for using that provider.

GMail is (usually) not an enterprise product and not a paid service, and provides no reliability guarantees. And yet it seems to be pretty damn good in practice.


Well to be fair Gmail was still in Beta ;)


I believe since then they had similar events but were always able to restore from tape. So GMail definitely has proper backups, even for free accounts (maybe not tape anymore, not sure).


That's kind of like saying there's no difference in safety between an airliner and the winged contraption that my idiot brother built in his garage.

After all, they both have wings and will both kill you if they fall out of the sky, and I don't see Airbus or Boeing guaranteeing that their planes will never crash, so they must be essentially the same.


>Boeing

That just confirms the parent comment


> Ask yourself this: Could your organisation recover if some malicious admin simply deleted all Azure Resource Manager resources in one go using PowerShell?

We have streaming replicas for hot data AND regular snapshots shipped to offsite cold storage, because RAID is not a backup. If we experienced an equivalent event, we'd be fine.


The equivalent scenario to recovering from a bulk erasure of all Azure RM resources is this:

How long will it take you to recover if someone deleted your switch configs, reset the SAN to factory defaults, wiped your firewall rules, deleted your Active Directory accounts (or equivalent), and then ran a secure erase on every physical server just to raze everything to the ground and salt the earth?

I mean in wall-clock time, how long would it take your team to even figure out what is going on? Where would you start?

Would you recover the switch first, or the server that you use to authenticate to it using RADIUS or LDAP?

How will you securely connect to servers if your CRL and OCSP servers are down?

How will you get access to your passwords if your file server where the key blob is stored is saying "Insert boot disk"?

People think that disaster recovery is for "I deleted a folder".

Disaster recovery is for disasters.

Removing all Azure resources wipes everything. Your vNets... Poof! Your public IPs... Poof! Your internet-facing DNS zone... Poof! Your authentication credentials... Poof! Gone, gone, gone.

How do you plan to restore dynamic IP addresses to their original values?

How do you plan to restore DNS Zones that get assigned to 1 of 10 randomly selected server pools and hence have a 90% chance of requiring a change to the NS server glue records on restore?

Do you even know which order things would have to be restored in to prevent failures during a restore?

Could you possibly work out what is missing if you log on to your cloud portal and see the "Welcome to Azure, to get started click here" splash page?

Get it?
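
The "which order" question is at least something you can write down ahead of time. A minimal sketch, assuming a hand-maintained dependency map (the resource names below are hypothetical and not taken from any real tenant); Python's standard `graphlib` then yields an order in which every resource comes after its prerequisites:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each resource lists what must exist before it.
DEPENDS_ON = {
    "virtual_network": set(),
    "dns_zone": set(),
    "identity_provider": {"virtual_network"},
    "key_vault": {"identity_provider"},
    "database": {"virtual_network", "key_vault"},
    "app_servers": {"database", "dns_zone"},
}


def restore_order(depends_on):
    """Return a restore order in which prerequisites always come first.

    TopologicalSorter raises graphlib.CycleError if the map contains a cycle,
    which is itself worth knowing before the disaster rather than during it.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = restore_order(DEPENDS_ON)
```

This only captures ordering. It does nothing for the harder problems above, such as dynamic IPs or NS glue records changing on restore.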


> The equivalent scenario to recovering from a bulk erasure of all Azure RM resources is this

It just occurred to me how much easier it is to wipe everything in the cloud age than the on-prem age. Doing all the things you said for on-prem takes some serious effort. Some, like factory resets, may be impossible without individual physical access. You would probably be discovered and stopped before you can inflict much damage. In the cloud age however, it takes orders of magnitude less time and effort to inflict the same damage.

It is kinda like how much easier it is to steal data now. Before the digital age, stealing as much data as the Equifax hack would have required moving truckloads of paper without being discovered. It was simply impossible to pull it off in reality. In the digital age, however, we have accepted massive data leaks as not only possible, but unavoidable.


> It just occurred to me how much easier it is to wipe everything in the cloud age than the on-prem age.

It's easier for physical facility damage to a single facility (whether hostile action or natural disaster) to wipe everything out in an on-prem setup than in the cloud, where multi-DC redundancy is a click away. But, sure, it's easier to wipe out data without physically destroying equipment in the cloud.


I think you're moving the goalposts. Gandi didn't lose all their servers and all the networking hardware and all the storage. They lost what sounds like a single replicated volume. If, y'know, all of their datacenters burned down at once, or an attacker got access and deleted their PaaS account, I think we'd all be a lot more sympathetic


My point is simply that the larger commercial cloud vendors aren't magically immune to bulk data loss, particularly in the face of internal threats.

Consider the current tensions between Iran and the US. If Iran decides to retaliate with cyberattacks, major cloud vendors could suddenly have multiple regions go up in smoke concurrently.

They'll just shrug their shoulders and say that it's the customers' responsibility to protect their own data, and that they're just offering platforms for rent.


"We have 'Data gone? Sucks to be you!' as translated by our VC's lawyer buried in our T&Cs" -- most "disruptive startups", probably...


If you have a proper disaster recovery plan then yes. All of the configuration of the entire system should be documented at least, if not generated by version controlled code. Then the only thing that needs to be backed up is actually data storage on volumes with snapshots or block storage services.


Maybe not even malicious, maybe they just put in the wrong subscription ID :(


Yup.

This thought occurred to me when I was testing a bulk resource creation script.

My workflow in my lab tenant was:

1) Bulk create hundreds of resources

2) Bulk wipe everything

3) Go to step #1

Turned out, I had some objects with globally unique names that were now conflicting in the production tenant, so I had to wipe my lab.

I had already logged on to the production tenant, and I was so "trigger happy" that I very nearly ran my bulk-erase script against the wrong subscription.

It was a terrifying moment of clarity.
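
One cheap guard against that kind of slip, sketched under assumptions: the subscription IDs are made up, and nothing here calls a real Azure API. The idea is only that a bulk-erase script refuses to run unless the active subscription is on an explicit allowlist of disposable ones and the operator has retyped its ID.

```python
class WrongTargetError(RuntimeError):
    """Raised when a destructive action is aimed at a target that is not disposable."""


def confirm_wipe_target(active_subscription: str,
                        disposable: frozenset,
                        typed_confirmation: str) -> None:
    """Raise unless the active subscription is allowlisted and was retyped exactly."""
    if active_subscription not in disposable:
        raise WrongTargetError(
            f"{active_subscription!r} is not on the disposable allowlist"
        )
    if typed_confirmation != active_subscription:
        raise WrongTargetError("typed confirmation does not match the active subscription")


# Hypothetical IDs: only the lab subscription may ever be bulk-wiped.
DISPOSABLE = frozenset({"lab-001"})
confirm_wipe_target("lab-001", DISPOSABLE, typed_confirmation="lab-001")  # passes silently
```

Call it as the first line of the wipe script, before anything is enumerated or deleted.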


Dear customer,

This mail is a follow-up to the previous email we sent (on January 8th, 2020) on this topic. As a reminder, yesterday, we experienced an incident on a storage unit at our LU-BI1 datacenter, located in Luxembourg.

Despite the replication systems in place, and the combined efforts of our technical teams throughout the night, we were unable to recover the data that was lost on the impacted storage unit.

We sincerely apologize for the inconvenience that this situation has caused. This type of incident is extremely rare in the web hosting industry.

In the event that you have a backup of your data, we suggest that you use it to recreate your server at a different datacenter.

To help you in this, we have provided you with a promo code that will give you one free month for an instance, so that you can create a new Simple Hosting instance in a different datacenter:

    XXX


Wow, for a company that boasts "no bullshit", only offering a month after destroying data and backups seems a little tone deaf

Edit: in fairness, I'm not sure how exactly you would quantify such a loss anyway...


It sounds like they didn’t have any backups at all but rather relied on an active-active replication link to a secondary storage.

Edit: who knows it may be related to the HPE issue.

https://www.bleepingcomputer.com/news/hardware/hp-warns-that...


In other words, RAID is not backup.


What baffles me is that there seems to be no way for either the customer or a data-recovery company to flash a new firmware onto the drive after it has failed. Someone there wanted to spare the few millicents of copper trace for a JTAG port?!


Probably to prevent supply chain firmware changes for hacking, espionage, etc.


Hmm... I wonder what the "incident" was. If it involved something akin to an "rm -rf," then of course their replication link didn't protect them.


Perhaps they were depending on snapshotting and were not prepared for some kind of hardware failure taking out the entire storage system.


Reputable hosting providers typically don't try to quantify such a loss, but rather outright offer a credit/compensation that is very obviously generous (say, a year or even two of free service).

Especially when a small set of your customer base is affected, it won't cost you that much, and "overcompensating" like that means that virtually no one is going to criticize you for quantifying it wrong; instead, the public narrative will be centered around "well, shit happens, they did their best and generously compensated".


I could understand the incident (I would _at least_ start questioning myself about the quality of the service I'm paying for), but IMHO this is not something that can be addressed with a casual e-mail that contains a few lines of excuses and a "promo code" like it's everyday business. That's astonishing.

The only thing worse than a bad incident is bad management of its aftermath.


> This type of incident is extremely rare in the web hosting industry.

Why would they include that sentence? Are they trying to imply it is rare for them because it is rare for the industry? Are they saying they are not as good as the industry, so customers should move to other providers? Or are they trying to show they apply the same inattention to their customer communication as they apply to their data backup/recovery practices?

This kind of data loss should simply never happen. It’s one thing to say “it will take us up to 30 days to restore your data because our fast recovery options aren’t working and we have to bring up cold archives”, it’s entirely another to say “your data is gone, tough”.


I'm not sure why you've been downvoted for this. I thought the same.

I read it as: "This type of incident is extremely rare in the web hosting industry, because apparently the overwhelming majority of our competitors aren't capable of fucking up as badly as we just did."

Doesn't inspire confidence at all, IMO.


> Why would they include that sentence?

They're a French company; it may be a non-native speaker not catching the implication.

It's also possibly an editing error, e.g. they started writing something like, "these types of incidents are extremely rare and when they happen etc" and most of it was dropped without considering how that changed the implication.


I think they're referring to the "incident" that they experienced (on the storage unit in the datacenter), not the situation as a whole. The implication is meant to be that they prepared for many things, but not something as unlikely as this.


I think it was meant to say "nobody is infallible", these events are extremely rare, but they /will/ occur, even if you're a customer of the best and biggest players.


If you're not paying for backups... what archive?


They say you can back up by using their snapshotting tool, but they lost those snapshots too.


The bright side is that now if anyone asks me why we would ever need the 3-2-1 backup protocol, I have a beautifully worked example.
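
For anyone who hasn't met it: 3-2-1 means at least 3 copies of the data, on at least 2 different kinds of media, with at least 1 copy off-site. A minimal sketch of that check; the copy names, media labels and site labels are invented for illustration and are not a description of Gandi's actual setup:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "provider-disk", "object-storage", "tape"
    location: str  # a site label such as a datacenter name


def satisfies_3_2_1(copies, primary_location):
    """True if there are >= 3 copies, on >= 2 media, with >= 1 copy off-site."""
    media = {c.medium for c in copies}
    offsite = [c for c in copies if c.location != primary_location]
    return len(copies) >= 3 and len(media) >= 2 and len(offsite) >= 1


# A live volume plus a snapshot on the same storage in the same site: fails.
same_box = [
    Copy("live volume", "provider-disk", "dc-1"),
    Copy("snapshot", "provider-disk", "dc-1"),
]
# Adding an independent dump in another site on different media: passes.
with_offsite = same_box + [Copy("nightly dump", "object-storage", "dc-2")]
```

The live data counts as one of the three copies, which is why a volume plus its snapshot is only two.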


oh damn


A promo code in exchange for your data loss. What a bargain!


“Please keep trusting us to host your data”


You really shouldn't trust anyone hosting your data. Always have backups!


Often times the backup provider is the hosting provider, whom you have to trust. (This extends all the way from big clouds like AWS and GCE to small providers like Linode and DO). Having an external backup can be unreasonably expensive due to ridiculous egress costs.


If your business can't afford external backups then you don't have a viable business in the first place. And of course egress costs have to be considered when choosing a hosting provider.


Not everything that’s hosted in the cloud is a business. In fact, the Internet wasn’t even created for the purpose of profit-generating business.


The Internet was created by the military, so yes it was.


You can still back up to the same provider's different data center. Two data centers failing simultaneously is very unlikely.


Not always an option. For instance, I use Linode’s backup service and it can only back up to the same data center (although it is said to live on a separate system).


You can, and should, back up your irreplaceable data elsewhere using a custom solution. Unless it's some service that doesn't allow you to export the data at all, it may be inconvenient, but it is an option.


Coming from a Linode employee, I can confirm this is true. Linode's backups live in the same data center as the server, but the systems are separated so that they don't directly affect one another.


Do they have separate power supplies? Have steps been taken to ensure that fire can’t spread from one room to the next? What would happen if there was an explosion?


In all seriousness, these are good points. I'm not a data center expert by any means, but here's what I know: The data center hardware has failsafes present by design, but they aren't disaster-proof being that they're in the same building.

To answer your questions: Yes, the backup storage box is in a separate chassis than the host machine that the Linode lives on; they have separate power supplies. The DCs themselves also have some sort of fire suppression. I don't know what would happen if there was an explosion.


The same data center is a single failure zone, if only because of:

1. Power delivery systems that bring power to the buildings - see issues at 111 8th Ave failures during Sandy.

2. Power systems inside the data center. Blast radius there is rather nasty. See the infamous Internap blow up around 2015(?).

3. Fire suppression/firefighting protocols.


They could mean using regular data transfer (i.e. using something like rsync instead of the provider's backup service). Maybe egress costs among servers from the same provider are reduced or nullified.

From[1]:

> Traffic over the private network does not count against your monthly quota.

I wonder how private addresses are setup by Linode.

[1] https://www.linode.com/docs/platform/billing-and-support/net...


Each data center has an internal private network with a pool of private IPs available for assignment. If a private IP is assigned to a server, it then has access to the private network.

https://www.linode.com/docs/platform/manager/remote-access/#...


This becomes very difficult as your data grows. If you live in AWS world, imagine periodic snapshotting from EBS, S3, RDS (and other data stores), EFS, etc. For most people a different DC of the same cloud provider should be enough. If you have to put this into a different cloud provider it is a big cost drain and difficult to manage, let alone if you want to have your own physical backups.


AWS has tools around this (lifecycle manager) that you can easily leverage for simple site backups. Or you can roll your own, honestly it is not that hard to take rolling snapshots.

Obviously hosting providers do not make it easy to extract your data because that's their vendor lock.
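
To show how little logic "rolling your own" needs, here is a minimal sketch of the retention half: given snapshot timestamps, decide which ones to delete so that only the newest `keep` remain. It deliberately calls no cloud API. Taking the snapshot and deleting the returned ones is left to whatever tooling you already use, and the function name is my own.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshot_times, keep):
    """Return the timestamps to delete (oldest first), keeping the `keep` newest."""
    if keep < 1:
        raise ValueError("keep must be at least 1, or you end up with no snapshots")
    newest_first = sorted(snapshot_times, reverse=True)
    return sorted(newest_first[keep:])


# Ten daily snapshots, keep a rolling week: the three oldest get pruned.
base = datetime(2020, 1, 1)
daily = [base + timedelta(days=i) for i in range(10)]
doomed = snapshots_to_delete(daily, keep=7)
```

Snapshots kept inside the same provider are still not an external backup, so this covers convenience, not the failure discussed in this thread.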


Also, always make sure you're testing your backups by restoring to a non-production space, and ensuring that customer services are still available.

Gandi has never explicitly said they never had their own backups, just that they don't offer backups as a service. It's entirely possible that they did have backups, but couldn't recover/restore them.


"...marginally more than rolling your own or another cloud provider."

And to "trust marginally more" simply means:

    gandi_cost_per_month + P(gandi_fails_per_month) * cost_recovery
    < 
    alt_cost_per_month + P(alt_fails_per_month) * cost_recovery
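
Plugging numbers into that inequality makes the trade-off concrete. A minimal sketch; every figure below is invented for illustration and says nothing about any real provider's price or failure rate:

```python
def expected_monthly_cost(price, p_fail, cost_recovery):
    """Monthly price plus the probability-weighted cost of one recovery."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability between 0 and 1")
    return price + p_fail * cost_recovery


# Hypothetical hosts: one cheap but flaky, one pricier but more reliable.
cheap_host = expected_monthly_cost(price=10.0, p_fail=0.01, cost_recovery=5000.0)
pricier_host = expected_monthly_cost(price=25.0, p_fail=0.001, cost_recovery=5000.0)
cheaper_overall = "pricier_host" if pricier_host < cheap_host else "cheap_host"
```

With these made-up inputs the cheap host costs about 60 per month in expectation and the pricier one about 30, so the sticker price alone points the wrong way once recovery is expensive.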


> This type of incident is extremely rare in the web hosting industry.

I read this as "so maybe you should consider one of the other web hosting companies that doesn't have problems like this."


Interesting. The public status page says they’re still waiting for the recovery process to complete.


Is this a response from the company or are you putting it forth as an example response for how to handle this incident better? It’s unclear from your post.


Looks like their backups only consisted of in-region backups on systems that were homogeneous. Common pitfall. While technically a 3-node distributed system may provide disaster recovery from one node failing, in practice, an accidental rm -rf from an ansible script targeting all three machines, or a bug in the software that's doing the replication, will leave you without a backup plan.

If you're in such a situation, the easiest fix is to do filesystem-level backups with something like zfs and ship the backups to a third-party system that only has write/append-only semantics (better yet, use a write-once-read-many (WORM) disk to really guarantee it). While there will still be _some_ data loss, it'll let you recover up to the last snapshot.

If you don't have zfs, a database backup that runs the db dump script and scp/sftps it to a server running as a cronjob can also be an immediate remedy while you get your shit together (and by that I mean buy yourself a product with an immaculate reputation like aurora or cockroachdb to manage the db for you)

Harder but better would be to tee the log of the changestream (all distributed systems have such a log) to a third-party system. This is ideal because if it's done synchronously it'll let you recover since the last committed transaction.

And of course, test your backups, because backups are subject to code rot as well.
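
A restore test doesn't have to be elaborate. A minimal sketch, assuming you can restore into a scratch directory: hash every file in the source tree and in the restored tree, then report anything missing or different. The function names are my own, and it reads whole files into memory, so treat it as an illustration rather than something to point at a multi-terabyte volume.

```python
import hashlib
from pathlib import Path


def manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def restore_problems(source: Path, restored: Path) -> list:
    """List files that are missing from, or differ in, the restored tree."""
    want, got = manifest(source), manifest(restored)
    problems = [f"missing: {name}" for name in want if name not in got]
    problems += [
        f"differs: {name}"
        for name in want
        if name in got and want[name] != got[name]
    ]
    return problems
```

An empty list means the restore matches the source byte for byte; run it on a schedule, not just once.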


What backup strategy are you implying for the case of cockroachdb? Streaming the changefeed (including timestamps) to an external append-only system while slowly and incrementally iterating through all tables using AS OF SYSTEM TIME, to reduce impact on active transactions and to know how late each shard of a "full backup" can be inserted into the "augmented" changefeed you'd generate by interleaving these shards into the changefeed? For replay you'd use the stream from the oldest shard up to the newest timestamp you know you have complete transaction changesets for, i.e. select min(a) from (select max(timestamp_resolved) as a from changefeeds group by table). (The resolved timestamp can be periodically emitted to confirm that no further records in the same feed/table could have a transaction timestamp earlier than it, inducing a partial ordering.)

You could replay the (combined, sorted, augmented) changefeed in order, or shard it on the table's primary key to ensure per-key monotonicity when applying the streams in parallel threads/transactions/nodes.
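
To make the replay cut-off concrete, here is a rough sketch over plain Python data. It is not CockroachDB's API: the `Event` shape, the integer timestamps and the function names are all assumptions for illustration. The horizon is the min over tables of each table's max resolved timestamp, and sharding on (table, key) keeps every key's events on one worker in timestamp order.

```python
import zlib
from typing import NamedTuple


class Event(NamedTuple):
    timestamp: int  # transaction timestamp (an integer here for simplicity)
    table: str
    key: str        # primary key of the changed row
    payload: str


def replay_horizon(resolved_by_table):
    """min over tables of the max resolved timestamp seen for each table."""
    if not resolved_by_table or any(not seen for seen in resolved_by_table.values()):
        raise ValueError("need at least one resolved timestamp per table")
    return min(max(seen) for seen in resolved_by_table.values())


def replayable(events, horizon):
    """Events at or before the horizon, in timestamp order."""
    return sorted(
        (e for e in events if e.timestamp <= horizon),
        key=lambda e: e.timestamp,
    )


def shard_of(event, shards):
    """Stable shard index, so one key always lands on the same worker."""
    return zlib.crc32(f"{event.table}/{event.key}".encode()) % shards


resolved = {"orders": [5, 10], "users": [7, 8]}
horizon = replay_horizon(resolved)  # "users" is only complete up to 8
```

Anything after the horizon stays in the log until a later resolved timestamp covers it on every table.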


Gandi have something of a cult following, but in my only experience with them they literally lost my domain name during an inbound transfer.

Their response was awful and rude and completely unprofessional. I never got my domain back.

Based on that experience, this incident doesn’t surprise me at all.


I’ve always been a little confused about their cult following given their unfriendly terms — arbitrary domain cancellation based on adult material for example — which are fair terms to have if that’s their ethics but it seems at odds with the typical pro freedom expectations many people in technology hold.


They put a rude word on their homepage, that makes them edgy and cool and anti-corporation!


"No bullshit" is up there on my corpro-speak charts right along with "synergy" and "innovation".

Everyone's website says they're "no bullshit". It's all bullshit.


When my daughter was in high school she was doing an IT subject, and for fun I told her to try using "synergy" in one of her assignments. She got an A; it's a magic word.


"No bullshit" works when it's an SME talking, but once a company reaches a certain size then all bets are off


It was founded by pioneers of the Internet in France who were involved in non-profit/hacker/open source circles, which is where it got its cult following from.

But at the end of the day it's a cheap provider with, ahem, French-style support so I'm not sure what people were expecting out of them.


Any details on this? All I found while searching for this was Gandi explicitly advertising gTLDs designed for adult content...

Do they have that in their terms? Independently of that, do they have a history of doing that?


Why do they have a cult following? I never heard about them and reading all this here, I cannot say I understand why anyone uses them at all.

Edit: I use (and have for a very long time) Namecheap for registration and (recently) Cloudflare for DNS. I used to host all DNS myself, but that became a bit of a pain with many domains as that's definitely not my core business.


They were a freenode sponsor.


I very recently transferred a few domains to Gandi, and they also managed to lose one. I had to contact their customer support and they were able to restore it - it was all very strange. Combined with this incident and their responses on social media I'm getting the feeling that I should move them elsewhere again...


what can you recommend as an alternative?


For domain registrations I use a mix of Namecheap, Cloudflare, GoDaddy and Name.com, and haven’t had issues with any of them.

Gandi is the only domain registrar I’ve had an issue with.


I've been burned by both Namecheap and GoDaddy, along with losing a few domains in the infamous registerfly scam in the early or mid 00s. Namecheap may have been a simple cock-up rather than a systemic pattern of intentionally fucking over every customer. Avoid GoDaddy at all costs.

I consider GoDaddy to be one of the worst companies in existence, as bad as anyone else you can think of, as free of corruption as current ICANN and as fraudulent as registerfly. Clients looking at available domains have found them immediately registered and squatted at {hundreds}% markup. Their incompetence lost me a few domains, and several freelance clients reported similar -- all of whom were paying vastly over the odds for what they were getting. GoDaddy make Gandi look an exemplar of ideal behaviour for behaving as people are reporting in this HN post.

Their previous CEO had domain squatting and a complete lack of personal ethics as sidelines. That's quite apart from their horrific upsells making a simple renewal a 22 page nightmare of deeply dark patterned "no" clicks against atrocious value "offers".


> Namecheap, Cloudflare, GoDaddy and Name.com

Avoid GoDaddy at all costs!


GoDaddy at least is very customer support focused AFAIK.



ditto


Been using iwantmyname for the past few years. Smooth sailing all the way.


easyDNS for domains

Keeping registrar and hosting separated seems like a good idea.


In the year 2020, it's becoming increasingly impossible to trust anyone to do nearly anything (in my opinion of course).

The courts are too expensive. The culture of taking pride in one's work maybe is disappearing.

For the most crucial parts of doing business/living life, we are required to trust someone else. For example, I can't just go and make my own cell phone tower or ICANN.

And yet I can't even trust those entities to get it right.


Decreasing trust increases transaction costs.

There's got to be a measurable (negative) economic impact.


I don't have hosting with Gandi, but I do use them for domains and DNS. I'll be considering migrating my domains from them after this.

Their response to this is exceptionally poor. To say essentially "this could happen to any other web host" is nonsense. I've never had this happen with any of the providers I've used for hosting and I'd be very angry if I had just lost an entire VPS. The fact that they've lost all snapshots as well (which are advertised as backups of the underlying volume) is unforgivable.


I had an incident similar to this with linode, which is why I use and recommend Digital Ocean nowadays.

My machine going away because you had hardware issues isn't my problem, and I'll spend my money on a more competent company.


I had the exact same experience on Digital Ocean. Attempted to resize a VPS, the process got stuck for eternity, and support tells me all data is lost.

Always have your own offsite backups.


To be clear, disk corruption can happen anywhere for many reasons, in particular when VM disks map to local disks on a hypervisor, which gives you fast SSDs without network latency. Probably the resize command had an issue and corrupted the image on disk. Then there's not much that can be done aside from restoring from a backup. Had backups been enabled on the droplets, they would in all likelihood not have been corrupted, since backups with DigitalOcean are stored offsite. In that case they could have been used to restore the droplet.

In some extreme cases, a run of bad luck can ruin things despite multiple levels of redundancy. But that's extremely rare, especially nowadays. However, DO is much larger now than it used to be, so the odds of hearing about extreme accidents increase.

disclaimer: I used to work there.


When I worked at the WordPress hosting division of Copyblogger, we always had issues like these with Digital Ocean. They would email us saying that the node had a problem, and we had to recreate the server on our own.

Good thing we only kept caching servers in Digital Ocean, so those were easily recreated, but that always kept me away from DO, personally.

In fairness to them, though, DO do not claim to keep backup of the servers, as far as I know.


DO has a service to automate backups. You can't download them or snapshots though, so if you want an off-server copy you have to do it yourself.


I use Gandi for domains & DNS too. I've never had any problems so far but I don't want any surprises... Where do you want to migrate? What is a better alternative?


I like Cloudflare and find them to be a very good value proposition. They have a domain registrar now as well, though I haven't tried that yet. https://www.cloudflare.com/products/registrar/


Cloudflare DNS is free, and they support DNSSEC (unlike Digital Ocean). The web UI is good, and there's an API, and Terraform provider.


Most providers (notably: AWS) don't support DNSSEC, because DNSSEC doesn't matter.


I moved a couple Namecheap domains to Cloudflare's registrar when they launched, no complaints here. One domain took a bit longer to transfer, but the first took only a few minutes so I didn't mind it at all. I already used them for DNS so it felt like a no-brainer.


I use Hover for DNS and domain registration, never had a problem and their interface and support is top-notch.


Same here. Wondering what alternative there is. Heard good things about https://porkbun.com/


I'm using Porkbun and like it. I've used Namecheap, Cloudflare, Alibaba Cloud, and Gandi. I prefer Porkbun to all of them, but I'm a fan of simple, no-frills stuff.

I've used support twice when transferring domains into Porkbun and they were good. I transferred a domain out and had no issues. Their 2FA options are really good. They frequently have the best prices around (tld-list.com).


I wish porkbun allowed easier DNS record management. It's very cute, but I can't edit a bind style file, which means a lot of extra clicks.


I used to use Gandi for all my domains. I've switched to OVH though. Their DNS also propagates in like a minute.


Same boat. As much as I hate to give Jeff Bezos another penny I can't look further than AWS for everything at this stage.


Don't worry, if you purchase a domain with a .biz (or 300 other junk-tier tLDs) extension from Amazon Route 53, Gandi still gets paid[1].

[1] https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/re...


Huh. I wonder how they will react to this. Thanks for highlighting that.


FWIW, I've found GCP a pleasure to work with in comparison to AWS.


Same here, of the big three (AWS, Azure, GCP) I found GCP's panel to be the most comfortable. The recent news of unjustified Google account closures and billing mishaps are putting me off moving there, though.


I found their dashboard pretty sluggish in some parts.

I had one weird incident trying to host wireguard on it: I couldn't get it to work reliably even after changing the MTU to suit them or trying other fixes.


I've used it and it's not near as good as AWS. Plus Google have a habit of shutting stuff down so I have basically given up on them for critical stuff.


I've been burnt several times now by smaller players claiming a higher degree of privacy that suddenly charge high fees, sell to a competitor, or sell my data. As of last month, I've moved my domains to Google. Better the devil you know than the devil you don't.


TIL that gandi was bought by a private equity firm around a year ago.[0] This may explain some things...

[0] https://news.gandi.net/en/2019/02/futureofgandi-the-adventur...


Where does it say it was bought? It talks about a new investor:

> we have found a new investor in Montefiore Investment, who have replaced our former shareholder!

Am I missing something?


Interesting.

That's a very strange blogpost.


Interesting reaction. Is the highly negative reaction correlated with US culture, maybe?

I've used them for many years and had several complex support interactions with them.

Their customer service policy is very "API-like" in that you get exactly the t&c you paid for and nothing more. Hand-holding and soothing noises are not included in the t&c. They fuck up you get a refund, you fuck up they'll tell you exactly that. Outside that they're very casual relaxed humans to communicate with.

I find that far more trustworthy (in the mathematical sense) than a "slick" twitter feed.

Politeness does not imply trustworthiness.


Gandi is the absolute worst.

The last time I tried buying a domain through them, they took my money and then demanded "identification" via government ID (citing some bullshit in their ToS). I refused, so they closed my account and took the domain with them.

Based on that, I'm not surprised at all by their CEO's response to this incident[0]:

>If we led you to believe that you had nothing to do on your side when warned multiple times to make your back ups, then we'll have to make it clearer, and stop assuming that it's an industry wide knowledge.

[0]: https://twitter.com/StephanGandi/status/1215287619938062342?...


I had exactly this problem too but with NameCheap. Told them to put their id request and my money somewhere and left for Gandi.

After more than 8 years with Gandi, not had a single issue with them.


I understand people might be upset because they lost data, but as a sysadmin, my reaction is "ooh shit, poor guys, that must be a horrible week"...

And honestly, if you don't keep data of stuff you host on a server provider like this, you kind of get what you deserve...


No you don't. While I agree everyone should have their own backups, you should expect your hosting company to properly replicate and back up their datacenters.


I don't, actually, expect them to do so. But even if I would, and Gandi, here, were doing backups and replications, no one is immune from errors and catastrophes.

Pretending that the cloud is permanent and infallible is extremely dangerous. I would seriously question the competence of any sysadmin relying on this as a base principle.

Sure, they screwed up, but this stuff happens. We should actually be happy it happens "only" on a "small-ish" provider like Gandi and not an entire AZ at Amazon.

Can't wait for that shoe to drop, I'll bring the popcorn, if there's anything left of civilization then...


> Gandi, here, were doing backups and replications

As far as I understand, they only made snapshots on the same machine, which is why there's trouble to begin with.

Considering they're currently "reminding" customers that backups are an industry standard right after losing data due to missing backups I wouldn't just shrug it off.


That's probably because they bought into the sales pitches of the likes of EMC. It's a nice pitch and in most of the cases it works exactly like EMC promises. Snapshots work great, data is always recovered, etc, etc, etc.

The fun, of course, starts that one time when it does not work and you realize that no one looked at the corner case that bit you.


Where does this come from?


That is not the industry standard for web hosting. Never has been, never will be.

Backups aren't free. Replication isn't free. DR isn't free. If a customer isn't paying a premium for them, they aren't getting them. Read the terms of service.


In this case, the customer did pay for it: https://twitter.com/andreaganduglia/status/12152083871699804...

See full thread. Snapshots are marketed as backups.


So?

Intelligent people can argue all day about whether a snapshot should be considered a backup or not, but it won't change the fact that a snapshot doesn't provide any protection from a failure in the underlying storage and it's ridiculously foolish for the owner of data to solely rely on snapshots as their backup strategy.


They literally use the word "backup." I wouldn't _normally_ expect snapshots to function as backups, but once they market them as such, I do. Yeah, sure, it's probably yet another case of a sales team getting over eager and taking over the company, but that's why if you value your ethics _at all_ you keep tabs on WTF the sales are doing.


So you're saying, against your admission of knowing better, that you can be literally swayed that a snapshot is a proper backup in the independent-of-the-original-storage sense, because their documentation equated the two?


The difference between a snapshot being a backup and not being a backup is literally the guarantees made by the provider. If the snapshot feature is documented as a backup, it is DOCUMENTED AS A BACKUP. Unless, of course, I suspect the provider of using the words as a way of confusing me, BUT THAT'S BAD. Like go read yourself a few times, you're literally defending them by claiming it's reasonable to treat them like scammers.


They can document it as anything. A backup has to be isolated; different physical location, different medium, different provider. What if the technical infrastructure works as advertised, but the company goes into receivership for whatever reason?

Having cloud provider X say they moved the bits from one place to another should not be considered a backup by anyone, regardless of what they advertise.


>snapshot doesn't provide any protection from a failure in the underlying storage

That depends on how snapshot storage is implemented by the hosting provider. They can use different storage for it, or tapes or whatever. On AWS I can easily have my snapshots on Glacier or copy them to a different data center.


How do you move your EBS snapshots to Glacier?


Use an Amazon S3 lifecycle.


Can you link any docs for that? I believe a lifecycle is attached to an S3 bucket, and there's no bucket for EBS snapshots as they're tied to EC2.


IIRC you can do this by using AWS Backup. There's a setting in the... Plan? Policy? Sorry, it's been a while and there was a weird mismatch between the terraform documentation and the official Amazon documentation... anyways, there's a setting somewhere that says to move the backup to cold storage after a certain amount of time.



I'd be interested in more on this claim.

- Was this mostly a power loss or a data loss?

- If data loss, did this affect EBS (which has had a claimed annual failure rate of 0.2% - 0.5% or so if I remember) or S3 (much lower failure rate). Remember, EBS WILL have volumes go bad - that's in the docs, they recommend snapshots, aws backup manager etc if you need higher durability.


The sysadmins over there probably have a whole list of stuff that should actually have been done, but management never gave them time to do. Then this happened and they were proven right. Their reward? Working a lot of overtime probably.


> And honestly, if you don't keep data of stuff you host on a server provider like this, you kind of get what you deserve...

While I agree that everyone should have their own off-site backups, this does come across as incredibly crass victim blaming.


At least now we know what kind of service we may expect from Gandi... Shit happens to everyone; it is in the cleaning up that you learn who you're dealing with, is my personal view on that.


> You get what you deserve

Sure, let's blame the victims here; that's effective and helpful.


culpability isn't zero-sum, everyone can have some. some entities deserve a lot, others deserve just a teeny tiny little bit.

for purposes of keeping your data safe, your cloud provider is just one, single, copy of your data. all of their redundancies and backups and whatnot are for _their_ convenience, not yours, regardless of the marketing copy.

(they can decide to intentionally delete your data because they think you didn't pay. no amount of RAID and georedundant backups on their part will help you then.)


Oh god, the victims, really? You host your data on someone else's computer to save on costs and get rid of the burden of dealing with metal and stabbing yourself with screwdrivers , and you're the victim when they fuckup?

Give me a break... It's not like anyone died here. There's a reason I host my own shit. Problems happen, errors are made, and data is lost. It's also your responsibility to deal with data permanence, even if your provider has all the promises in the world.


Well yea that’s why you pay them, to do a job. That payment comes with certain expectations and when they aren’t met you incur cost. In this case downtime and effort and time to restore from your own backup. Victim may be a bit strong but of course it’s Gandi’s fault and not their customers’.


> You host your data on someone else's computer to save on costs and get rid of the burden of dealing with metal and stabbing yourself with screwdrivers , and you're the victim when they fuckup?

A company violates their agreement with you in a way that costs you time, money, and potentially business, and you're not the victim?


Exactly, it's like you somehow give your original private keys to a cloud hosting provider or a service like Gandi, they have a problem and lose your mission critical data and you later blame them for their responsibility.

They are fools on their side for failing to preserve user data, but you end up being the bigger fool for trusting them to do this for you without preserving a backup plan yourself.


Shit does happen, but pretending like it's not a big deal and not providing a solid RCA seems to be what's really annoying about their reaction.


Even if it is a bad practice not to have your own backups, no one is at fault here but Gandi


Gandi is absolutely not the company I expected this to come from.

With that having been said, everyone please stop assuming your data is safe. It’s never safe, but it’s extremely not safe single homed somewhere. Make backups. Anything that’s saved locally on one machine only? Consider it gone until it’s backed up.

Cloud providers may be able to give you better assurances, but if you really care about data give it at least 2 independent homes. I’ve lost data more than I care to admit. BuyVM lost one of my VPSes years ago. Whose fault was it really?

When you are ready to stop kidding yourself about your data, check out some backup solutions. I particularly like Borg Backup:

https://github.com/borgbackup/borg

And if you do not have network attached storage anywhere there are services that provide it as a service.

(Note: I think needless to say it’s also a good idea to back your NAS up to other places too, although I haven’t gotten into this practice yet. Synology supposedly has a lot of features around this.)


The key question is, did Gandi offer an explicit backup service for your data on their plans? I just had a look and I don't see this being offered.

As a former hosting engineer, and at the risk of pissing on everyone's outrage parade: unless an explicit guarantee of a backup is included in your plan's contract, or you can pay for backups as a bolt-on, then if you've lost data it's your fault for not planning for this scenario.

And I mean proper backups where you get, for example, twenty eight days of hourly backups and you can pick a specific version of a file to recover in that 28-day period. And where those backups are stored on different hardware or off-site. We offered this as a bolt-on (in-site and off-site). It was 20 quid a year for in-site, the off-site was a bit more. But a great many customers chose not to pay for this add-on, even despite the great big red bold warning text explaining that unless they paid for this add-on we made no guarantees about the permanence of their data in the event of a storage problem. Guess what....
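To make the "28 days of hourly backups, pick a version" idea concrete, here is a minimal sketch of the pruning and version-picking logic such a scheme needs. The function names and the in-memory list of timestamps are illustrative assumptions, not a description of any real hosting product.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=28)  # window taken from the comment above


def prune(backups, now):
    """Return only the backup timestamps that are still inside the window."""
    cutoff = now - RETENTION
    return sorted(ts for ts in backups if ts >= cutoff)


def pick_version(backups, wanted):
    """Return the newest backup taken at or before `wanted`, or None."""
    candidates = [ts for ts in backups if ts <= wanted]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    now = datetime(2020, 1, 9, 12, 0)
    # Hypothetical hourly backups covering the last 30 days.
    hourly = [now - timedelta(hours=h) for h in range(30 * 24)]
    kept = prune(hourly, now)
    print(len(kept), "backups kept")
    print("restore point:", pick_version(kept, datetime(2020, 1, 1, 9, 30)))
```

The point of the sketch is that "a backup" in this sense is a set of independent restore points, so a bad change made an hour ago can be skipped by picking an older version.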

Now that's not saying we didn't take snapshots of the hosting environment, but they were for internal use and to allow us to recover quickly in the event of something unexpected going wrong, but now and again stuff breaks.

Sure, it's unfortunate some lump of storage hardware has failed and whatever mirrors they may have had have been taken out as well. They possibly could have done better but shit happens sometimes.

You shouldn't rely on an "implied backup" from your service provider; if you want that then you're going to be paying a shedload more for hosting your Wordpress and Woocommerce site. It's up to you to make absolutely sure your data is safe if it's critical to the day-to-day running of your business.

Edit: ok, so this is tucked away in their docs (thanks to itake below):

https://docs.gandi.net/en/simple_hosting/common_operations/s...

But it does say:

> Snapshots do not make a backup of your databases. If you would like to perform a backup of your databases, we recommend you perform an export, or launch a dump script via crontab.

The bottom line...is it guaranteed in your contract? Always check. And as per my follow up comment, those plan prices are just too cheap for that facility to be taken seriously for business continuity. They're a convenience to quickly recover a version of a file, not a serious backup.


> Easily recover backups of previous versions of your website's files, thanks to our automatic Snapshots system. It's free!

https://docs.gandi.net/en/simple_hosting/common_operations/s...

They are supposed to be providing backups.


I believe nobody should count on backups provided by the product that stores your data.

There are different kinds of backups here:

* the ones that are part of the offer, where the provider gives you a convenient way to recover from your mistakes, this is a feature they provide when their services are operational (in this case, the snapshots feature).

* the ones they put in place to mitigate incidents and maintain their SLOs. If you accidentally delete a file, you don't have access to them, they are useless to you. These backups are a means to reach their service level objectives. Nobody can offer you a 100% guarantee in an SLO that they won't lose your data. If someone promises you this, just... don't believe it.

(edit: formatting, typo, mention snapshots in case 1)


Key question, is it guaranteed in your contract?

Also:

> Snapshots do not make a backup of your databases. If you would like to perform a backup of your databases, we recommend you perform an export, or launch a dump script via crontab.

For those plan prices if I was running anything mission critical there then I'd be making darned sure I was squirting copies of my site's dynamic data to somewhere else on a regular basis (and you should also be able to re-deploy your code from local). Even if there was a guarantee, I'd still have a backstop in place. Never underestimate the chance of a good cockup.


The thing is, backup can mean different things. Those free/cheap "backup" snapshot things are obviously the equivalent of vim backup files. It's useful if you screw up and want to revert two hours later.

You have to be foolish to assume you get proper, actual backups for the price of a Coca-Cola can.


"unless they paid for this add-on we made no guarantees about the permanence of their data in the event of a storage problem"

To be honest this is not a good way to do hosting business. If you provide a service called "Simple Hosting", putting backup requirement on customer (when it is your fault) is pretty unfair.

PS: I think price of the product shouldn't affect minimum requirements.


> putting backup requirement on customer (when it is your fault) is pretty unfair.

Then they need to pay more for their "Simple" hosting.

> I think price of the product shouldn't affect minimum requirements.

See above. Sigh.


To be honest your approach looks like:

- Some airline is selling plane ticket and insurance on website. (insurance covers change of plans, rebooking etc, and even if you don't fly that flight, they are booking you another one same day)

- Then when a flight got canceled, telling customers "we rarely cancel flights, please use your insurance. (you should have bought insurance)"

PS: Simple hosting [0] I am referring seems like managed hosting.

[0] https://docs.gandi.net/en/simple_hosting/index.html


That is how it works in real life, you buy additional travel insurance to cover the things that the airline isn't contractually or legally obliged to cover themselves. You just made my point.


> you buy additional travel insurance

Do most people actually do this? I never do.

If an airline cancelled my flight, did not provide alternative arrangements, and cited some legal fine print instead... then I would be very upset. They might be legally in the right, but that wouldn't prevent me from taking my business elsewhere.


Ok, look at it this way, if they cancel your flight they can re-book you on a later flight or give you your money back. But they aren't obliged to reimburse you for the consequential losses, e.g. you missed that wedding, or stag night, or Ferrari day at the track. That's what your insurance is there to cover.


I'm a long term user of Gandi for my domains but have wanted to get off them for some time now.

Can anyone recommend a domain registrar "equivalent" of a Fastmail or Letsencrypt or DNSMadeEasy i.e. truly no bullshit, geek friendly and polished at the same time ?

I'm not too bothered about price. I just want a well run outfit that has a wide selection of TLDs and ccTLDs (and ideally isn't a mega corp like google but is big enough that I don't have to worry about them disappearing overnight).


Posting back here in case it helps someone else. In the end I went with: https://dnsimple.com


NearlyFreeSpeech?


Close but sadly they don't seem to offer ccTLDs :(


easyDNS is pretty good.


Thanks. They look exactly like what I want.


+1 to this.

I've got a few bookmarked, but I haven't tried them: Porkbun, Nuage, Hover, and Namesilo.


Thanks! I'll check these out.


i have been using uniregistry.com for a while and have nothing bad to say.

in another comment on this thread, someone pointed to cloudflares new registrar offering, which also seems good.


Azure Shared Responsibilities [0]

AWS Shared Responsibilities [1]

Flipping a switch that says "Backup" does not mean you are handing your responsibility to them. At most, they will fail to meet their SLA, write you a check according to the ToS and be done with it. At best, you'll be able to bitch about it on Twitter, possibly threaten a lawsuit (you read the ToS?) and still be in the same position because you did not share the responsibility of securing your data.

[0] https://docs.microsoft.com/en-us/azure/security/fundamentals...

[1] https://aws.amazon.com/compliance/shared-responsibility-mode...


> We sincerely apologize for the inconvenience that this situation has caused. This type of incident is extremely rare in the web hosting industry.

Why are they speaking of the "industry" as a whole when they are to blame?

It's even crazier they are not even explaining the source of the data loss and why the "replication systems" didn't help.

IMHO they are trying to sweep this event under the carpet. They should instead explain why they should be trusted in the future and why this would not occur again.


yeah, I worked for several years at a hosting provider, and I can tell you for a fact that this wouldn't have happened there.

They're 100% virtualized and keep backups of all those machines. In addition, you can purchase a package so that YOUR backups are automatically backed up to 2 different datacenters. Between the two of those solutions, there would be a way forward.

I don't really know what Gandi is so I can't speak to them directly, but this is a solvable problem.


Replication is explicitly designed to correct issues associated with bitrot and add redundancy.


Replication is not a backup as was already mentioned. A great example of this is when the KDE project almost lost all of their Git repos because they were mirroring a corrupted copy of the data. https://www.phoronix.com/scan.php?page=news_item&px=MTMzNTc
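A toy illustration of the distinction, assuming nothing about how KDE or Gandi actually set things up: a replica faithfully mirrors every change, including the bad ones, while a point-in-time backup keeps an older independent copy. The `MirroredStore` class is purely hypothetical.

```python
import copy


class MirroredStore:
    """Toy primary with a synchronous replica plus point-in-time backups."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.backups = []  # list of independent deep copies

    def write(self, key, value):
        self.primary[key] = value
        self.replica[key] = value  # replication: every change is mirrored

    def delete(self, key):
        self.primary.pop(key, None)
        self.replica.pop(key, None)  # ...including destructive ones

    def take_backup(self):
        self.backups.append(copy.deepcopy(self.primary))


if __name__ == "__main__":
    store = MirroredStore()
    store.write("repo", "good data")
    store.take_backup()
    store.write("repo", "corrupted")  # bad write propagates to the replica
    print("replica:", store.replica["repo"])
    print("backup: ", store.backups[-1]["repo"])
```

After the bad write, the replica holds the corrupted value and only the backup still has the good one.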


Fortunately, git is a DVCS, so anyone who checks out a repo has a complete copy of it.

Now, granted, it'd be a huge pain to track down all the people who had copies of the 1,500 different repos, and try to find as up-to-date as possible of a version of each, but I doubt they got anywhere close to potentially losing all their source code.

Incidentally this shows why it's a good idea to sync your repo to GitHub, even if the canonical repo is elsewhere: in addition to the usual reasons of incentivizing some contributors by giving them "GitHub credit", and increasing visibility of your project's code, GitHub can serve as a backup!

Also, on a side-note, 1,500 separate repositories?! That sounds way overkill. I wonder if they'd benefit from having a monorepo.


> 1,500 separate repositories?! That sounds way overkill. I wonder if they'd benefit from having a monorepo.

No it doesn't. Github has at least 20 million public repositories. Would they benefit by combining them into a monorepo?


GP is talking about the KDE project, not the entirety of GitHub.

And yes, a monorepo is usually the best approach in most cases for a project or even an entire company.


A backup is a replication of the live dataset, although, usually out of sync to be useful when the main dataset goes bad.


You might want to read the Wikipedia definition, because you're technically mistaken.

https://en.m.wikipedia.org/wiki/Backup


That's a long article; please quote the part you're referring to so we're all looking at the same text.

> a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event

Since a "replica" is a copy, that seems technically correct.


all fruits are apples because apples are fruit, right?


Your definition claims "a backup is a copy."

The original claim was "a backup is a replication of the live dataset, although, usually out of sync to be useful when the main dataset goes bad."

The only thing that makes a replica special is that it's in sync. Once you add the caveat that it's out of sync, it's just a copy.


I'm not, you can't do backup without replicating data, hence, if you do backup, you are doing replication.


It can be, it just doesn't have to be

Let's say a typical admin of a small shop wants to backup his postgres database.

The first thing he'll use is probably pg_dumpall which he'll output to a storage.

No replication involved. The backup is just a bunch of sql statements to recover the last known state of the database. It's a different kind of format however, which -by definition- isn't a replica anymore.

(And this process has several caveats, one of which is that it can produce unusable dumps in some rare cases and isn't complete. users, triggers etc aren't dumped iirc.. could be wrong there)
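For reference, a minimal sketch of what such a small-shop dump job might look like when wrapped in a script. The host, user and output directory are placeholders I made up; `-h`, `-U` and `-f` are standard pg_dumpall options.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def dump_command(out_file, host="localhost", user="postgres"):
    """Build the pg_dumpall invocation; host and user are placeholders."""
    return ["pg_dumpall", "-h", host, "-U", user, "-f", str(out_file)]


def run_dump(backup_dir):
    """Write a timestamped plain-SQL dump and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_file = Path(backup_dir) / f"cluster-{stamp}.sql"
    # check=True raises CalledProcessError if pg_dumpall exits non-zero,
    # so a failed dump is not silently treated as a good backup.
    subprocess.run(dump_command(out_file), check=True)
    return out_file


if __name__ == "__main__":
    print(" ".join(dump_command("/backups/example.sql")))
```

A dump like this only counts as a backup once the file is copied somewhere other than the database host and a restore has actually been tested.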


> It's a different kind of format however, which -by definition- isn't a replica anymore.

All non-trivial replication has to cross machine boundaries. To transmit to another machine, you have to use a serial format since there are no pointers on the wire. So insisting that a replica must be the same format prohibits the concept of replication in practice.


So, that admin dumped his database but didn't know that the data was using a custom locale which makes recovery troublesome.

How is it a copy if it can't recover the original in some cases?

We're not talking about a compressed archive here, it's a (incomplete) step-by-step instruction to recreate the data. If anything out of norm happens, it's gonna fail-possibly silently.


> How is it a copy if it can't recover the original in some cases?

Do you think a gun is not a gun if it sometimes jams?

> We're not talking about a compressed archive here

I think we have two camps. Mine is considering "copy", "backup", "replica" to be broad categories that are distinguished by simple mathematical or technical properties. For instance, I'd consider a device that copied a single bit to be "copying," even though it's arguably just a wire.

The other camp has very specific products and tasks in mind. A replica is associated with distributed computing, while a backup is something a systems administrator makes as part of disaster recovery.


> Do you think a gun is not a gun if it sometimes jams?

but a pgdump is like a step by step guide for building the gun, leaving out a lot of the process... how can you honestly call that a gun?

but i guess we'll have to agree to disagree. which proves that it was a discussion we shouldn't have started i guess.


> (And this process has several caveats, one of which is that it can produce unusable dumps in some rare cases and isn't complete. users, triggers etc aren't dumped iirc.. could be wrong there)

Triggers are dumped, users need pg_dumpall (as they live across multiple databases, same with tablespaces).


> No replication involved. The backup is just a bunch of sql statements to recover the last known state of the database.

You probably should look up the definition of “replication”.


The out of sync part is rather important when something accidentally gets deleted from the live dataset.


In the last few years, I have seen many people confuse replication with backups. People see them as the same thing, but they really aren't. Even with snapshots, if the devices are the same, they might have the same firmware bug, etc.


Just to further explore that a bit, would you say replication adds independent copies for failures of media, while backup adds copies made by independent software against failures of process / software / media.


Replication increases risk of data loss when implemented incorrectly, because added resources increase the probability of bit errors. This applies to both replicated disks (RAID) and servers. Replicated servers must use ECC memory as well as checksum blocks and periodically scrub data to ensure integrity (e.g. what ZFS does for you). If they don't, then a bit error can corrupt the data on all servers, because you have no way of knowing which copies are pristine or how to piece together pristine parts.
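A rough sketch of the checksum-and-scrub idea, written as a toy and not as a description of ZFS internals: a checksum recorded at write time lets you tell which replica is still pristine and repair the others from it. The `scrub` function and the SHA-256 choice are my own assumptions.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum stored alongside the block when it is first written."""
    return hashlib.sha256(block).hexdigest()


def scrub(replicas, expected):
    """Repair replicas in place; return indexes of the copies that were fixed.

    replicas: list of bytes objects, one copy of the same block per server.
    expected: checksum recorded when the block was written.
    """
    good = next((r for r in replicas if checksum(r) == expected), None)
    if good is None:
        raise ValueError("no pristine copy left; restore from backup")
    repaired = []
    for i, block in enumerate(replicas):
        if checksum(block) != expected:
            replicas[i] = good
            repaired.append(i)
    return repaired


if __name__ == "__main__":
    block = b"customer data"
    expected = checksum(block)
    replicas = [block, b"customer dbta", block]  # second copy has a flipped byte
    print("repaired copies:", scrub(replicas, expected))
```

Without the stored checksum there is no way to decide which of the differing copies is the right one, which is the failure mode described above.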


Replication covers durability (and availability) in face of system or media failure but does nothing whatsoever against software bugs or human error.


The "no bullshit" motto is mentioned a few times here. A motto is just another marketing device – a way for a company to pretend to have any sort of principles beyond making as much money as fast as possible.

Why would anyone believe a motto is anything other than a marketing device? It is only believable if people follow it contrary to pragmatism. Any company is eventually going to have a fair share of people who believe being pragmatic is more important than their motto. And in Western culture at least it's usually considered rude to bring up the "big guns" and have a fundamental values discussion when everybody just wants the meetings to end and to start making more money.


In fact a motto is usually chosen to cover a weak spot. So "No Bullshit" reads to me as "We're Kinda Cowboys". "We Care" - "People Know We Don't care".

Fujitsu – “The possibilities are infinite” ... "The Ways in Which we can Screw this Up Are Infinite"

Intel – “Leap Ahead” and “Sponsors of Tomorrow”. "We've got to protect our entrenched position".

LG – “Life's Good”. "Life is Actually Objectively Bad".

Google - "Don't be Evil". "How We Actually Make Money is Evil But Our Mission is Good".


shit happens...but the way their philosophy is fine tuned makes me wonder..

Above all, "no bullshit" is our golden rule—to treat our users how we want to be treated. It's a promise to respect your rights and to level with you about our shortcomings.

https://www.gandi.net/en-US/no-bullshit

ex: https://twitter.com/andreaganduglia/status/12151991477012316... (thanks op)

We will listen to you, and be honest in our replies, even if it means you won’t always like what we say.


> We will listen to you, and be honest in our replies, even if it means you won’t always like what we say.

They are actively treating their customers like shit, and that tone starts at the top. No bullshit does not give creative license to be assholes to people that are panicked because of something you directly caused.


That's why I keep all my DNS configuration in DNSControl and push the results to Gandi (and NameDotCom, and Route53, and GoogleDNS, and AzureDNS)

https://github.com/StackExchange/dnscontrol

(Terraform users have a similar benefit)


Sounds really cool. Do you also have NS records for all of those? or just in case you want to switch-over? (can you actually hold a domain on multiple registrars?)


I hate to say it but Gandi seems like they’re in a quality freefall. I had a domain there a year or two back because they were one of the only registrars that supported that particular extension... and man, so many problems just with simple tasks like updating the WHOIS info and credit card for renewal.

This is basic stuff.


I had a co-worker who was super chill during outages; especially at night, we were 10-15 people on the call fixing issues related to his work almost monthly.

those outages cost millions of euros, and he never picked up his phone at night. once I asked him why he never picked up, and he told me:

"I used to be a general surgeon, when someone calls me people die. Relax, nobody is dying during our outages."

now I think I am taking myself(and my work) too seriously.


It's likely the only way to stay sane in a corporate environment. The problem is, you don't need many people practicing that before it drags everyone down to the same level. You can choose to try to continue your quest of doing work seriously (this will likely drive you insane over the years), or to join in that kind of negligence (goodbye spine), or to quit. In the end it got us where we are now, a world filled with fake companies selling their fake little products as if they were quality, and making a game of disrespecting their own customers. Pure facade, been there...

I've done it before, but I'll recommend to you Scott Adams' book, The Dilbert Principle for some light reading about forces like that at work.


Given that the statistical economic value of human life is around $7m, maybe he should start picking up his phone.


"I used to be a general surgeon, when someone calls me people die."

Hence he's not a surgeon anymore.


From the incident timeline:

> we have a problem to import zfs pool on the unit storage

I really want to know what went wrong to a) break ZFS b) prevent recovery from backup.


Re myself, it looks like they're using FreeBSD-based ZFS filers with iSCSI/NFS exports using a user-space NFS server:

* https://news.gandi.net/en/2019/09/exporters-detect-micro-inc...

> Gandi’s storage infrastructure consists of two environments: one for IaaS and one for PaaS. Both are based on FreeBSD-based storage units (filers), that stock each volume (disk) as though it were a ZFS volume.

* https://news.gandi.net/en/2019/03/tracking-a-storage-issue-l...

* https://www.bsdcan.org/2016/schedule/attachments/351_FreeBSD...

No mention of what they're doing for backups / "replication systems", unsurprisingly/unfortunately. I'm anxious to know what the failure mode for `zfs send | zfs receive` replication is here?


Sounds very much like they weren't doing zfs send | zfs receive to anything sufficiently physically separated. For example, if you send and receive in the same pool, it's replication but still leaves you vulnerable to issues where the pool can't be imported due to corruption in the wrong places (it can happen) or significant hardware failure (eg a PSU fault that takes out too many of the drives in the pool).
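For contrast, a minimal sketch of sending a snapshot to a pool on a different machine over SSH. The dataset, host and target names are invented for illustration, and nothing here reflects Gandi's actual setup; `zfs send`, `zfs receive -F` and the pipe through `ssh` are the usual building blocks.

```python
import subprocess


def send_cmd(snapshot):
    """Command that serialises a snapshot to stdout."""
    return ["zfs", "send", snapshot]


def receive_cmd(host, target):
    """Command that restores the stream into a dataset on another machine."""
    return ["ssh", host, "zfs", "receive", "-F", target]


def replicate(snapshot, host, target):
    """Pipe `zfs send` into `zfs receive` on a remote host."""
    sender = subprocess.Popen(send_cmd(snapshot), stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive_cmd(host, target), stdin=sender.stdout)
    sender.stdout.close()  # let the sender see SIGPIPE if the receiver dies
    receiver_rc = receiver.wait()
    sender_rc = sender.wait()
    if sender_rc != 0 or receiver_rc != 0:
        raise RuntimeError(
            f"replication failed: send={sender_rc} receive={receiver_rc}"
        )


if __name__ == "__main__":
    # Hypothetical names; prints the pipeline instead of running it.
    print(" ".join(send_cmd("tank/customers@2020-01-08")), "|",
          " ".join(receive_cmd("backup-host", "coldpool/customers")))
```

The part that matters for this incident is the target: a separate host with its own pool, disks and power, so that a pool that refuses to import does not take the copy down with it.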


last update says they were able to restore a version

Updated on Thursday, 9:58 PM +0200: we're not sure we will be able to provide the data but we were able to recover a version of the filesystem from right before the crash

I have been using them for DNS and some minor hosting for a long time and I will stay with them. I think it's important to avoid the monoculture/centralisation which is otherwise happening.

Sure Gandi has their flaws, they are humans.

I expect they will do a proper post-mortem on what went wrong and how they managed to fix it. Seems they were using ZFS and relied on it a bit too much. Or if they indeed managed to restore the last snapshot, then their only error might have been the classic one of underestimating how long restoring/investigating several terabytes takes even on modern HW.


There have been a number of threads suggesting places like Gandi over AWS because they are so much cheaper. I've always been skeptical about building key apps on these types of places but folks INSIST it's the right choice.

3TB at Gandi costs $6 + you get compute with it. 3TB of bandwidth at AWS might be $270.
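The $270 figure is consistent with an egress price of roughly $0.09 per GB. That rate is my assumption for the sake of the arithmetic, not something stated above, and real pricing is tiered.

```python
EGRESS_USD_PER_GB = 0.09  # assumed rate, not taken from the comment
GB_PER_TB = 1000          # decimal terabytes


def egress_cost(terabytes, usd_per_gb=EGRESS_USD_PER_GB):
    """Flat-rate bandwidth cost in USD for the given number of terabytes."""
    return terabytes * GB_PER_TB * usd_per_gb


if __name__ == "__main__":
    aws = egress_cost(3)
    print(f"3 TB of egress at the assumed rate: ${aws:.2f}")
    print(f"ratio against a $6 host: {aws / 6:.0f}x")
```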

Has anyone tried this instead of using cloudfront etc? Get 100 $6 hosts and pump out content for your ipv6 connecting clients etc?


This seems like data has been lost from servers hosting sites/services.

Since Gandi is mostly known for domain registrations and DNS, I'm curious whether you (as an individual who hosts websites/online services somewhere on the web) back up your site's DNS records periodically (or whenever they're changed). What if your authoritative name server lost data and all the caches of those records across geographies expired while you were asleep/away? If you do back these up regularly, how do you do it in an automated way on a *nix system? I found this article [1] when I searched for this, but it's not a simple shell script. The scripts that I did find on some of the Stackexchange sites seemed to have specific subdomain names hardcoded.

[1]: http://www.programblings.com/2012/07/23/do-you-back-up-your-...


Gandi has an API and lets you download entire zonefiles if they host your DNS.
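
One way to automate that from cron, sketched under assumptions: `fetch_zone` is a hypothetical placeholder for whatever returns your zone as text (a registrar's zonefile export, or an AXFR if your provider allows it), and the file naming is made up. The script only writes a new timestamped copy when the zone content has changed.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional


def backup_zone(domain: str, fetch_zone: Callable[[str], str],
                backup_dir: Path) -> Optional[Path]:
    """Save the zone text for `domain` unless it matches the last saved copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    zone_text = fetch_zone(domain)  # hypothetical fetcher supplied by the caller
    digest = hashlib.sha256(zone_text.encode("utf-8")).hexdigest()

    # A small state file remembers the digest of the most recent backup.
    state = backup_dir / f"{domain}.latest.sha256"
    if state.exists() and state.read_text(encoding="utf-8").strip() == digest:
        return None  # nothing changed since the last run

    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    path = backup_dir / f"{domain}.{stamp}.{digest[:12]}.zone"
    path.write_text(zone_text, encoding="utf-8")
    state.write_text(digest, encoding="utf-8")
    return path
```

Because nothing domain-specific is hardcoded, the same function can be looped over a list of domains, and the backup directory can live in a git repo or be synced elsewhere.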


I co-founded a large Dropbox-like white label product. We used AWS, and especially S3, for storage.

After many, many years of experience with systems, I made sure we had as many ways to recover user data as possible. The initial solution was a large Postgres database for all the metadata/indices and S3 for the actual storage.

Despite much pushback, we built in little things like an individual meta file on the file system for each file we stored. That way, if we lost the Postgres DB for any reason, we could create a script to rebuild the DB and restore access, avoiding massive counts of orphaned files. A simple and probably stupid solution but...

Well guess what - the DB got corrupted and after some ado, we restored all access and none of our customers lost anything.

No it’s not full backups but...
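
The per-file meta file idea can be sketched roughly like this. It is not the actual implementation: a local directory stands in for the real storage, and the metadata fields are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def store_blob(root: Path, blob_id: str, data: bytes, meta: Dict[str, str]) -> None:
    """Write the blob and, next to it, a small JSON file describing it."""
    root.mkdir(parents=True, exist_ok=True)
    (root / blob_id).write_bytes(data)
    (root / f"{blob_id}.meta.json").write_text(json.dumps(meta), encoding="utf-8")


def rebuild_index(root: Path) -> Dict[str, Dict[str, str]]:
    """Recreate the metadata index by scanning the sidecar files."""
    suffix = ".meta.json"
    index: Dict[str, Dict[str, str]] = {}
    for meta_path in sorted(root.glob(f"*{suffix}")):
        blob_id = meta_path.name[: -len(suffix)]
        if (root / blob_id).exists():  # skip metadata whose blob is gone
            index[blob_id] = json.loads(meta_path.read_text(encoding="utf-8"))
    return index
```

The trade-off is one extra small write per stored file in exchange for an index that can be regenerated from the storage alone.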


The problem with cheap hosting is they want to use backups as an upsell, but you should still have backups to cover the company’s ass even if the customer doesn’t get to use them. 123 Reg also lost a load of customers’ VPSes a year or two ago, thanks to a faulty script.


Not excusing them, but...

With modern container hosting you really should be able to make your own backups, even with cheap VPS hosting. There is no reason anymore to live in a world where you lose anything when a server goes down.


As a customer, I only have good things to say about Gandi.net but have to admit this is subpar customer communication right there.

Lost 3 sites built with WordPress. Will rebuild them as static sites, with the repo kept separate from the host. No more database. Lesson learned.


In the world of cloud, this should be pretty trivial. Upload your daily dumps / asset metadata to S3/GCS/ABS.

Set a retention policy, i.e. even if someone ran a delete command, nothing would be deleted. Only someone with retention lock permissions can remove the locks and delete.
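
For S3 specifically, a retention lock like this could look roughly as follows. This is a sketch under assumptions: the bucket and key names are made up, the bucket must have been created with Object Lock enabled, and the function only builds the arguments rather than calling AWS.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def locked_upload_args(bucket: str, key: str, body: bytes, retain_days: int,
                       now: Optional[datetime] = None) -> Dict[str, Any]:
    """Build keyword arguments for an S3 PutObject call with a retention lock."""
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # GOVERNANCE mode: only principals with a special bypass permission
        # can delete this object version before the retain-until date.
        "ObjectLockMode": "GOVERNANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
    }


if __name__ == "__main__":
    # Hypothetical bucket and key, for illustration only.
    args = locked_upload_args("my-backups", "db/2020-01-09.sql.gz", b"...", 30)
    print(args["ObjectLockMode"], args["ObjectLockRetainUntilDate"].isoformat())
```

With boto3 the dictionary would be passed as `boto3.client("s3").put_object(**args)`. GCS and Azure have their own retention features with different names.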

There is cold storage and other things that are even cheaper, but cloud object storage is already so cheap per GB it’s ridiculous.

I think they make most of the margins on bandwidth.

Losing customer data, all customer data, is pretty ridiculous. I can understand downtime. I can understand losing a day of changes. But everything? That’s just unacceptable business.


I have a custom domain with Gandi and take advantage of their mail forwarding option to forward the emails sent to the custom domain (my “no lock-in” email address) to my personal Gmail account.

Considering how critical email is for me, seems like I won’t be trusting their MX servers to process all my inbound mail anymore and will soon be looking for another solution that works well with Gmail (don’t want to pay for GSuite), and possibly also transfer my domain to another registrar.

That support tweet is in such bad taste.


Depending on your email volumes, mailgun could be a viable alternative


Thanks, I'll absolutely look into this soon.


gandi's response notwithstanding: email is hardly reliable

unless you run all the MXes (and can prove otherwise): you're likely having emails dropped all the time already


For sure, but everything is relative. Ignoring for a second the lock-in factor of using a @gmail.com address, I would trust Google's MX servers any day over Gandi's, especially after this last incident (trust == reliability in this context).


Google's MXes are notoriously strict and drop or permanently delay emails all the time for reasons beyond your control as a recipient

an example from 5 minutes ago from one of my MX'es (which only forwards, after heavy greylisting and spam filtering):

Jan 9 18:05:59 mail postfix/smtp[26197]: to=<ABC@gmail.com>, orig_to=<XYZ>, relay=alt1.gmail-smtp-in.l.google.com[209.85.233.26]:25, delay=25000, delays=25000/0.01/1.6/0.16, dsn=4.7.0, status=deferred (host alt1.gmail-smtp-in.l.google.com[209.85.233.26] said: 421-4.7.0 Our system has detected that this message is 421-4.7.0 suspicious due to the very low reputation of the sending domain. To 421-4.7.0 best protect our users from spam, the message has been blocked. 421-4.7.0 Please visit 421 4.7.0 https://support.google.com/mail/answer/188131 for more information. ABC - gsmtp (in reply to end of DATA command))

that's not my reputation (which is high), that's the reputation of the sender's From address

and it doesn't send it to the spam folder, it just delays the email forever until my MX gives up
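
if you want to count how often this happens on your own forwarder, a rough sketch for pulling the interesting fields out of postfix/smtp log lines like the one above (the regex only covers this line shape, it is not a full postfix log parser):

```python
import re
from typing import Dict, Optional

# key=value pairs of interest; \b keeps "orig_to=" from matching as "to=".
FIELD = re.compile(r"\b(to|relay|dsn|status)=([^\s,]+)")


def parse_delivery(line: str) -> Optional[Dict[str, str]]:
    """Extract recipient, relay, DSN code and status from a postfix/smtp line."""
    if "postfix/smtp[" not in line:
        return None
    fields: Dict[str, str] = {}
    for key, value in FIELD.findall(line):
        # Keep the first occurrence and drop the <> around addresses.
        fields.setdefault(key, value.strip("<>"))
    return fields if "status" in fields else None


def is_temporary_failure(fields: Dict[str, str]) -> bool:
    """4.x.x DSN codes are temporary failures that the sender retries."""
    return fields.get("status") == "deferred" and fields.get("dsn", "").startswith("4.")
```

grouping the deferred entries by `relay` shows which destinations keep pushing mail back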


Interesting. Would you be of the opinion that sending mail from my personal @gmail.com address would then have much higher chances of being successfully delivered to other people (most of which will inevitably be at another @gmail.com address) than those sent from my "portable" @custom.domain address?


if you're sending it inside the gmail interface to a gmail address I doubt it makes any difference

sending to outside gmail I suspect it will count against you (though major providers likely have special treatment for Google's MXes)


This is like living in an alternate universe. I've been heavily involved in all things programming and webdev for years, following trends and whatnot, and this is literally the first time I'm hearing of this particular company. Can someone briefly explain what is (was?) so special about them that they attracted the HN crowd? Why would I buy domain from them when something like namecheap, even google domains exists? Why would I even host something there?


> Why would I buy domain from them when something like namecheap, even google domains exists? Why would I even host something there?

If you're in Europe, they're cheap for many European countries' domains.

Back 10-15 years ago they were special because it felt like a hacker kind of company. They gave free WHOIS privacy, what seemed like good DNS control/UI at the time. But it was the WHOIS privacy that got me onto them.

I still use them because they're around half the price for .co.uk than many registrars - and many others I've used have become more rubbish than Gandi has.

All my DNS is hosted elsewhere now, and I never understood why Gandi introduced hosting et al. I've never used it and never would, it seemed a terrible diversification for a good domain registrar.


Can you explain how you not using their hosting makes it a terrible diversification?

I think the majority of people buy domains to host websites, so it makes sense that they would want to set one up using one-click WordPress or something similar.


It wasn't their skill-set. They had problems with their hosting from the start; reliability problems IIRC, not data loss.


I wrote out a list of things I needed in a domain registrar and once you include U2F logons and DNSSEC support, you find yourself in a very limited space.


News post about it:

https://news.gandi.net/en/2020/01/major-incident-on-our-host...

A site of mine is also hosted on their PaaS in Luxembourg, but luckily it was not affected. Probably my site was on another storage unit ("on one of our ZFS storage units").

PS I also always thought that the snapshots were backups.


I have about a hundred domains registered at Gandi. I used to like the former management interface, but I really hate the new one.

Is there a registrar you would recommend as an alternative? I don't need DNS hosting; give me nameservers and glue records and I'm OK.

The main selling points are stability, transparency and simplicity. I don't care if it's not the cheapest.


Gandi is never really impressive, but they're one of the few registrars where I can get .af domains without a hassle.


All my domains are registered through Gandi. What good registrars would you suggest? I'd like to move them out.


I have some domains here, mostly secondary domains to not have all my eggs in my namecheap basket (e.g. if anything happens to namecheap or my namecheap account).

Will likely transfer those to elsewhere after this. Probably Name.com, I guess.


(1h22m before this comment)

> Updated on Thursday, 9:58 PM +0200:

> we're not sure we will be able to provide the data but we were able to recover a version of the filesystem from right before the crash

Maybe it's not all gone.


> The assessment is taking a long time because there are several TB of data on the filer

Is that a lot of data? That sounds like a very small filer that could have easily been backed up.
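
For a rough sense of scale, here is a back-of-the-envelope sketch of how long a full copy takes at a sustained rate. The 3 TB size and the 125 MB/s rate (about a saturated 1 Gbit/s link) are assumptions for illustration, not figures from Gandi.

```python
def transfer_hours(terabytes: float, megabytes_per_second: float) -> float:
    """Hours needed to move `terabytes` (decimal TB) at a sustained rate."""
    total_megabytes = terabytes * 1_000_000
    return total_megabytes / megabytes_per_second / 3600


if __name__ == "__main__":
    # Assumed numbers: 3 TB over a saturated 1 Gbit/s link (about 125 MB/s).
    print(f"{transfer_hours(3, 125):.1f} hours")  # about 6.7 hours
```

So a full copy of a few TB is hours of work, not days, as long as the source pool can be read at all.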


Regardless of any of the technical aspects of this disaster, the attitude of the company and its customer service means I will be staying far away from them.


For not the first time I'm left thinking that the "big filer" model is not such a great idea :(


Always make your own backups. Shit is going to happen sooner or later.


What kind of data was affected? Were email messages affected?


'Gandi' means bad in Hindi. Some coincidence huh!


Does anyone know what Gandi is using as a “filer”?


ZFS by the looks of the status updates:

>we have a problem to import zfs pool on the unit storage. Our engineers are still working on it.


There's an upside to knowing your data is not backed up somewhere, and that when you delete it, it's really gone. They should offer that as privacy-conscious hosting.


[flagged]


Please don't.

We detached this comment from https://news.ycombinator.com/item?id=22002923 and marked it off-topic.


Hey guys, sorry for being a philistine about this, but does that mean we have lost our domain name, and how can we migrate it to another hosting platform? Cheers


If you only had a domain you're likely not affected (metadata). If you also have a website hosted there (data), you may be.


Just a quick question: how do I transfer my site address to another hosting company, or is that lost too? Sorry for being a philistine about this... Cheers



