We are not using Cloudflare, but our domain is also not accessible. We are using DigitalOcean's DNS service to propagate our IP. Does DigitalOcean's DNS service depend on Cloudflare?
An SLA of 100% simply means you agree to compensate your customers (as specified, usually with credit) if your service is down at all, nothing more.
Also, SRE here but not for Cloudflare -- I've never seen SREs directly involved in externally published SLAs; they usually come from legal. We deal with SLOs on more fine-grained SLIs than overall uptime.
SRE - Site Reliability Engineer (a term Google came up with that's been adopted elsewhere). Google defined it approximately as what happens when you apply software engineering practices to what was traditionally an operations function.
SLO - Service Level Objective - the service level you strive for. If the measured level is higher than the objective, you have room for experimentation, etc.
SLI - Service Level Indicator - the actual metric(s) you use to measure a service level (latency, error rate, throughput, etc.)
SLA - correct. That’s the contract between the operator and the users which describes the penalties for not meeting the agreed-upon SLO.
SLO - service level objective, the stated availability (or latency, durability, etc.) of the service. Usually expressed as a value over a period of time (e.g. 99.9% availability as measured over a moving 30-day average). The SLO is measured by the SLI.
SLI - service level indicator. Simply, the direct measurement of the service (i.e. metrics).
SRE - Site Reliability Engineer, usually a member of a team who is responsible for the continued availability of the service and the poor sap who gets paged when it breaches SLO or has an outage or other impactful event.
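To make the SLO/SLI relationship above concrete, here is a minimal Python sketch. The function names `error_budget` and `availability` are made up for illustration, and the 99.9% over 30 days figure is just the example from the comment above, not any particular provider's objective.

```python
from datetime import timedelta


def error_budget(slo: float, window: timedelta) -> timedelta:
    """Downtime allowed within `window` before the SLO is breached.

    `slo` is a fraction, e.g. 0.999 for "99.9% availability".
    """
    if not 0.0 <= slo <= 1.0:
        raise ValueError("slo must be a fraction between 0 and 1")
    return window * (1.0 - slo)


def availability(good_requests: int, total_requests: int) -> float:
    """A simple availability SLI: the fraction of requests that succeeded."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


window = timedelta(days=30)
budget = error_budget(0.999, window)          # the SLO, turned into allowed downtime
measured = availability(999_500, 1_000_000)   # the SLI, as actually measured

print("error budget:", budget)
print("SLO met:", measured >= 0.999)
```

With a 99.9% objective over 30 days the budget works out to roughly 43 minutes of downtime. With a 100% objective the budget is zero, which is why any outage at all triggers the penalty clause.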
I'm not sure you and your parent understand what an SLA means. It's an agreement that, when broken, incurs a penalty.
They aren't saying they guarantee 100% uptime. They're saying they'll pay you for any downtime. It's literally the 3rd paragraph:
> 1.2 Penalties. If the Service fails to meet the above service level, the Customer will receive a credit equal to the result of the Service Credit calculation in Section 6 of this SLA.
(Most people I know consider them meaningless marketing BS that's really just meant to trick people or satisfy some make-work checkbox)
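The "pay you for any downtime" reading can be sketched in a few lines of Python. This is a hypothetical pro-rata formula, not the Service Credit calculation from Section 6 of the SLA (which is not quoted in this thread); `service_credit`, its parameters, and the fee are all invented for illustration.

```python
def service_credit(measured_uptime: float, threshold: float, monthly_fee: float) -> float:
    """Hypothetical pro-rata credit. NOT the Section 6 formula from the SLA.

    The SLA threshold only decides *when* a credit is owed; the service
    itself can still go down regardless of what the threshold says.
    """
    if measured_uptime >= threshold:
        return 0.0
    shortfall = threshold - measured_uptime
    return monthly_fee * shortfall


# With a 100% threshold, any downtime at all produces a credit.
print(service_credit(measured_uptime=0.999, threshold=1.0, monthly_fee=200.0))

# With a 99% threshold, the same month produces no credit.
print(service_credit(measured_uptime=0.999, threshold=0.99, monthly_fee=200.0))
```

The threshold is just a parameter: setting it to 100% changes how often credits are paid, not how often the service is up.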
> Cloudflare ("Company") commits to provide a level of service for Business Customers demonstrating: [...] 100% Uptime. The Service will serve Customer Content 100% of the time without qualification.
This is a legal commitment to provide 100% uptime. They are guaranteeing 100% uptime and defining penalties for failing to meet that guarantee. The fact that a penalty is defined does not stop it from being a guarantee.
No, this SLA is a legal commitment to give you credits when Service uptime falls below a certain threshold. The threshold could be anything - 99%, 50%, 100%, etc. Importantly, Cloudflare is not under a legal obligation to provide the Service at or above the agreed threshold; it's under a legal obligation to give you Credits when the Service uptime is below that threshold.
"Service Credits are Customer’s sole and exclusive remedy for any violation of this SLA."
> This is a legal commitment to provide 100% uptime. They are guaranteeing 100% uptime
I don't think you know what a guarantee is.
For example, when you buy a new car you get a guarantee that it won't break down. Are they claiming it won't break down? No, of course not. What a guarantee means is that they'll fix it or compensate you if it does.
Looks like it supports the parent's opinion:
commit - bind to a certain course of policy. It's a legal obligation, not a statement about guarantees in the physical world (like "this alloy won't melt below t°C").
I can completely understand your frustration, but even the top CDNs can have outages of one form or another. If site uptime is important, check out https://www.cdnreserve.com/ - it's built on the design principle that the likelihood of two separate platforms having an outage at the same time is close to zero.
Cloudflare going down is one of the things which keeps me awake. My main complaint about Cloudflare is that they are so good at everything they offer that we've become reliant on them for everything.
Exactly, but the likelihood of two networks going down at the same time is close to zero. Check out: https://www.cdnreserve.com/ We rolled it out to complement top CDNs.
True, they're usually due to issues with BGP routes.
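The "two platforms at once" claim is easy to check with some arithmetic, sketched below in Python. It rests on two assumptions that are not established anywhere in this thread: that the providers fail independently of each other, and that failover between them is instant and itself never fails. The 99.9% figures are placeholders, not measured numbers for any real CDN.

```python
def combined_availability(*availabilities: float) -> float:
    """Availability of a setup that is up whenever at least one provider is up.

    ASSUMES failures are statistically independent and failover is perfect;
    a shared dependency (the same DNS, the same upstream routes) breaks this.
    """
    all_down = 1.0
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("each availability must be between 0 and 1")
        all_down *= 1.0 - a  # probability that this provider is also down
    return 1.0 - all_down


single = 0.999  # placeholder: one provider at "three nines"
dual = combined_availability(single, single)
print(f"one provider:  {single:.6f}")
print(f"two providers: {dual:.6f}")
```

Under those assumptions two "three nines" providers combine to about "six nines". If outages are correlated, the real figure will be lower.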
It's common to see CF being the DNS/CDN for applications across AWS, GCP, Azure etc. So perhaps CF being down affects more applications than individual cloud platforms?
Yeah, what's up with the competition to Cloudflare? What's the real barrier to entry?
It's not infrastructure anymore, as there is a new PaaS startup every week offering distributed hosting. So why is bundling DNS, DDoS detection and mitigation, cloud workers... with it so hard?
This is just my take, but Cloudflare looks to be building a "moat" to make entry hard. This is built around two things: 1. economies of scale, 2. a network effect.
As Cloudflare gets bigger, they can provide services more cheaply. This is because (a) they can more fully utilise their data centres and other physical capital investments, (b) they can divide their fixed software costs over more users and (c) they get process efficiencies and discounts with scale.
A new entrant will struggle to match cost unless they're able to obtain similar scale. The bigger Cloudflare gets, the bigger the scale that a new entrant needs to hit before they can match them on cost.
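The scale argument boils down to spreading fixed costs over more customers. A rough Python sketch follows; `average_cost` is a made-up helper, and every figure in it is invented purely to show the shape of the curve, not taken from Cloudflare or any other company.

```python
def average_cost(fixed_cost: float, variable_cost_per_customer: float, customers: int) -> float:
    """Cost per customer when a fixed cost is spread across all customers."""
    if customers <= 0:
        raise ValueError("customers must be positive")
    return fixed_cost / customers + variable_cost_per_customer


# Invented numbers: the same fixed software/data-centre bill, different scale.
FIXED = 1_000_000.0
VARIABLE = 1.0

incumbent = average_cost(FIXED, VARIABLE, customers=1_000_000)
entrant = average_cost(FIXED, VARIABLE, customers=10_000)
print(f"incumbent cost per customer: {incumbent:.2f}")
print(f"entrant cost per customer:   {entrant:.2f}")
```

With identical fixed and variable costs, the smaller player's cost per customer is dominated by the fixed term, so it cannot match the incumbent's price until it reaches comparable scale.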
Second, they're aiming to build a network effect through having a huge number of locations. The more locations, the more appealing to new customers as they can be close to more users. A competitor will have to build a similar number of locations to match Cloudflare's proposition.
A new entrant cannot provide as much value, and therefore cannot charge as high a price, without building a similarly sized network. This again requires the entrant to invest heavily before they can charge a similar price.
-
The combination of these two things means that when Cloudflare is operating at a large scale with a large network it can offer a more valuable service (and charge a higher price) than a new entrant, and earn more profit because it can operate at a lower cost.
Also, Cloudflare has the option of lowering its price and still being profitable due to lower costs at its scale, so it can deter entrants from trying to compete by the threat of being able to lower prices below what is profitable for new entrants.
The only players who can compete may be those who already have comparable size - Amazon, Google, Microsoft, Facebook, CDNs, etc, since they will already have addressed the issues of scale and network effects. However, they may not want to cannibalise their existing markets. It will be hard for other new entrants to compete.
There are many noteworthy players - Akamai, Fastly, etc., and edge providers like ourselves (Zycada) who complement top CDNs like Akamai, Cloudflare, Fastly.
The main difference between Cloudflare and the others mentioned is the price: one can start with CF for a side project for free and continue to use it for free till it becomes a viable startup.
Others at best offer a limited trial plan, but most are just 'Speak to expert/Contact us' for pricing, which means haggling with a sales rep while we could just be building things. Even CF's paid plans are reasonably priced compared with the others, and come with better features.
You can't build a Cloudflare competitor in AWS/Azure/Linode/DO/etc. You need your own data centers. Several of them across the country, ideally around the world if you want to serve the whole world.
Thanks for the update. Just curious whether we will get a report on what happened? In as much detail as possible, of course - morbid curiosity mainly. I love the post-incident reports these events usually bring.
Sites are gradually reappearing as I type this. Some of my sites, and doordash.com, were returning 500 errors again just a minute ago. They just came back up, followed by the CF dashboard loading again.
DR means "disaster recovery": a formal plan used to respond to and mitigate potential risks to the business. Things like having a communications plan for an incident, or a backup office outside your main office's natural disaster zone.
Should be back up everywhere.