Route leak incident on October 2, 2014

jtchang · on Oct 2, 2014

BGP is one of those things that make the internet fragile yet resilient at the same time.

Fragile in that any person who controls an Autonomous System can advertise routes and if your neighboring AS'es are not configured to filter out stuff it will just keep propagating. Imagine a government deciding to just reroute traffic to their data center maliciously. Obviously a good chunk of the internet will go WTF and blacklist the route but not before you get a deluge of data.

In the same way it is easy to route around traffic. Congestion in the midwest? No problem. Send the traffic down to Texas. All this is built on network admins trusting each other for the most part.

feld · on Oct 2, 2014

BGP route hijacking is not uncommon, and it's used to capture traffic of your victim without them knowing.

BGPMon can tell you if this is happening to your AS

http://www.bgpmon.net/

Already__Taken · on Oct 2, 2014

> Imagine a government deciding to just reroute traffic to their data center maliciously.

I'm sure someone posted a map on here showing that this happens about once a month.

Blackthorn · on Oct 3, 2014

> All this is built on network admins trusting each other for the most part.

This is most of internet infrastructure. It was a more naive and happy time when the protocols that make up the internet were created.

lorddoig · on Oct 2, 2014

I'm a little hazy on the technical details of BGP routing - am I correct in thinking that if you're a first-class BGP citizen, you can just advertise yourself as handling others' traffic and that's it, it just comes?

pyre · on Oct 2, 2014

That's pretty much my understanding of it, though "it just comes" is a bit of a simplification.

Edit (clarification):

- BGP has no authentication so anyone can advertise 'anything' (still has to be a valid address).

- "it just comes" isn't entirely accurate as the changes propagate outwards through your peers (in p2p fashion). I'm not sure what happens technically when two networks are advertising themselves as serving a particular IP though (I have a fuzzy idea, but don't know how the edge-cases would be resolved).

lorddoig · on Oct 2, 2014

Well I imagine it would only be a portion of traffic. I wouldn't think a router in the UK would capture any US to US traffic through this method, for example. Is that what you mean?

nknighthb · on Oct 2, 2014

BGP hops are actually AS (autonomous system) numbers, which are not closely correlated with routers or distance.

If an ISP in Los Angeles transits five autonomous systems to reach an ISP in New York, but four to reach an ISP in London, and the London ISP suddenly starts advertising the New York ISP's IP addresses, oops, your packets are now headed to the UK.

Great care is taken to avoid bizarre physical routes for packets, but it still happens constantly. One DSL connection I used in Santa Clara had a propensity for routing packets through Seattle to get to Dallas for a while.
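
A rough way to see the Los Angeles example in code, assuming AS path length is the only thing being compared (a real BGP speaker looks at other attributes, such as local preference, before path length). The AS numbers and the `best_path` helper are made up for illustration; the numbers come from the range reserved for documentation:

```python
def best_path(candidates):
    """Return the candidate AS path with the fewest ASes (ties: first seen)."""
    return min(candidates, key=len)


# Paths toward the same prefix as seen from the Los Angeles ISP.
# Each list is a sequence of hypothetical AS numbers; the last one is the origin.
via_new_york = [64500, 64501, 64502, 64503, 64510]  # five ASes, real origin
via_london = [64504, 64505, 64506, 64511]           # four ASes, false origin

chosen = best_path([via_new_york, via_london])
print(chosen[-1])  # origin AS of the path that won
```

Nothing in the comparison knows where the ASes physically are, which is why the shorter path through London wins.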

tacticus · on Oct 3, 2014

Also the smallest (most specific) advertisement wins.

If an ISP in India advertises all of Amazon's AWS ranges as /24s or even smaller, the traffic will go there even if the bigger ranges are closer.
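
As a sketch of why the more specific route wins: forwarding uses longest-prefix match, so a /25 beats a covering /24 for the addresses it contains. The prefixes below come from a documentation range, and the `lookup` helper and labels are made up. In practice many networks commonly filter announcements longer than /24, so treat this as illustrative only:

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop of the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    # Longest prefix (largest prefixlen) wins, regardless of distance.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


table = [
    (ipaddress.ip_network("198.51.100.0/24"), "legitimate origin"),
    (ipaddress.ip_network("198.51.100.0/25"), "hijacker"),
]

print(lookup(table, "198.51.100.10"))   # inside the /25
print(lookup(table, "198.51.100.200"))  # only covered by the /24
```

Only the addresses covered by the more specific prefix are pulled away; the rest of the /24 still follows the original route.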

pyre · on Oct 2, 2014

I added some clarification, but that's the general idea. The routers update their routing tables based on the advertisements they get from/through peer networks (as I understand it).

From the perspective of a router trying to route packets, you basically make choices based on what your peer networks are advertising that they can (directly or indirectly -- e.g. route through) service.

[Disclaimer: This is all stuff that I "know" but I've never worked directly with BGP so I'm open to correction.]

lorddoig · on Oct 2, 2014

So, in summary, when I connect to an IP the only assurance I have (putting aside the application layer e.g. HTTPS) that I'm connecting to the right box is the fact that I made it to some machine with an interface configured to that address? Oh dear.

zokier · on Oct 2, 2014

> when I connect to an IP ...

It is useful to remember that IP is a connectionless protocol. The role of IP is just to shuffle packets around from node to node (where many of the nodes are not the source or final intended destination of the packets) in a fairly simplistic way.

> ... to some machine with an interface configured to that address

Technically, the other endpoint does not even need to have an interface configured with that address. You can quite easily configure a box to send replies for any packets that happen to end up to it.

> Oh dear.

There is a good reason why IPsec (etc) was invented.

lorddoig · on Oct 2, 2014

The problem with end-to-end crypto is that we often think of its security properties mathematically and neglect its practical performance. Obviously this is increasingly not true (Heartbleed probably did more to educate the world on crypto than anything else in history) but if you think of crypto as what it so often turns out to be - something waiting to be broken in semi-spectacular fashion - then I don't think it's so out of line to wish for additional assurances from complementary systems.

sp332 · on Oct 2, 2014

Yes, because your computer doesn't know anything about the destination besides its address. So it's not possible for your computer to verify anything else about the host.

Edit: I guess you could use IPSec. It's not perfect, but it's pretty cool that we could have end-to-end crypto at the IP layer. https://en.wikipedia.org/wiki/IPsec#Security_architecture

lorddoig · on Oct 2, 2014

Well yes, that's obviously true, which is why it's (idealistically speaking) important to verify the routing mechanism.

Of course most of this is mitigated by end-to-end crypto but given that we see all too frequently how fallible that can be, this topic remains of interest. I mean if crypto fails and leaks your private key (a la heartbleed) and it falls into the hands of an attacker who can hijack some BGP routes then that attacker is potentially in a very powerful position. We've seen BGP hijacking by spammers needing clean IPs in the past, so this isn't a totally implausible situation.

SEJeff · on Oct 2, 2014

Top tier providers "peer" with each other. They share routes via BGP with their routing "peers". When you know an AS only is authoritative for XYZ, you can ACL them to only be authoritative for XYZ.

http://www.dasblinkenlichten.com/bgp-route-filtering-access-... http://blog.ine.com/2008/01/08/using-extended-acls-for-bgp-f...
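
The idea of ACLing a peer to the prefixes it is authoritative for can be sketched roughly as below. This is not router configuration, just the logic; the `ALLOWED` table, the AS numbers, and `accept_from_peer` are all hypothetical:

```python
import ipaddress

# Hypothetical per-peer allow list: which address blocks each AS may announce.
ALLOWED = {
    64500: [ipaddress.ip_network("192.0.2.0/24")],
}


def accept_from_peer(peer_as, prefix):
    """Accept an announcement only if it falls inside the peer's allowed blocks."""
    announced = ipaddress.ip_network(prefix)
    return any(announced.subnet_of(block) for block in ALLOWED.get(peer_as, []))


print(accept_from_peer(64500, "192.0.2.0/24"))     # the peer's own block
print(accept_from_peer(64500, "198.51.100.0/24"))  # someone else's block
print(accept_from_peer(64501, "192.0.2.0/24"))     # peer with no allow list
```

As written this also accepts more specific slices of an allowed block (for example a /25 of the /24); a stricter filter would cap the prefix length too.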

devicenull · on Oct 2, 2014

However, this largely doesn't happen. (If it happened everywhere, we wouldn't have issues like this)

ceph_ · on Oct 2, 2014

For the most part, yes. If you advertise a route it will be accepted. Whether or not traffic flows over the link depends on the BGP path selection algorithm.

It's common for one AS to advertise routes for locations not within their network. It's called transit.

eastdakota · on Oct 2, 2014

Here's a post we wrote up a few years ago when a route leak happened to Google:

http://blog.cloudflare.com/why-google-went-offline-today-and...

jgrahamc · on Oct 2, 2014

In a standard situation, a transit network would announce your own routes as well as your customer routes, to your peers, transit or other customers. Peer to Transit, Transit to Transit or Peer to Peer should never be done.

lorddoig · on Oct 2, 2014

It's meant as no criticism but "should" always worries me in contexts like this. I find that often the reason it's "should" as opposed to "will" is because there's potentially dangerous human input somewhere in the process - as appears to be the case here.

It's hard to be comfortable when this is true of systems as important as those which route the internet or PKI, for example, because it's impossible to know what might happen next. Perhaps it's erroneous to take the "structure" in "infrastructure" literally but in the context of the internet that word is becoming increasingly misnomered in my mind.

MichaelGG · on Oct 2, 2014

Yep. Which also means second-class BGP citizens (customers of top-tier citizens) are one phone call or CSR away from being able to hijack routes.

ryan-c · on Oct 2, 2014

In some cases, yes, it is that simple.

biot · on Oct 2, 2014

  > For the time being, we have quarantined the Medellín data center
  > and disabled connectivity with Internexa.

Does this imply that it was CloudFlare's trust of Internexa's announced BGP routing which caused or contributed to the outage? Did CloudFlare redirect traffic that it should have handled internally to Internexa? If so, isn't it more appropriate for CloudFlare to prioritize its own routes over those claimed by external parties?

I would have anticipated that Internexa's routes would have affected routing by third party networks (eg: Internexa claims it handles traffic for CloudFlare's IP range, and Level3 redirects all that traffic to them) which isn't something that CloudFlare can do much about other than notify Internexa and those third parties of the problem and hope they resolve it themselves. Having not worked directly with BGP, I'm sure I'm misunderstanding something and would appreciate any additional clarification.

edit: Incidentally, this is the kind of scenario which prevented me from using CloudFlare recently. I wanted to only CNAME our production web site to CF's systems which is something they only offer with the $200/month Business plan and not with the $20/month Pro plan. Otherwise, you have to delegate ALL of your DNS for the entire domain to CloudFlare. As one user in the comments says in response to how someone could have worked around the outage:

  "That [bypassing CF] would be a good idea, except cloudflare.com
   and control panel was inaccessible during this period too, so not
   sure how this could have been done..."

I really hope they revisit their policy of not allowing Pro customers to CNAME individual sites to CloudFlare. Putting all your eggs into CloudFlare's basket limits the ability to mitigate around these kinds of issues.

BuildTheRobots · on Oct 2, 2014

>>Does this imply that it was CloudFlare's trust of Internexa's announced BGP routing which caused or contributed to the outage? Did CloudFlare redirect traffic that it should have handled internally to Internexa? If so, isn't it more appropriate for CloudFlare to prioritize its own routes over those claimed by external parties?

"This downtime was the result of a BGP route leak by Internexa, an ISP in Latin America. Internexa accidentally directed large amounts of traffic destined for CloudFlare data centers around the world to a single data center in Medellín, Colombia. This was the result of Internexa announcing via BGP that their network, instead of ours, handled traffic for CloudFlare. This miscommunication caused a flood of traffic to quickly overwhelm the data center in Medellín. The incident lasted 49 minutes, from 15:08UTC to 15:57UTC."

The problem wasn't that CloudFlare thought their IPs were somewhere else, the problem was that the rest of the internet[1] thought CloudFlare lived in Medellín (somewhere else) and not with CloudFlare.

[1] Complete hyperbole, but "huge swathes"[2] of the internet had the wrong idea.

[2] "The exact impact of the route leak to our customers’ visitors depended on the geography of the Internet. Traffic to CloudFlare’s customers sites dropped by 50% in North America and 12% in Europe. The impact on our network in Asia was isolated to China. Traffic from South America was also affected as data centers there had to cope with an influx of traffic normally handled elsewhere."

biot · on Oct 2, 2014

I understand what you quoted (refer to what I wrote in the second paragraph). If the problem is with the rest of the internet, why did CloudFlare quarantine the Medellín datacenter? To use an analogy, if your company's upstream phone provider redirected calls to your phone number to Siberia, what good does it do for you to quarantine Siberia?

jgrahamc · on Oct 2, 2014

We quarantined it because it is connected directly to Internexa and we were unhappy with their accidental route leak.

dsl · on Oct 3, 2014

Can you explain "quarantined" a bit better? You make it sound like they are a transit provider. Did they "leak routes" you were passing to them?

eastdakota · on Oct 2, 2014

Unhappy is putting it mildly.

astrange · on Oct 3, 2014

In this case, what does quarantining it mean?

BuildTheRobots · on Oct 3, 2014

That's a good point; I'm hoping someone else can explain the definition of quarantine.

As you say, if Medellín was announcing to the world that they are CloudFlare, CloudFlare ignoring that doesn't seem to help ;)

nknighthb · on Oct 2, 2014

There is an obvious solution, but it's an enormous undertaking.

Each level of delegation must be cryptographically validated. Route announcements not signed using a certificate to which authority for that block has been delegated must be rejected.

Someone1234 · on Oct 2, 2014

That would work but as they say the devil is in the details.

For example: are we checking CRLs, and would this slow propagation (and what if the server is down)? What about countries unable to get CA-generated certificates due to political embargoes (either on their money, or because businesses are restricted from doing business with them, e.g. Iran, NK, etc)? It would also grant the US government even more control over the internet (as they now control many root DNS servers and root CAs), which they could use for military or political purposes. And so on.

I'm not saying that the concept doesn't have merit. BGP is quite evidently deeply flawed. However this solution has so many gotchas, question marks, and complexity to it you really need to dig deep down into the proposal before knowing if it is even a good plan.

bcoates · on Oct 2, 2014

How would that work?

You could sign that you trust your next-to-last hops to deliver traffic to you but transitive delivery isn't guaranteed and not every chain of valid hops is actually workable as a route capacity-wise. A completely signature-valid route with near 100% packet loss isn't much better than an undeliverable leak.

noselasd · on Oct 2, 2014

There's an architecture for doing origination validation in BGP, lots of info if you want to listen to http://packetpushers.net/show-105-bgp-origin-validation-with...

However, it pretty much requires everyone to do it.
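
One scheme of this kind, BGP prefix origin validation as described in RFC 6811, classifies each announcement as valid, invalid, or not-found against a set of signed records saying which AS may originate which prefix. A minimal sketch of that classification follows, assuming the records have already been cryptographically verified; the `Roa` type, `validate` function, and sample data are made up:

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    """A (hypothetical, already verified) record authorising an origin AS."""
    prefix: ipaddress.IPv4Network
    max_length: int
    origin_as: int


def validate(roas, prefix, origin_as):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covering = [r for r in roas if announced.subnet_of(r.prefix)]
    if not covering:
        return "not-found"  # nobody has published a record for this space
    for r in covering:
        if r.origin_as == origin_as and announced.prefixlen <= r.max_length:
            return "valid"
    return "invalid"        # covered, but wrong origin or too specific


roas = [Roa(ipaddress.ip_network("192.0.2.0/24"), 24, 64500)]

print(validate(roas, "192.0.2.0/24", 64500))     # matching origin
print(validate(roas, "192.0.2.0/24", 64511))     # wrong origin
print(validate(roas, "198.51.100.0/24", 64511))  # no record at all
```

The "not-found" case is where the "requires everyone to do it" problem shows up: address space without a published record cannot be told apart from a hijack of it.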

nknighthb · on Oct 2, 2014

The entirety of all possible AS paths doesn't need to be signed to make a difference. Even just signing the last 1-2 ASs would stop most of the big screwups.

parhamn · on Oct 2, 2014

I spent a good bit of time trying to figure out why GAuthify had brief downtime and do a post-mortem. Luckily I read most CF blog posts here on HN and saw this. Is Cloudflare planning on having any real-time notification system for things like this? I'm sure it'll save a lot of headache trying to figure out what happened especially if you don't read the cloudflare blog.

It's very likely that I and especially the other non-hn folks could have missed this altogether (checking cloudflare for an issue is likely my last check list item due to its amazing record).

jgrahamc · on Oct 2, 2014

Yes. We are actually in the process of redoing our status system and web site to communicate problems better.

However, we are also working to ensure that we don't need to use it.

nodesocket · on Oct 2, 2014

I'm sure you guys know, but recommend https://www.statuspage.io. They allow end users to subscribe and get e-mails and SMS notifications.

jey · on Oct 2, 2014

How does something like this get fixed? Is the only option to get on the phone and yell at the guys in the other NOC to fix their announcements?

jgrahamc · on Oct 2, 2014

Yes. Without going very deep into what exactly happened here, the other NOC is the only network with control over the route announcements and so you get on the phone.

nnx · on Oct 2, 2014

Would IPv6 help reduce the impact or even completely avoid this type of incident?

iirc IPv4's "bloated" routing tables are getting harder to maintain.

jbert · on Oct 2, 2014

No, aiui, the trust model is the same. If a peer announces routes to you, all you can do is:

- throw away "martians" (routes to known-reserved/bad parts of IP space)

- throw away stuff you know they shouldn't be announcing (your stuff basically)

other than that, I think you have to trust them when they say that they have a magic route to IP 16.0.0.1 which is shorter than any other you know about. (Since it might be true).
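
The two filters above amount to something like the following sketch. The martian list here is a small illustrative subset, not a complete bogon list, and `OWN` and `accept_announcement` are hypothetical names:

```python
import ipaddress

# A few well-known reserved IPv4 ranges (illustrative subset only).
MARTIANS = [ipaddress.ip_network(n) for n in (
    "0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
    "172.16.0.0/12", "192.168.0.0/16", "224.0.0.0/4", "240.0.0.0/4",
)]

# Hypothetical: the address space that belongs to us.
OWN = [ipaddress.ip_network("192.0.2.0/24")]


def accept_announcement(prefix):
    """Reject anything overlapping reserved space or our own space."""
    announced = ipaddress.ip_network(prefix)
    return not any(announced.overlaps(block) for block in MARTIANS + OWN)


print(accept_announcement("10.1.0.0/16"))   # martian
print(accept_announcement("192.0.2.0/25"))  # our own space
print(accept_announcement("16.0.0.0/8"))    # nothing to check it against
```

The last case is the point: once an announcement passes these two checks, there is nothing local left to test it against.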

rblatz · on Oct 2, 2014

My understanding is that it would not do anything to help prevent deliberate attacks on BGP. But I do think it may help limit the damage of accidental BGP misconfigurations since the routing tables are less confusing and crowded.

boydjd · on Oct 2, 2014

wtbob · on Oct 3, 2014

Why don't we use cryptography to guard against false routes? If each IP range were signed-for, then I think a relatively simple protocol could be used to document which networks a network could route for.