Possible BGP hijack of 1.1.1.1 (bgpstream.com)
425 points by pstadler on May 29, 2018 | 150 comments



I'm using AnchNet's services, and we asked AnchNet about this when I received an e-mail from our BGPMon. They said their staff had applied a wrong config on a router. They also didn't know that 1.1.1.0/24 is used by Cloudflare and APNIC, so they used this prefix for a test.


Why give BGP access to people who don't even know that 1.0.0.0/8 or 1.1.1.0/24 are part of the public internet, or who decide they can use random prefixes to "test" things? :-/


Usually an upstream ISP providing transit accepts only the set of prefixes that its ISP customer has agreed to advertise on the public internet, and it enforces a policy on ingress to make this happen. The idea is that if the customer ISP ends up advertising an incorrect prefix, the impact stays local to that ISP and does not reach the whole world. But some ISPs don't follow this and implicitly trust their customer ISPs, and of course there is no cover if the tier 1 ISP itself typos a prefix. Tools such as BGP RPKI are available, but they're not widely deployed.
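
A minimal sketch of the ingress policy described above, assuming a hypothetical list of agreed prefixes (documentation ranges, not anyone's real allocation). Real routers express this with prefix-lists, often with exact-match or length limits; this only shows the accept/drop decision.

```python
import ipaddress

# Hypothetical prefixes the customer ISP has agreed to advertise.
AGREED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def accept_announcement(prefix: str) -> bool:
    """Accept a customer announcement only if it falls inside an agreed prefix."""
    announced = ipaddress.ip_network(prefix)
    return any(
        announced.version == agreed.version and announced.subnet_of(agreed)
        for agreed in AGREED_PREFIXES
    )


print(accept_announcement("192.0.2.0/24"))  # True: agreed prefix
print(accept_announcement("1.1.1.0/24"))    # False: dropped at the ingress
```

With a filter like this on the upstream side, a mistaken 1.1.1.0/24 announcement would stay inside the customer's own network.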


>Usually

If only...

BCP 38[0] is nowhere near usual. Lots of networks, including some very problematic big ones (cough Hurricane Electric cough), do not implement it as a matter of course. The AWS Route53 hijack last month which resulted in downtime for a number of sites plus a six figure coin theft[1] could have been prevented by adequate filtering.

0: https://tools.ietf.org/html/bcp38

1: https://arstechnica.com/information-technology/2018/04/suspi...


Could one argue for tort/negligence against the ISP who should have filtered, but didn't, if one's coins were stolen through that? Or even possibly the same, but in criminal court?


I doubt it, since the argument you're suggesting is that the ISP didn't take the best possible care, whereas the standard for negligence is, I believe (IANAL), reasonable care.

They may also not even have a duty of care in the first place, as to the truth of any metadata they're passing on. As a sibling comment pointed out, it's not as if there are laws for this.


Just as with the discussion about hackable routers yesterday, there are no laws for this.


Uhm, BCP38 is about forwarding, not the BGP control plane


To be fair 1.1.1.1 had been unassigned/non-routable up until April.


To be fair, unallocated or unassigned IP space isn't fair game to use for testing outside of an air gapped lab. I've never in my career thought it would be a good idea to "test" unallocated public unicast address space on my edge routers.


Even in an air-gapped lab it is bad practice. The same goes (though it is DNS-related) for using *.local as a LAN TLD.
We just should not.


There are instances where completely replicating an environment in an air gapped lab is acceptable. For instance, you could be encountering some edge case where a bit is being flipped in an IP address, or a bit is stuck on somewhere and replicating the environment down to the individual IP addresses plays a huge part in reproducing the issue.

An edge router is not one of these cases though.


.local is perfectly fine according to the IETF https://tools.ietf.org/html/rfc6762


For mDNS, yes. There are resolvers that don't even try DNS when looking up a name in .local.

Some of them were made more lenient because just as Microsoft finally stopped encouraging this nonsense, the Kubernetes people picked it up from ancient Windows Server folklore and apparently decided to make it web-scale.


As you linked, it is reserved for mDNS.


It was assigned to APNIC in 2010 http://seclists.org/nanog/2010/Jan/776


Allocated, not assigned.


APNIC used that terminology in their analysis paper of 1.0.0.0/8, "1.0.0.0/8 has been assigned by IANA to APNIC on the 19th January 2010" [1]. To do this analysis, wouldn't they have had to advertise a route back in 2010?

[1] http://www.potaroo.net/studies/1slash8/1slash8.html


What is the practical difference?


From RIPE, "An allocation is the block of IP addresses that is reserved by the RIPE NCC for your use now and in the future. An assignment is a block of IP addresses from your allocation that is used on an active network."

https://www.ripe.net/manage-ips-and-asns/resource-management...



It didn’t do the bad thing until after it was assigned, so no one cared when it was allocated.


It could also be an ISP leaking a loopback identifier into their routes. A lot of ISPs will give their BGP routers lo0 interfaces with addresses like 1.1.1.1 and 2.2.2.2. It serves as a label and not as an address.

It's been a while, but I can't count the number of times that I've seen that.


If that were the case, they would have been advertising a /32 (which, hopefully, would have been dropped, or at the least not redistributed) instead of a /24.
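
To make the /32 versus /24 point concrete, here is a small sketch of a prefix-length filter. The /24 limit is an assumed policy (a common convention among operators, not a protocol rule), and `too_specific` is a hypothetical helper name.

```python
import ipaddress

MAX_IPV4_PREFIX_LEN = 24  # assumed policy limit, not mandated by BGP itself


def too_specific(prefix: str) -> bool:
    """Return True if an IPv4 announcement is longer than the assumed /24 limit."""
    return ipaddress.ip_network(prefix).prefixlen > MAX_IPV4_PREFIX_LEN


print(too_specific("1.1.1.1/32"))  # True: a leaked loopback would be dropped
print(too_specific("1.1.1.0/24"))  # False: a /24 passes this check
```

A leaked loopback /32 would be caught by this kind of check, while the /24 seen in this incident would not.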


In fact, they usually use 172.10.x.x on their PtP address...


That isn't a reserved/private network either.
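
A quick check of that claim, as a sketch using Python's standard `ipaddress` module and the three RFC 1918 blocks. `is_rfc1918` is a helper name made up for this example.

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """Return True if the address lies in one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


print(is_rfc1918("172.10.0.1"))      # False: outside 172.16.0.0/12
print(is_rfc1918("172.16.0.1"))      # True
print(is_rfc1918("172.31.255.254"))  # True: last usable address of the /12
```

The private block only covers 172.16.0.0 through 172.31.255.255, so 172.10.x.x is ordinary public space.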


I shake my head in bewilderment when I see stuff like this - just why would people make things harder for themselves. I very highly doubt that they are so large that they ran out of IP space in the enormity of 172.16/12 to encompass all of their OSPF/BGP router-id /32s and individual /30 OSPF router-to-router links.


> enormity

What's enormous about an IPv4 /12? :)

When the German army requested an allocation of IPv6 address space, they were given a /28, but complained that 2^100 IPs is not enough for them and they actually need a /22.
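
The arithmetic behind those numbers, as a small sketch (the helper names are made up for illustration): a /28 leaves 128 - 28 = 100 host bits, and a /22 leaves 106.

```python
def addresses_in(prefix_len: int, total_bits: int = 128) -> int:
    """Number of addresses in a prefix of the given length (IPv6 by default)."""
    return 2 ** (total_bits - prefix_len)


def subnets_in(prefix_len: int, subnet_len: int = 64) -> int:
    """Number of subnets of length subnet_len that fit in one prefix."""
    return 2 ** (subnet_len - prefix_len)


print(addresses_in(28) == 2 ** 100)          # True
print(subnets_in(28))                        # 68719476736 /64 networks in a /28
print(addresses_in(22) // addresses_in(28))  # 64: a /22 is 64 times a /28
```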


That's mind-boggling... I'd like to read the reasoning behind that! (I'm sure there's some German-language publication somewhere...)

Did they want each bullet to have a /64?


Well, bullets might move between routers, so they might change the Layer-2 network they are on, and thus might need their own independently routable address. Afaik it is bad practice to route anything smaller than a /64.


I have both a /29 and a /32 of v6 space, and unless my ASN achieves total global domination on a scale never before seen by humankind, it should last a good long time. :-)


Well it's not so enormous, but it's also accompanied by 10/8 and 192.168/16. Many networks use some combination of all three internally for different purposes.


Why give them the option to shoot themselves and their neighbors in the foot? This seems more like a technology problem with BGP than a knowledge problem.


I'm not sure when modern tech got this idea that if everyone has been using something "wrong" for decades, it's still wrong. That space has never been previously announced, it's assigned to APNIC for _research_, it's in dozens of makes and models of router as admin interfaces, blackholed or otherwise.

I get the impulse to say "you used it wrong, now it's broken", but we didn't get to a functioning worldwide internet with that attitude. We got here by observing what people were actually doing and coming to a consensus view on what to break and what to carefully tread around (you know, UX). This is an obvious example of the latter and the fact that APNIC let CF use this space for a production platform in the name of breaking shit is frankly disqualifying (in terms of their overall trustworthiness as curators of essential IN infrastructure).


I don't disagree with you that things have coalesced by consensus over a period of the past 25 years, for what IP space people should and can use, and what IP space you shouldn't use (eg: I have no doubt that a bunch of enterprise end users are using some of the US military/DoD assigned /8s internally, because those never show up on the global internet. It's wrong, but they do it anyways).

However, the RFC1918 IP ranges have existed for a very, very long time, as have the standard documentation/example IP ranges which Cisco, Juniper and others have been using in their training and example publications since 1995 or so. People have had more than twenty years to number their internal networks into the ranges that, also by consensus, the global internet community has decided to make non-globally-routable (192.168, 10.x, 172.16, etc). RFC1918 was published 22 years ago so there is really no excuse.

If you are using 1/8 in the year 2018 for your own internal production traffic, you are wrong and should feel bad. IANA and APNIC (and APNIC's contracted partner, Cloudflare) should be able to begin using ranges as granular as an individual /24 within this /8 on the public Internet without worrying that people who have misconfigured their shit will have a broken experience. It will take time for people to move their misconfigured erroneous configurations into normal RFC1918 IP space, but it will happen eventually. Or maybe not, if v6 adoption speeds up this becomes irrelevant.


> v6 adoption

Good one.


I've worked in two shops that used 1/8 space for loopback addresses for iBGP and nobody that worked there was a dummy.

It is/was not uncommon either[1]. It was never a concern since it wasn't allocated. It being a concern is a very recent phenomenon.

[1] https://www.cisco.com/c/en/us/about/press/internet-protocol-...


I'm going to say "yes they were dummies", what made them think that they were going to run out of space in 172.16/12 for router ID /32 loopbacks? If there are properly defined private IP space blocks, use those, not some random /8 you think looks nice.


>"If there are properly defined private IP space blocks, use those, not some random /8 you think looks nice."

Yeah, no. It sounds like you don't really know the history of that block. Or maybe you missed the part where I said it was used as a loopback address. Maybe both.

1.0.0.0/8 was unallocated and was also part of many people's bogon filters at their edges.


I do know the history of that block, and have been subscribed to the relevant mailing lists for bogon filters since 1998. It has never been a good idea to start using a currently unallocated /8 for internal purposes when plenty of RFC1918 space exists. 1/8 is not the first block to ever be taken out of the bogons list and actually used. Some of the "newer" /8s that were among the last few handed out to the RIRs also had reachability issues when ARIN, RIPE and APNIC started giving out /14 to /22 sized pieces of space to ISPs, because a number of people out there had stale bogon filters handcoded into their routers. It is mostly fixed now.


It shouldn't really matter who is currently using that net, it's not a private range :/


Tell Cisco.


Both Cisco and Juniper use the standardized documentation IP ranges in their example/lab configurations and training materials.

https://tools.ietf.org/html/rfc5735

https://tools.ietf.org/html/rfc5737


Cisco did recommend using 1.1.1.1 (or did it default to it? commentary online is unclear), but in a different space: for the DHCP server with DHCP proxying in their WLAN products. They might be referring to that.


I have been going through Cisco Netacad materials for the last 2 years and I've seen 1.1.1.1 in examples. Though I just did a search and did not see 1.1.1.1 mentioned anymore.


Does anyone else find it sort of beautiful watching replays of events like this? It's amazing to watch how the routers organise themselves, making and breaking connections when needed.


I must admit, what drew me to computing back in the day was the low level networking stuff.


There are at least two of us, dear friend!


Three's company.


Yes, what JS library does that graph drawing and animation? Or is there a similar one?


Looks like they are using https://bgplayjs.com/ for that graph.


I mean, this is what BGP does.


Right, and the point is that it is pretty cool to watch.


ASN 58879 belongs to Shanghai Anchang Network Security Technology Co.,Ltd (China) according to https://ipinfo.io/AS58879

website: https://www.anchnet.com/


Ah! That may have been the reason why my site wasn't resolving earlier today. It was the weirdest situation with people from all over the planet complaining without any apparent pattern, a RIPE check of the site from 10 different locations showed no issues in connectivity.

Thanks for posting this.


No, the issue persists. While I can access your site from mobile and residential connection, any static business connection fails. No 1.1.1.1 involved. I tested this with two different Fibre connections (Berlin).


Is the BGP hijack over?


As a data point - your site still isn't loading for me.

IP : 79.69.113.214

Time: Tue May 29 15:28:30 BST 2018


I wonder if it is a resolution issue or an access issue.

What happens when you go to: http://62.129.133.242/ ? That should come up with a 'domain for sale' page, that's the same server.


Not loading for me from MA, USA.

    $ httpstat http://62.129.133.242/
    2018/05/29 10:54:04 unable to connect to host 62.129.133.242:80: dial tcp 62.129.133.242:80: connect: connection timed out


+1 for httpstat; I'd not seen this utility before, quite nice!


Loads for me in Texas on a cable connection. Fixed IP.


Not loading for me.

UK : 79.69.113.214 : Tue May 29 16:06:38 BST 2018


Better now?


Just tested - loaded. Site seems to be working, have wandered about a bit.

What was the problem?


One of BT's routers was rebooted as a result of this report and that seems to have cleared up the problem.

Thank you for all your assistance in this - and also everybody else that helped to pinpoint the problem.


No problem - happy to help.


Connected to 62.129.133.242:80 from 10.44.54.246:54832

Resolves for me.


Also not loading from any SURFNet IP in The Netherlands


Should work now, the problem apparently was in some errant router.


Large companies misuse "unassigned" space all the time. I have heard engineers at my work propose using the DoD /8 that is not publicly routed before. Not on my watch!


If you happen to know why, could you explain their reasoning? In what ways are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 insufficient?


I help design one of the big 3 cloud providers and we're about to run out of private space for customer IPv4. We are addressing this in a number of ways, but I think others have run into this same issue.
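
For a sense of scale, a short sketch that adds up the three RFC 1918 blocks named in the question above. It only counts addresses and says nothing about how any particular provider carves them up.

```python
import ipaddress

RFC1918 = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")

# Address count per block, then the combined total.
sizes = {net: ipaddress.ip_network(net).num_addresses for net in RFC1918}
total = sum(sizes.values())

for net, size in sizes.items():
    print(f"{net:>16}  {size:>10,}")
print(f"{'total':>16}  {total:>10,}")  # 17,891,328 addresses in total
```

Roughly 17.9 million addresses is plenty for router loopbacks, but it is easy to see how a very large multi-tenant network could exhaust it.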


And I can confirm that this is done at some Very Large ISPs. Seen it with my own eyes, e.g. 30/8


Network engineer here: I'm going to guess that this is a mistaken effort on the part of a Chinese ISP or the GFW to hijack traffic to 1.1.1.1 internally within China, but probably not intended to propagate beyond the major Chinese international-transit-ISP's connections to the global Internet. BCP38 is your friend.


AS58879 is used for AnchNet's international services, like Hong Kong, Los Angeles. Not used in China mainland.

They used AS55994 in China mainland.

In fact, China's ISPs do filter by prefix, and they all enable uRPF. In China, an IDC can't announce non-CNNIC addresses.


It's been a while so I should re-read it, but I thought BCP38 was best applied at the terminal ISP/client network level. If you're a transit network, you can't do that kind of filtering because you have a legitimate chance of forwarding traffic to and from any network address.
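
For reference, a minimal sketch of what BCP38-style ingress filtering checks at the customer edge: the source address of each packet, not BGP announcements. The port name and the assigned prefix are hypothetical, and real implementations live in router ACLs or uRPF rather than in Python.

```python
import ipaddress

# Hypothetical mapping of customer-facing ports to the prefixes assigned to them.
ASSIGNED = {
    "customer-a": [ipaddress.ip_network("203.0.113.0/24")],
}


def permit_packet(port: str, src: str) -> bool:
    """Permit a packet only if its source address belongs to that customer."""
    source = ipaddress.ip_address(src)
    return any(source in net for net in ASSIGNED.get(port, []))


print(permit_packet("customer-a", "203.0.113.7"))  # True: legitimate source
print(permit_packet("customer-a", "1.1.1.1"))      # False: spoofed source dropped
```

This works at the edge because the set of valid sources per port is known there; a transit network has no such short list.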


It's more about the principles: not just ingress filtering, but the same idea as BCP38 applied, as an ISP with its own ASN, to your own egress:

a) don't announce shit you don't own

b) know how to set up ACLs and prefix-list filters on your own egress IP space announcements which face towards your peers and IP transit upstreams.

Conversely, as a big ISP which has many small ASNs downstream of it, be responsible and set up filters on your own ingress which prevent your customers from announcing mistaken shit to you.

As an example of a clueful and attentive major ISP: if you are a small to medium sized regional ISP and buy IP transit from NTT, one of the world's top-ten global commercial transit providers, they actually do take the time to verify each and every prefix you announce to them and will require an interaction with their NOC if you want to announce a new /22.


I doubt that this is a genuine hijacking attempt. All it takes is a Cisco router and some IT admin making up an address.


Agreed. As many pointed out when the 1.1.1.1 DNS service was introduced, it's an address that is often used (incorrectly) as an internal or temporary IP. Then all it takes is a slight mistake in your route redistribution and suddenly you can find yourself accidentally announcing the prefix to eBGP.

I wouldn't be surprised if this becomes a semi-regular occurrence.


Hanlon's razor applies to a great degree. I have no doubt that there are a great many enterprise-type organizations that have been using 1.0.0.0/8 internally and cluelessly for a very long time.


Has the hijacker hijacked other prefixes in the past, or is this a one-time event for them?


How effective is this? Looking at https://bgp.he.net/ip/1.1.1.1, 1.1.1.0/24 is apparently "ROA Signed and Valid". I don't know a lot about BGP. Does this mean hijacking this subnet is a bit harder than unsigned ones because some or all ISPs verify this announcement? Or is it faster/easier to detect?

Maybe a wider question: is there some way to prevent BGP hijacking?


Basically, the bigger Chinese ISPs that are upstream of this small one which is making the false 1.1.1.0/24 announcement are not actually verifying that this small ISP is allowed to announce the space.

As for prevention, the only thing that will work is proper use of IRR/route registries and RPKI validation of peer announcements. Which a great many ISPs do not currently do.

https://www.noction.com/blog/bgp-hijacking

The other method is more blunt, and can be more effective if the people with 'enable' on various ASNs' core and edge routers actually have a spine. ISPs which repeatedly announce space that they're not allocated (as per RIPE, ARIN, APNIC, AFRINIC records) should be depeered by their local peers, and their owners/operators publicly shamed. It's a reputation thing. As a neighbor of other, more clueful ISPs, it's basically the same thing as being a bad neighbor by leaving garbage all over your front lawn and causing a public nuisance with loud parties and trashy behavior.
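
A simplified sketch of the RPKI origin validation mentioned above, following the valid / invalid / not-found outcomes of RFC 6811. The ROA table is illustrative: AS13335 is Cloudflare's ASN and AS58879 is the ASN from this incident, but the maxLength value is an assumption, and a real validator gets its data from signed RPKI objects rather than a hard-coded list.

```python
import ipaddress
from typing import NamedTuple


class ROA(NamedTuple):
    prefix: ipaddress.IPv4Network
    max_length: int
    origin_asn: int


# Illustrative ROA table (maxLength is assumed for this example).
ROAS = [ROA(ipaddress.ip_network("1.1.1.0/24"), 24, 13335)]


def validate(prefix: str, origin_asn: int) -> str:
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(prefix)
    covering = [
        roa for roa in ROAS
        if route.version == roa.prefix.version and route.subnet_of(roa.prefix)
    ]
    if not covering:
        return "not-found"  # no ROA covers this prefix at all
    for roa in covering:
        if route.prefixlen <= roa.max_length and roa.origin_asn == origin_asn:
            return "valid"
    return "invalid"  # covered by a ROA, but origin or length does not match


print(validate("1.1.1.0/24", 13335))    # valid
print(validate("1.1.1.0/24", 58879))    # invalid
print(validate("192.0.2.0/24", 64500))  # not-found
```

The check only helps if the networks receiving the announcement actually drop "invalid" routes, which is the deployment gap described above.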


In short, not many networks are checking signatures because not many networks are publishing them. Take a look at this presentation from 2009.

http://www.ausnog.net/sites/default/files/ausnog-03/presenta...

I've been out of the space for almost as long, so would love to be wrong, but I think it's fair to say that not much has improved on this front since then.


RADb or some other RIR database registration is what my company requires. This won't really stop bad actors, however.


Interesting!

My ping to that address went terrible for a brief window today - https://i.imgur.com/KjCcBeT.png

Wonder if this was the cause.

*edit: I'm in Cape Town, and it looks like the ping that was routing to a DC down the road decided to go to Europe instead.


Hey, what software are you using to collect the data and display that graph? I wouldn't mind running something on my server that could notify or log if 1.1.1.1 (and other services I rely on) are slow or down.


That looks like smokeping


More specifically, it looks like smokeping with RRA file storage and rrdtool to draw the graph. Compared to modern things like influxdb+grafana it is a very oldschool setup, but it still works perfectly fine.
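
If you only want a rough "is it slow or down" signal rather than a full smokeping install, a sketch like the following might be enough. It is not what smokeping does internally: it times a TCP connect instead of sending ICMP pings (so it needs no special privileges), and the port and timeout defaults are assumptions.

```python
import socket
import time
from typing import Callable, Optional


def probe(host: str, port: int = 53, timeout: float = 2.0,
          connect: Callable = socket.create_connection) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if it failed."""
    start = time.monotonic()
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:  # covers timeouts, refused connections, unreachable hosts
        return None
    elapsed_ms = (time.monotonic() - start) * 1000.0
    conn.close()
    return elapsed_ms


if __name__ == "__main__":
    rtt = probe("1.1.1.1")
    print("down" if rtt is None else f"{rtt:.1f} ms")
```

Run from cron and appended to a log, this would show the kind of latency jump visible in the graph, though without smokeping's loss and jitter statistics.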


This is correct.


What does this mean for those unfamiliar?


It’s like someone posting a “Turn left for Highway 123” road sign right next to a legitimate “Turn right for Highway 123” sign. Traffic snarls result, since most drivers don’t check the DOT security sticker on the back of both signs and find it missing from the fake Left one.


In other words, some entity that is not Cloudflare claimed to have the best route to (some of?) Cloudflare's IP ranges, including 1.1.1.1.

If malicious, this could be someone trying to redirect 1.1.1.1 traffic elsewhere.

There have also been a lot of historical examples of misconfiguring BGP (the way big internet networks talk to each other and discuss where to send packets), such as a Florida ISP accidentally claiming the best route to some major internet service and getting flooded with everyone's traffic until it died.

BGP is also really insecure (back in 2008 Pakistan effectively brought down YouTube, for instance, through BGP, which doesn't require any authentication to claim you have the best route to X).
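
A heavily simplified sketch of what "claiming the best route" can mean when two networks announce the same prefix. It assumes the only tie-breaker is AS-path length (real BGP considers local preference and several other attributes first), and the AS paths are made up using documentation ASNs plus the two ASNs relevant to this incident.

```python
from typing import List, NamedTuple


class Announcement(NamedTuple):
    prefix: str
    as_path: List[int]  # leftmost is the neighbor, rightmost is the origin


def best(announcements: List[Announcement]) -> Announcement:
    """Pick the announcement with the shortest AS path (simplified selection)."""
    return min(announcements, key=lambda a: len(a.as_path))


# Hypothetical view from one router: both routes cover the same /24.
legit = Announcement("1.1.1.0/24", [64500, 64501, 13335])
bogus = Announcement("1.1.1.0/24", [64502, 58879])

chosen = best([legit, bogus])
print(chosen.as_path[-1])  # 58879: the closer (bogus) origin wins here
```

This is why such incidents tend to affect only part of the internet: networks topologically closer to the legitimate origin keep preferring it.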


Significantly simplified:

BGP4, which is one of the fundamental building blocks of the global Internet, relies on trust between BGP peers. ISP A says to ISP B, their peer, "hey I'm responsible for this chunk of publicly routable IP space, please send all traffic to ASN number N for this particular block".

This works as long as everyone configures their IP space announcements and prefix-list filters correctly.

A lot of less clueful ISPs in the world do not verify the IP space announced to them by their peers (BCP38 is your friend!). This results in things like the time that a telecom in Pakistan hijacked the IP space for most of YouTube about ten years ago and successfully DDoSed themselves, while also causing a major YouTube outage.

https://www.google.com/search?q=pakistan+bgp+hijack+youtube&...

This will keep happening until various ISP peers properly implement prefix-list filtering, ACLs on their edge BGP connections, and verifying peer announcements via things like various route registries.
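
One reason an unfiltered announcement can be so damaging: routers forward on the most specific matching prefix, so a bogus more-specific route beats a legitimate covering route everywhere it propagates. A sketch of that lookup, using documentation prefixes and made-up next-hop labels:

```python
import ipaddress
from typing import Optional

# Hypothetical routing table of (prefix, next-hop label) pairs.
ROUTES = [
    (ipaddress.ip_network("203.0.113.0/24"), "legitimate origin"),
    (ipaddress.ip_network("203.0.113.0/25"), "bogus more-specific"),
]


def lookup(dst: str) -> Optional[str]:
    """Return the next hop of the longest prefix that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(lookup("203.0.113.10"))   # bogus more-specific
print(lookup("203.0.113.200"))  # legitimate origin
```

Only the half of the /24 covered by the /25 is diverted, which matches how partial hijacks look from the outside.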


1.1.1.1 is the primary IP for Cloudflare’s new DNS service. A BGP hijack is when the destination route for that IP changes from its legitimate target to somewhere else. Oftentimes it’s accidental, but it could be part of a wider attack.

In the case of DNS it’s particularly nasty as the attacker would control address resolution (say redirecting traffic for your bank to a phishing site) for everyone using 1.1.1.1 without more specific mitigations. Combined with long DNS cache times this could be a problem for a while.


Personally I'm not super familiar, but I found this recent article from Cloudflare that gives some background information: https://blog.cloudflare.com/bgp-leaks-and-crypto-currencies/


Cloudflare operates a public DNS server on 1.1.1.1 that has gotten a lot of attention since it was launched a few months ago. If a bad actor hijacks it, they can answer DNS queries with malicious answers, similar to the Amazon Route53 hijack that was used to redirect an Ethereum wallet site to a fake server: https://www.internetsociety.org/blog/2018/04/amazons-route-5...


They could also simply log all the information and make the server appear to continue operating as normal, since most fooling with the DNS packets would yield certificate errors for many sites (Google, YouTube, etc.). Most of the time someone comes out and says the BGP hijacking was an accident.

A bit of a Hanlon's razor situation.


1.1.1.1 is a DNS resolver that does not track activity. A BGP compromise means that someone could have compromised it and redirected or intercepted the traffic of those trusting it to be Cloudflare.


A BGP attack does not compromise the destination host. It reroutes (some) traffic destined for the host. Any traffic using TLS to establish destination authenticity (e.g DNS TLS, DNS over HTTP) or content authenticity (e.g. DNSSEC) would detect the attack, while other types of traffic (traditional DNS) could be exploited.


DNSSEC does not in fact mitigate BGP attacks, because to the extent it works at all, DNSSEC protects only the mapping between IP addresses and names. A BGP attacker controls the semantics of the addresses themselves, and can simply leave DNS pointing where it's supposed to, but hijack the underlying address.

TLS, on the other hand, does address this attack, because controlling all the traffic to a TLS-protected site still doesn't give you a private key that produces a valid signature on a certificate for that site.


"Any traffic using TLS to establish destination authenticity (e.g DNS TLS, DNS over HTTP)"

Well, unless you can also fool Let's Encrypt from all their locations around the world. Then you can get a Let's Encrypt certificate.


This could be a first step to compromise TLS traffic as well: https://www.princeton.edu/~pmittal/publications/bgp-tls-hotp...


Isn't the point that the attacker could compromise that promise of non-tracking? They could track whatever is routed through them and then forward them on to the legitimate destination.


Traffic meant to go to 1.1.1.1 (Cloudflare DNS) could be routed elsewhere. Since this is a commonly used DNS server, this could be used to alter domain resolution for people who use it.


I'm assuming this would/could be done by a malicious party in order to substitute different IP addresses for some sites in an attempt to direct traffic for nefarious purposes.

If my host is configured to use DNSSEC would that prevent sites from resolving?

If DNSSEC is not employed and a connection is directed to a malicious site (using https) wouldn't that prevent the connection?

(I'm afraid I'm out of my depth on the implications of this aspect of networking and wondering about the security implications for me since I'm using Cloudflare DNS servers.)


For DNSSEC: it depends, I don’t think many clients hardfail yet. For HTTPS: if you can BGP attack, theoretically you could get a TLS certificate issued.

There’s a lot of ifs on both those roads, though.


Probably a good use case to pin the certificates for your upstream DNS resolvers if you're using DNS over TLS/HTTPS.


Yeah. As of right now there’s no absolutely fool proof way to ensure all clients aren’t possibly going to slip through some crack, and I’m not sure if we’ll ever get there.


Somebody other than Cloudflare (the current holder of the IP) is receiving some of the traffic meant for it. The usual reasons for a hijack are to point users to fake sites (I point yourbank.com to my own server) and to phish.


This could be a problem for wifi captive portals that redirect to 1.1.1.1, but most of them would never route to begin with. However, I believe the real issue is that it is a free DNS server, so someone could redirect all domains that do not use SSL pinning.


If true it means that Cloudflare's DNS server can't be trusted.


Do you mean it can't ever be trusted? Or just right now? BGP hijacking isn't that difficult to pull off.

I hope you're not trusting 8.8.8.8 either: https://twitter.com/bgpmon/status/445266642616868864


That's not correct, anyone could 'accidentally' do this to every other provider. It's not a special weakness of cloudflare.


I'm sorry, I went for brevity because someone was asking for possible impacts. If I were using 1.1.1.1 as a DNS server and saw this news story, I would change to a different DNS server until the problem was resolved.

My goal was to provide actionable information quickly.

I never said it was specific to Cloudflare. :) It is specifically a mistake which would break an assumption - that putting 1.1.1.1 into your resolver results in an answer from Cloudflare. DNS doesn't necessarily have any protections (not current, so maybe they were added?), so the only level of protection is that the IP address routes UDP traffic where we expect it to.

It also isn't a long-term problem, it only remains for the length of time the route is wrong.

It could also be argued that we're already trusting every router between the device and 1.1.1.1 anyways, so there's not much difference. Except that there's already a trust relationship between those groups, and the new route subverts them.

It's the same level of risk if someone had done a BGP hijack of any backbone router.


Okay, thanks for your polite explanation.


If you use your provider's resolvers you would be very unlikely to have this problem (it's not likely someone will re-route the traffic on the network of the provider themselves).


Would this affect certificate-validating clients doing DNS-over-HTTPS to 1.1.1.1 — doesn’t it have an ipAddress certificate and demand HTTPS resolution only?


They use a named certificate, validated against the standard CAs. Unless the hijackers were able to get a certificate with the name 'cloudflare-dns.com.' then the TLS session would fail.

https://developers.cloudflare.com/1.1.1.1/dns-over-tls/
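
A sketch of what that validation looks like from a client, using Python's standard `ssl` module: connect to the IP, but verify the certificate against the name `cloudflare-dns.com`. Port 853 is the standard DNS-over-TLS port; the function names are made up for this example, and it only sets up the TLS session without sending any DNS query.

```python
import socket
import ssl

DOT_HOST = "1.1.1.1"
DOT_PORT = 853                   # standard DNS-over-TLS port
DOT_NAME = "cloudflare-dns.com"  # name the certificate must be valid for


def make_context() -> ssl.SSLContext:
    """Default context: verifies the chain against system CAs and checks the name."""
    return ssl.create_default_context()


def connect_dot(timeout: float = 5.0) -> ssl.SSLSocket:
    """Open a TLS session to the resolver; raises ssl.SSLError if validation fails."""
    ctx = make_context()
    raw = socket.create_connection((DOT_HOST, DOT_PORT), timeout=timeout)
    try:
        return ctx.wrap_socket(raw, server_hostname=DOT_NAME)
    except Exception:
        raw.close()
        raise
```

A hijacker answering on 1.1.1.1 without a certificate for that name would make `connect_dot()` fail during the handshake instead of silently serving answers.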


Well, if you control the host behind the IP, you could have any CA issue a challenge, and successfully pass it (e.g. if Let's encrypt uses the erroneous routes).

So no. The only thing protecting you would be to have the expected hash of the certificate you expect to see (TOFU - Trust on First use, though you're screwed if you didn't contact 1.1.1.1 before the incident!).


Doesn’t LE use their own resolvers?

EDIT: https://community.letsencrypt.org/t/where-does-letsencrypt-r...


Is there a CAA equivalent for ARIN assignments?



Thank you for pointing this out. This is exactly what I’d hoped might exist someday.


I assume you’re looking for: https://tools.ietf.org/html/bcp38


Nope, that’s not what I’m looking for at all.


RPKI, but it's barely used


This is currently used to sign ROAs. A rogue actor can easily work around that by including the original AS in the AS path of the announcement.
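
A tiny sketch of that workaround, assuming the check looks only at the origin (the rightmost AS in the path), which is all a ROA describes. The paths are made up; AS13335 is Cloudflare's ASN and AS58879 is the ASN from this incident.

```python
from typing import List

AUTHORIZED_ORIGIN = 13335  # origin named in the (assumed) ROA


def passes_origin_check(as_path: List[int], authorized_origin: int) -> bool:
    """Origin validation compares only the last AS in the path."""
    return as_path[-1] == authorized_origin


plain_hijack = [58879]         # hijacker announces itself as the origin
forged_path = [58879, 13335]   # hijacker appends the real origin to its path

print(passes_origin_check(plain_hijack, AUTHORIZED_ORIGIN))  # False
print(passes_origin_check(forged_path, AUTHORIZED_ORIGIN))   # True
```

Origin validation stops accidental leaks like a typo'd prefix, but not a deliberate attacker willing to forge the path.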


Nope, that does not cover certificate issuance for IP addresses in that range.


For dnscrypt-proxy, definitely not.

In addition to a signature of the parent cert, the DNS stamp for Cloudflare DNS says that validation must be done against dns.cloudflare.com so this would require getting a certificate for cloudflare.com.


So who is going to tell the 13 peers that they should not accept BGP path advertisements for 1.1.1.0 from anyone but Cloudflare?


I use 1.1.1.1. Do I need to do anything? Can I just continue using it, or do I need to clear some cache etc.?


That awkward moment when you read an IP and your first thought is "But that belongs to Cloudflare, I read about this."


Are people here really using 1.1.1.1 as a DNS server...? Do people here _really_ think that Cloudflare isn't giving your data away to _someone_? I have been using DNS servers from OpenNIC for some time now, and I will continue to.


And that is why I'm using dns over tls :)


Not enough. You have to check the certificate's fingerprint, along with its validity.

TLS is not a silver bullet. If an attacker controls the host behind what everyone believes to be 1.1.1.1, nothing is to prevent them from applying for a legit certificate.
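
A minimal sketch of such a fingerprint check, assuming you already hold the server certificate in DER form (in Python, `SSLSocket.getpeercert(binary_form=True)` returns that) and a pinned SHA-256 value recorded earlier. The helper names are made up for this example.

```python
import hashlib
import hmac


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_hex: str) -> bool:
    """Compare the certificate's fingerprint with the pinned value."""
    return hmac.compare_digest(fingerprint(der_cert), pinned_hex.lower())


# Usage with placeholder bytes standing in for a real certificate.
known_cert = b"placeholder certificate bytes"
pin = fingerprint(known_cert)
print(matches_pin(known_cert, pin))              # True
print(matches_pin(b"some other cert", pin))      # False
```

Pinning the whole certificate breaks on every renewal; pinning the public key instead is a common alternative, at the cost of slightly more parsing.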


Nothing prevents them from applying but...

* They need to do that, and get the resulting certificate, and install it, during the attack. The weirder the product (and certificates for IP addresses are relatively weird) the more humans end up involved in your order, and humans are slow.

* This leaves a smoking gun in the Certificate Transparency logs. So we all get to know (in maximum 24 hours but usually the reality will be minutes) about this extra certificate.


1) would take about 20 seconds, thanks to Let's Encrypt, but probably only marginally more time for some other CA with an API.

2) Who exactly is staring at CT logs and going "oh, I don't remember this domain using this CA, maybe I should investigate this" ? Sure there's a record of it. Doesn't really matter during an attack, because public attacks like this aren't intended to last long.

All you need is a half hour or less to steal a couple hundred million from a bank, or cryptocurrency wallet, using this attack. That's more than enough incentive for most unscrupulous 3rd world hackers.

If you're an authoritarian government, you could require CAs in your country to selectively quiet CT logs by certain users, and just issue certs willy-nilly for your private government org for MITM purposes. Google Chrome would detect them for Google-owned properties, but smaller sites would never know. And spy agencies can use this at their leisure and basically never be held accountable, because world politics.

Let's face it. The CA system is a joke and BGP is the butt of it.


"would take about 20 seconds, thanks to Let's Encrypt"

Let's Encrypt does not offer certificates for IP addresses. They choose only to offer DV certificates using methods 3.2.2.4.6 and 3.2.2.4.7

How much of your hypothetical half hour will you spend trying to figure out why your chosen ACME client reports "Policy forbids issuance" when asked for an IP address?

As to who is staring at CT logs, well there's the fun thing about the design of CT, the _logs_ aren't where you would be staring, you would be looking at a _monitor_ and a monitor can be configured to do whatever it so happens you think is important. We know that commercial CAs already sell monitoring as part of "Enterprise security" type offerings.
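
A rough sketch of the kind of rule a monitor might apply, in Python. Everything here is an assumption made for illustration: `LoggedCert` is a made-up, simplified record, and a real monitor would parse actual log entries rather than take plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedCert:
    # Hypothetical, simplified view of a certificate seen in a CT log.
    names: tuple   # DNS names (or IP addresses) in the certificate
    issuer: str    # issuing CA, as a plain string


def covers(name: str, watched: str) -> bool:
    """True if name is the watched domain itself or a subdomain of it."""
    name = name.lower().rstrip(".")
    watched = watched.lower().rstrip(".")
    return name == watched or name.endswith("." + watched)


def unexpected_certs(entries, watched, allowed_issuers):
    """Return logged certs for the watched domain from issuers we don't use."""
    return [
        e for e in entries
        if any(covers(n, watched) for n in e.names)
        and e.issuer not in allowed_issuers
    ]
```

Anything this returns is a reason for a human to go and look, not proof of an attack.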

Facebook took... I want to say minutes here, but I can't find an exact timeline, to spot that a certificate had been issued by Let's Encrypt for a name in their DNS hierarchy and begin investigating what went wrong.

Certificate Transparency isn't finished. As it stands today your authoritarian government "only" needs to ensure that nobody notices these shenanigans as unavoidably they create a "smoking gun" which would lead to their pet CA being distrusted. That's a tall order, but certainly not impossible in the short term, or for attacks in which the bogus certs are shown to a small number of individual targets rather than a broad population that will invariably notice by accident.

But longer term CT is intended for use with a gossip protocol so that it's impossible for the pretence to be kept up. Sooner or later a node somewhere will end up realising that there's an inconsistency, either it has seen SCTs that weren't logged, or it has seen logs that aren't consistent with the logs other nodes see, either of which is a matter for distrust.
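
To make the "inconsistent logs" case concrete, here is a deliberately simplified Python sketch. It assumes nodes exchange hypothetical `(node_id, tree_size, root_hash)` observations of a log's tree head, and it only catches the easy case where two nodes saw different root hashes for the same tree size. Comparing heads of different sizes needs Merkle consistency proofs, which are left out.

```python
from collections import defaultdict


def find_split_views(observations):
    """Find tree sizes for which nodes saw different root hashes.

    observations: iterable of (node_id, tree_size, root_hash) tuples.
    Returns {tree_size: {root_hash: [node_id, ...]}} for conflicting sizes.
    """
    by_size = defaultdict(lambda: defaultdict(list))
    for node_id, tree_size, root_hash in observations:
        by_size[tree_size][root_hash].append(node_id)
    return {
        size: {h: sorted(nodes) for h, nodes in roots.items()}
        for size, roots in by_size.items()
        if len(roots) > 1  # more than one root for one size: a split view
    }
```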

BGP has no relationship to the Web PKI, which is what I presume you're referring to by "the CA system". The relatively small number of parties interested in BGP have developed their own PKI to fit their needs.


BGP allows anyone to issue fake certs, so it does have a relationship to web PKI.

And sure, if you have a couple billion dollars, you can set up fancy infrastructure and response teams to monitor the whole web (that you are aware of, that participate in CT) for a strangely-issued cert.

Or you could just get a cert issued by the CA you normally use. In which case, now we have to track some kind of "customer number" per CA. If you use more than one CA (Google does) now you're tracking different customer numbers on different CAs. And all that has to be standardized.

Most of these "fixes" for web PKI's glaring holes are intended for large multinational corporations, or are optional, or specific to a particular browser, and don't address the main concern: _do not trust an IP address to be who it claims to be_.


No, you wouldn't use a "customer number". The usual approach taken for these genuinely concerned outfits goes like this:

1. Agree contractual terms with particular Certificate Authorities in which all certs for names in your hierarchy need explicit approval from your security people.

2. Set DNSSEC-secured CAA records for your names forbidding issuance by other Certificate Authorities.

This funnels requests from hypothetical bad guys into your security people, which is exactly what they don't want. It loses you the shiny capability to do "spur of the moment" issuance, but presumably if you want these sort of terms the phrase "spur of the moment" causes you to start writhing and clutching your throat anyway. Insider threats will usually be much _worse_ than outsiders.
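
For step 2, a simplified Python sketch of the decision a CA makes when it reads CAA records. Assumptions: the records arrive as hypothetical `(tag, value)` pairs for the relevant record set, only the non-wildcard `issue` property is handled, and DNSSEC validation and the climb up the DNS tree are not shown.

```python
def caa_permits(records, ca_domain):
    """Decide whether CAA records allow ca_domain to issue (non-wildcard).

    records: list of (tag, value) pairs from the relevant CAA record set,
             e.g. [("issue", "ca.example")]. Parameters after ';' are ignored.
    """
    issuers = [
        value.split(";", 1)[0].strip().lower()
        for tag, value in records
        if tag.lower() == "issue"
    ]
    if not issuers:
        # No "issue" property at all: CAA places no restriction on issuance.
        return True
    # An empty issuer name (value ";") matches no CA, so issuance is forbidden.
    return ca_domain.lower() in issuers
```

With `[("issue", "ca.example")]` published, a request arriving at any other CA should be refused, which is what pushes the bad guys towards the CA whose contract requires your security people to approve.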

As to things being "specific to a particular browser" we can't and don't want to be able to force, say, Microsoft and Apple to do things just because everybody else decided they're a good idea.

The same with the trust store programmes. I can't make Microsoft take this seriously, but Microsoft can't make me use their SChannel and associated trust store. Maybe you have a lot more pull with Microsoft than I do.

If we're in a world where the "easiest" way to get a bogus certificate is now to do global BGP hijack of an entire /24 then I think we're on the right track.


OK, yeah, the CA system isn't great, but as far as I'm aware Let's Encrypt does DNS queries from multiple POPs, so the route would have to propagate extremely widely to convince them to issue a cert.


A representative of LE commented on HN recently that they do not verify from multiple POPs. But even if they did, that would not stop this attack.

There are hundreds of CAs, and not all of them are going to verify from multiple POPs. You only need your attack to work on one CA for it to be effective against every client on the web.

Even if all CAs verified from multiple POPs (not likely) the attacker will just increase their attack to advertise from multiple ISPs/POPs. The attack is virtually the same, the only thing you need is more network access, which is not hard to get.
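
A toy Python model of the multi-POP idea and its limit. The vantage names, token values and quorum rule are all made up for illustration; this is not how any particular CA implements validation.

```python
def validation_passes(observed, expected_token, quorum):
    """Pass if at least `quorum` vantage points saw the expected token.

    observed: {vantage_name: token_seen_or_None}
    """
    agreeing = sum(1 for token in observed.values() if token == expected_token)
    return agreeing >= quorum


# A hijack that only reaches one vantage point fails a 3-of-3 check...
local_hijack = {"pop-1": "attacker", "pop-2": None, "pop-3": None}
# ...but one announced widely enough reaches every vantage point and passes.
wide_hijack = {"pop-1": "attacker", "pop-2": "attacker", "pop-3": "attacker"}
```

Here `validation_passes(local_hijack, "attacker", 3)` is false and `validation_passes(wide_hijack, "attacker", 3)` is true: the check raises the bar without removing the attack.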


There aren't really "hundreds" of CAs in a sense that matters here.

Last time somebody insisted on this I actually counted, I forget the answer I got, but it's less than three figures.

You can get a bigger number if your idea of "effective against every client on the web" is "It works in Internet Explorer". Microsoft is very... liberal in accepting new CAs controlled by corporate or sovereign entities.

But if you expect "every client on the web" to include Chrome on Android phones, Safari on iPhones and Firefox everywhere, not just Internet Explorer, then you're talking about dozens, not hundreds, mostly because of the work done by Mozilla.

And most of those CAs are fairly small. Forget a nice API you can just make an HTTP request to and get your certificate in seconds, some of them are going to expect you to wait until business hours and talk to them on the telephone.


Thanks for that, I had no idea! I stand corrected :)


I don't enforce pinning but I do enforce checking the TLS hostname. I assume that's good enough, what with Certificate Transparency; it should at least work 100% against a passive attack, right?
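
For reference, a minimal sketch of what enforcing the hostname check looks like with Python's standard `ssl` module. The helper name is hypothetical; the two settings are already the defaults of `ssl.create_default_context()` and are only set explicitly to make the intent visible.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies chain and hostname."""
    ctx = ssl.create_default_context()  # loads the system trust store
    # Already the defaults; set explicitly so the intent is visible.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

The hostname to verify is supplied at connection time via `ctx.wrap_socket(sock, server_hostname=...)`. This checks the name against whatever certificate a trusted CA issued, so it does nothing about a certificate mis-issued during a hijack.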

I looked into pinning but the big services warn against it as they could change their certs at any time, which is fair enough.

They'd have to convince a large enough percentage of the internet to accept their routes. Automated CAs like Let's Encrypt make sure to take measurements from many places around the world to prevent things like this, right?


A BGP attack can defeat DNS over TLS. Pinning, or some other out-of-band check, is the answer to this attack.


Useless without pinning.

If you're using DNS-over-HTTPS, you should be safe though.

dnscrypt-proxy enforces pinning (the parent cert signature is included in the DNS stamp required to connect), and I guess Firefox and cloudflared also do.


Curious: how is this different from the similar(?) issue with Amazon Route 53 getting hijacked not too long ago?



